openjournals / joss

The Journal of Open Source Software
https://joss.theoj.org
MIT License

Website suggestion: Allow search to include main author country/region #1359

Open anelda opened 3 weeks ago

anelda commented 3 weeks ago

Good day,

I am exploring publications by African RSEs and realised there isn't a straightforward way to find papers in JOSS by African authors. Is it possible to add the option to the search algorithm to find papers based on (at least) the first author's country? I realise there are large numbers of African authors affiliated with non-African institutions, but for now, I really would like to know if any African-based RSEs have published/submitted to JOSS and similar journals.

I would be happy to discuss this further!

Thanks for your consideration.

Kind regards,

Anelda

sneakers-the-rat commented 3 weeks ago

questions from this - do we collect this data, should we? implementation questions too - is this a place to hook into openalex rather than rolling our own if we decide to pursue this?
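For anyone who wants to try the OpenAlex route before anything is built into the site, here is a minimal sketch that only builds a query URL. The filter names (`primary_location.source.issn`, `authorships.institutions.country_code`), the `|` OR syntax, and the JOSS ISSN are written from my reading of the OpenAlex docs and should be checked there before relying on them; `joss_works_url` is a made-up helper name. Note that this matches any author's institution country, not only the first author's.

```python
from urllib.parse import urlencode

OPENALEX_WORKS = "https://api.openalex.org/works"
JOSS_ISSN = "2475-9066"  # assumption: the ISSN OpenAlex has on record for JOSS


def joss_works_url(country_codes, per_page=25):
    """Build an OpenAlex works URL for JOSS papers with an author
    affiliated to an institution in any of the given ISO country codes."""
    # OpenAlex treats "|" inside a filter value as OR.
    codes = "|".join(code.upper() for code in country_codes)
    filters = ",".join([
        f"primary_location.source.issn:{JOSS_ISSN}",
        f"authorships.institutions.country_code:{codes}",
    ])
    # Keep ":", "|" and "," unescaped so the URL stays readable.
    query = urlencode({"filter": filters, "per-page": per_page}, safe=":|,")
    return f"{OPENALEX_WORKS}?{query}"


if __name__ == "__main__":
    # Illustrative subset of country codes, not a full list for the continent.
    print(joss_works_url(["za", "ng", "ke"]))
```

The resulting URL can be opened in a browser or fetched with any HTTP client. The caveat raised later in this thread still applies: an institution's country is not necessarily where the author is based.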

danielskatz commented 3 weeks ago

We don't collect this data. A few of us have tried to manually collect data for one year via the author affiliations, but even this was difficult.

If we wanted to do a better job of standardizing and collecting author affiliations, we could try to use ROR IDs in the same way we use ORCIDs for authors. There would be challenges in getting authors used to this, but we could probably succeed. My guess is most journals will eventually do this, but that's likely years away.

On the other hand, affiliations are not the same as locations, given remote employees. We don't now ask for where people are located, and I don't think we have a reason to do so.

A (difficult) way to gather some of this data, since we are trying to make sure people add countries to the affiliations (though this is relatively recent and many older papers won't have this), and because the author affiliation data in the paper is not in the metadata we submit to Crossref, would be potentially to download the PDFs of JOSS papers, use grobid to find the author information, parse it, and look for countries in Africa.
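The parsing step could look roughly like the sketch below. It assumes grobid has already been run on the PDF and returned TEI XML for the header; the sample fragment is hand-written in that style rather than real grobid output, and the `<country key="...">` element, the country-code set, and the function names are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Hypothetical, hand-written fragment in the style of a TEI header.
SAMPLE_TEI = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader><fileDesc><sourceDesc><biblStruct><analytic>
    <author>
      <persName><forename type="first">Jane</forename><surname>Doe</surname></persName>
      <affiliation>
        <orgName type="institution">Example University</orgName>
        <address><country key="ZA">South Africa</country></address>
      </affiliation>
    </author>
    <author>
      <persName><forename type="first">John</forename><surname>Roe</surname></persName>
      <affiliation>
        <orgName type="institution">Sample Institute</orgName>
        <address><country key="DE">Germany</country></address>
      </affiliation>
    </author>
  </analytic></biblStruct></sourceDesc></fileDesc></teiHeader>
</TEI>"""

# Illustrative subset only; a real check needs the full list of country codes.
AFRICAN_CODES = {"ZA", "NG", "KE", "GH", "EG"}


def authors_with_countries(tei_xml):
    """Return [(surname, {country codes})] in document order."""
    root = ET.fromstring(tei_xml)
    results = []
    for author in root.iterfind(".//tei:author", TEI_NS):
        surname = author.findtext(".//tei:surname", default="", namespaces=TEI_NS)
        codes = {
            country.get("key")
            for country in author.iterfind(".//tei:country", TEI_NS)
            if country.get("key")
        }
        results.append((surname, codes))
    return results


def first_author_in_africa(tei_xml):
    """True if the first listed author has an affiliation country in AFRICAN_CODES."""
    authors = authors_with_countries(tei_xml)
    return bool(authors) and bool(authors[0][1] & AFRICAN_CODES)


if __name__ == "__main__":
    print(authors_with_countries(SAMPLE_TEI))
    print(first_author_in_africa(SAMPLE_TEI))
```

Older papers without a country in the affiliation would come back with an empty set here, which is the gap mentioned above.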

sneakers-the-rat commented 3 weeks ago

Thanks Daniel.

> would be potentially to download the PDFs of JOSS papers, use grobid to find the author information, parse it, and look for countries in Africa.

Or grab the JATS from joss-papers, which has ORCID and affiliation as XML, if you don't need all the grobid output, right? (With the same caveats you raise that institution !== location, but ORCID does allow people to report location in their profile, so that would probably be the best you could do I think?)
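A rough sketch of what I mean, using only the standard library. The sample XML is a hand-written stand-in using common JATS elements (`contrib`, `contrib-id`, `xref`, `aff`); I have not checked it against what joss-papers actually emits, and the country list and function names are placeholders. The country match is a naive substring check on free-text affiliations, so it will miss papers that never name a country.

```python
import xml.etree.ElementTree as ET

# Hypothetical JATS fragment; real files may nest or label things differently.
SAMPLE_JATS = """<article>
  <front><article-meta><contrib-group>
    <contrib contrib-type="author">
      <contrib-id contrib-id-type="orcid">https://orcid.org/0000-0000-0000-0000</contrib-id>
      <name><surname>Doe</surname><given-names>Jane</given-names></name>
      <xref ref-type="aff" rid="aff-1"/>
    </contrib>
    <contrib contrib-type="author">
      <name><surname>Roe</surname><given-names>John</given-names></name>
      <xref ref-type="aff" rid="aff-2"/>
    </contrib>
    <aff id="aff-1"><institution-wrap><institution>Example University, South Africa</institution></institution-wrap></aff>
    <aff id="aff-2"><institution-wrap><institution>Sample Institute, Germany</institution></institution-wrap></aff>
  </contrib-group></article-meta></front>
</article>"""

# Illustrative subset only.
AFRICAN_COUNTRIES = ("South Africa", "Nigeria", "Kenya", "Ghana", "Egypt")


def author_affiliations(jats_xml):
    """Return one dict per author with surname, ORCID (or None) and affiliation strings."""
    root = ET.fromstring(jats_xml)
    # Map each <aff id="..."> to its whitespace-normalised text.
    aff_text = {
        aff.get("id"): " ".join("".join(aff.itertext()).split())
        for aff in root.iter("aff")
    }
    authors = []
    for contrib in root.iter("contrib"):
        if contrib.get("contrib-type") != "author":
            continue
        affs = [
            aff_text.get(xref.get("rid"), "")
            for xref in contrib.iter("xref")
            if xref.get("ref-type") == "aff"
        ]
        authors.append({
            "surname": contrib.findtext("name/surname", default=""),
            "orcid": contrib.findtext("contrib-id[@contrib-id-type='orcid']"),
            "affiliations": affs,
        })
    return authors


def african_countries_mentioned(affiliations):
    """Naive, case-insensitive substring match against AFRICAN_COUNTRIES."""
    found = set()
    for text in affiliations:
        lowered = text.lower()
        found.update(c for c in AFRICAN_COUNTRIES if c.lower() in lowered)
    return found


if __name__ == "__main__":
    for author in author_affiliations(SAMPLE_JATS):
        print(author["surname"], african_countries_mentioned(author["affiliations"]))
```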

I think doing a third/fourth order query like (paper => institution => country => continent) is probably a bit out of scope for the website, but I wonder if we could make an "advanced search" that would let you query specific fields like institution?
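To make the "advanced search" idea concrete, here is a toy sketch of field-prefixed queries such as `institutions:"cape town"`. This is not how the JOSS site's search works; the `Paper` record, the field names, and the query syntax are all invented for illustration, and it stops at the institution level rather than resolving country or continent.

```python
from dataclasses import dataclass, field


@dataclass
class Paper:
    # Hypothetical record; the real site stores papers differently.
    title: str
    authors: list = field(default_factory=list)
    institutions: list = field(default_factory=list)


SEARCHABLE_FIELDS = {"title", "authors", "institutions"}


def parse_query(query):
    """Split 'field:term' into (field, term); fall back to a title search."""
    name, sep, term = query.partition(":")
    if sep and name.strip().lower() in SEARCHABLE_FIELDS:
        return name.strip().lower(), term.strip().strip('"')
    return "title", query.strip()


def search(papers, query):
    """Case-insensitive substring match restricted to one field."""
    field_name, term = parse_query(query)
    needle = term.lower()
    hits = []
    for paper in papers:
        value = getattr(paper, field_name)
        values = [value] if isinstance(value, str) else value
        if any(needle in v.lower() for v in values):
            hits.append(paper)
    return hits


if __name__ == "__main__":
    papers = [
        Paper("Tool A", ["Jane Doe"], ["University of Cape Town"]),
        Paper("Tool B", ["John Roe"], ["Sample Institute"]),
    ]
    print([p.title for p in search(papers, 'institutions:"cape town"')])
```

Because affiliations are free text, a search like this only finds what authors happened to type, which is a large part of why the further hops to country and continent look out of scope.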

danielskatz commented 3 weeks ago

Good point re JATS.

re affiliations, they're not standardized which is why I think ROR IDs would be good to move to one day.

anelda commented 3 weeks ago

@danielskatz @sneakers-the-rat, thanks for your speedy responses and highlighting the complexity of implementing my request.

I assumed that the author affiliation and associated country were collected because they appear on the paper :smile:

Thanks for pointing me to openalex. While the results are not so easy to work with, it did give me a great solution for some other challenges we were working on (i.e. finding African RSE-related papers to feature in the RSSE Africa newsletter).

Thanks!