neuroquery / pubget

Collecting papers from PubMed Central and extracting text, metadata and stereotactic coordinates.
https://neuroquery.github.io/pubget/
MIT License

extract author affiliations into authors.csv #37

Closed by koudyk 1 year ago

koudyk commented 1 year ago

I just realized this doesn't work for all articles. I think different journals list the affiliations in different ways, and this is reflected in differences in the XML files.

jeromedockes commented 1 year ago

I'm not surprised that affiliations are handled in rather diverse ways -- the archiving tag suite is meant to be very flexible, so there is always a bit of variability! We'll need to look at some examples; maybe a few patterns cover the majority of articles.

Also, I see you closed the PR; we can list the different ways we see affiliations encoded in different articles in an issue and eventually open another PR.
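As a possible starting point for that list, here is a minimal sketch of two encodings that can appear in JATS-style article XML: an `<aff>` nested directly inside the author's `<contrib>`, and an `<xref ref-type="aff" rid="...">` pointing to an `<aff id="...">` elsewhere in the document. This is not the code from the PR or from pubget; the function names `author_affiliations` and `_aff_text` are hypothetical, it uses only the standard library, and it assumes un-namespaced elements. Other encodings very likely exist and are not handled.

```python
import xml.etree.ElementTree as ET


def _aff_text(aff):
    """Return the text of an <aff> element, skipping any <label> child."""
    parts = [aff.text or ""]
    for child in aff:
        if child.tag != "label":
            parts.append("".join(child.itertext()))
        parts.append(child.tail or "")
    # Collapse runs of whitespace left over from the XML layout.
    return " ".join("".join(parts).split())


def author_affiliations(article_xml):
    """Map each author to a list of affiliation strings.

    Hypothetical helper covering only two patterns:
    1. <aff> nested directly inside <contrib>
    2. <xref ref-type="aff" rid="..."> referencing <aff id="...">
    """
    root = ET.fromstring(article_xml)
    # Index every <aff> that carries an id so xref targets can be resolved.
    aff_by_id = {
        aff.get("id"): _aff_text(aff)
        for aff in root.iter("aff")
        if aff.get("id")
    }
    results = []
    for contrib in root.iter("contrib"):
        if contrib.get("contrib-type") != "author":
            continue
        given = contrib.findtext("name/given-names", default="")
        surname = contrib.findtext("name/surname", default="")
        name = f"{given} {surname}".strip()
        # Pattern 1: affiliation nested in the contrib element.
        affiliations = [_aff_text(aff) for aff in contrib.findall("aff")]
        # Pattern 2: cross-references; rid may hold several ids.
        for xref in contrib.findall("xref"):
            if xref.get("ref-type") != "aff":
                continue
            for rid in xref.get("rid", "").split():
                if rid in aff_by_id:
                    affiliations.append(aff_by_id[rid])
        results.append((name, affiliations))
    return results
```

Authors whose affiliations use neither pattern come back with an empty list, which could be one way to find the articles that still need a new pattern.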