ropensci-archive / pubchunks

:warning: ARCHIVED :warning: Get chunks of XML format scholarly articles

mdpi problems #7

Closed sckott closed 4 years ago

sckott commented 5 years ago

the aff node is not coming out right with our current setup. in the <aff> nodes the only text that is broken out is the emails, in <email> tags; the address is just free text within the <aff> tag

right now we're ending up with

                      id label                                                                            email given_names    surname
1 af1-nutrients-03-00049     1 hurni473@student.otago.ac.nz, jody.miller@otago.ac.nz, meredith.rose@otago.ac.nz   Nicola A. Hursthouse
2 af1-nutrients-03-00049     1 hurni473@student.otago.ac.nz, jody.miller@otago.ac.nz, meredith.rose@otago.ac.nz     Jody C.     Miller
3 af1-nutrients-03-00049     1 hurni473@student.otago.ac.nz, jody.miller@otago.ac.nz, meredith.rose@otago.ac.nz Meredith C.       Rose
4 af1-nutrients-03-00049     1 hurni473@student.otago.ac.nz, jody.miller@otago.ac.nz, meredith.rose@otago.ac.nz     Lisa A.   Houghton
5 af2-nutrients-03-00049     2                                                          andrew.gray@otago.ac.nz   Andrew R.       Gray

with many emails for each person
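To make the duplication concrete, here is a minimal Python sketch (not the pubchunks code, which is R) of what happens when each author is joined to their affiliation by id. The XML fragment is a hypothetical JATS-style reconstruction: the element layout, the `xref`/`rid` linkage, and the "Free-text address" placeholder are assumptions, while the names, ids, and emails come from the output above. The helper names `parse_affs` and `author_rows` are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: structure and address text are assumed,
# names/ids/emails are copied from the printed output in this issue.
XML = """
<article>
  <contrib-group>
    <contrib><name><surname>Hursthouse</surname><given-names>Nicola A.</given-names></name>
      <xref ref-type="aff" rid="af1-nutrients-03-00049">1</xref></contrib>
    <contrib><name><surname>Miller</surname><given-names>Jody C.</given-names></name>
      <xref ref-type="aff" rid="af1-nutrients-03-00049">1</xref></contrib>
    <contrib><name><surname>Rose</surname><given-names>Meredith C.</given-names></name>
      <xref ref-type="aff" rid="af1-nutrients-03-00049">1</xref></contrib>
    <contrib><name><surname>Houghton</surname><given-names>Lisa A.</given-names></name>
      <xref ref-type="aff" rid="af1-nutrients-03-00049">1</xref></contrib>
    <contrib><name><surname>Gray</surname><given-names>Andrew R.</given-names></name>
      <xref ref-type="aff" rid="af2-nutrients-03-00049">2</xref></contrib>
  </contrib-group>
  <aff id="af1-nutrients-03-00049"><label>1</label>Free-text address; E-Mails:
    <email>hurni473@student.otago.ac.nz</email>
    <email>jody.miller@otago.ac.nz</email>
    <email>meredith.rose@otago.ac.nz</email></aff>
  <aff id="af2-nutrients-03-00049"><label>2</label>Free-text address; E-Mail:
    <email>andrew.gray@otago.ac.nz</email></aff>
</article>
"""


def parse_affs(root):
    """Map each <aff> id to its label and the list of emails it contains."""
    affs = {}
    for aff in root.iter("aff"):
        affs[aff.get("id")] = {
            "label": aff.findtext("label"),
            # Emails are the only tagged pieces; nothing ties one to a person.
            "emails": [email.text for email in aff.iter("email")],
        }
    return affs


def author_rows(root):
    """Join every author to their affiliation via the xref rid attribute."""
    affs = parse_affs(root)
    rows = []
    for contrib in root.iter("contrib"):
        rid = contrib.find("xref").get("rid")
        aff = affs[rid]
        rows.append({
            "id": rid,
            "label": aff["label"],
            # Every author in the affiliation inherits all of its emails.
            "email": ", ".join(aff["emails"]),
            "given_names": contrib.findtext("name/given-names"),
            "surname": contrib.findtext("name/surname"),
        })
    return rows


rows = author_rows(ET.fromstring(XML))
for row in rows:
    print(row)
```

Under these assumptions the join is correct as a join: the four authors that point at the first affiliation each get the same three emails, because the markup attaches emails to the institution and not to a person.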

sckott commented 4 years ago

that's just a problem in how the affiliations are marked up: many emails are assigned to one institution, not to individual people, so there's no way to automate assigning emails to people.