neuroquery / pubget

Collecting papers from PubMed Central and extracting text, metadata and stereotactic coordinates.
https://neuroquery.github.io/pubget/
MIT License

missing links #33

Closed adelavega closed 1 year ago

adelavega commented 1 year ago

It seems that certain links are missed during extraction. They are removed from the text but not put into links.csv.

Example: PMCID: 7856411

Text:

Data from this study are publicly available at NeuroVault at NeuroVault (<uri xlink:href="https://neurovault.org/collections/1866/">https://neurovault.org/collections/1866/</uri>).

The link does not appear in links.csv, but it is removed from the text. Arguably, the text between the <uri> tags should not be removed either.
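
To make the expected behavior concrete, here is a minimal sketch using only Python's standard library. It is not pubget's actual extraction code: the function names `extract_uri_links` and `text_keeping_uris` are hypothetical, and the wrapping `<p>` element with the `xmlns:xlink` declaration is an assumption added so the fragment parses on its own.

```python
import xml.etree.ElementTree as ET

# Attribute name for xlink:href once the namespace prefix is resolved.
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"


def extract_uri_links(xml_fragment):
    """Return the target of every <uri> element (hypothetical helper)."""
    root = ET.fromstring(xml_fragment)
    links = []
    for element in root.iter("uri"):
        # Prefer the xlink:href attribute, fall back to the element text.
        href = element.get(XLINK_HREF) or (element.text or "").strip()
        if href:
            links.append(href)
    return links


def text_keeping_uris(xml_fragment):
    """Return the plain text, keeping the content of <uri> elements."""
    root = ET.fromstring(xml_fragment)
    return "".join(root.itertext())


# The sentence from PMCID 7856411, wrapped in an assumed <p> element.
fragment = (
    '<p xmlns:xlink="http://www.w3.org/1999/xlink">'
    "Data from this study are publicly available at NeuroVault at NeuroVault "
    '(<uri xlink:href="https://neurovault.org/collections/1866/">'
    "https://neurovault.org/collections/1866/</uri>).</p>"
)

print(extract_uri_links(fragment))
print(text_keeping_uris(fragment))
```

With this input, I would expect the URL to end up both in the list of links (the row that is missing from links.csv) and in the extracted text.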