DataONEorg / scythe

Scythe, the data citation harvester

support queries for URL-references to datasets #27

Open mbjones opened 3 years ago

mbjones commented 3 years ago

Many datasets are referenced in the text by their URL rather than by their DOI per se. I think that scythe should be able to detect these, even if the URL has been URL-escaped. Here's an example paper that cites an Arctic Data Center dataset using the following text in the Acknowledgements:

We thank Wendy Ermold for producing the inset map showing the regions comprising the three regions of the box model. Data utilized in the calculation of net precipitation minus evaporation rates are available online at http://rda.ucar.edu/datasets/ds627.0/. Salinity and stable oxygen isotope data used to estimate the meteoric water flux through Bering Strait are available at http://pacmars.eol.ucar.edu/dsaccess.html. Data utilized to estimate freshwater components (Pacific water, meteoric water, and sea‐ice meltwater) in the Lincoln Sea were downloaded from the NSF Arctic Data Center (https://arcticdata.io/catalog/#view/doi:10.18739/A2T02C).
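To make the request concrete, here is a minimal sketch of how a DOI embedded in a view URL might be recovered from text like the above, including the URL-escaped case. It is written in Python for illustration only and is not scythe's implementation; the function name `extract_dois` and the DOI pattern are assumptions, and the pattern is a common heuristic rather than a complete DOI grammar.

```python
import re
from urllib.parse import unquote

# Heuristic DOI pattern (an assumption, not an official grammar):
# "10.", a 4-9 digit registrant code, "/", then a suffix up to
# whitespace or a quote/angle bracket.
DOI_PATTERN = re.compile(r'\b10\.\d{4,9}/[^\s"<>]+')


def extract_dois(text: str) -> list[str]:
    """Return the unique DOIs found in text, in order of appearance.

    The text is percent-decoded first, so a DOI inside a URL-escaped
    link (e.g. "doi%3A10.18739%2FA2T02C") is found as well.
    """
    decoded = unquote(text)
    found = []
    for match in DOI_PATTERN.findall(decoded):
        # Drop sentence punctuation that trails the URL in running text.
        doi = match.rstrip(".,;)")
        if doi not in found:
            found.append(doi)
    return found


plain = ("were downloaded from the NSF Arctic Data Center "
         "(https://arcticdata.io/catalog/#view/doi:10.18739/A2T02C).")
escaped = "https://arcticdata.io/catalog/%23view/doi%3A10.18739%2FA2T02C"

print(extract_dois(plain))
print(extract_dois(escaped))
```

Both calls yield the same DOI, `10.18739/A2T02C`. Note that this only helps when the source being searched exposes the full text; it does nothing if the acknowledgements are not indexed.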

jeanetteclark commented 3 years ago

Thanks for filing @mbjones - I agree that it would be good to support this. #13 isn't very specific, but the case you described is one of the cases I envisioned covering when I filed that issue.

As for this particular citation - I don't think the view URL is the reason it is not being picked up. Although this journal is indexed by Scopus, Scopus only indexes the bibliographic information, so none of this text in the acknowledgements is available to the search anyway. I'll file a separate issue to try to find a way to search the full text for AGU journals - there may be another source I'm not aware of yet.