Planteome / samara

extracts plant trait data from open data sources like apsnet and ars-grin
MIT License

apsnet common disease pages no longer available via http://www.apsnet.org/publications/commonnames/Pages/default.aspx #57

Closed jhpoelen closed 5 years ago

jhpoelen commented 5 years ago

Samara supports extracting plant diseases from http://www.apsnet.org/publications/commonnames/Pages/default.aspx .

After checking the logs, it appears that the resources are no longer available via apsnet.org. Thanks to https://build.berkeleybop.org/view/Planteome/job/extract-apsnet-diseases , I could tell that the pages were last available on 11 Oct 2018. The last available scrape has been included in the repository at https://github.com/globalbioticinteractions/aps .

from https://build.berkeleybop.org/view/Planteome/job/extract-apsnet-diseases/168/console :

[http://www.apsnet.org/publications/commonnames/Pages/default.aspx] download failed because of:
org.jsoup.HttpStatusException: HTTP error fetching URL. Status=404, URL=https://www.apsnet.org/publications/commonnames/Pages/default.aspx

jhpoelen commented 5 years ago

@cmungall @marieALaporte @jaiswalp - please note that APS's common names for disease pages were last available on 2018-10-11. I was unable to find the new location of the pages, if there is one. Please holler if you know more about this.

marieALaporte commented 5 years ago

This is the new link. They just changed the website apparently. @jhpoelen, are you trying to update the data?

jhpoelen commented 5 years ago

Turns out that the pages were moved to https://www.apsnet.org/edcenter/resources/commonnames/Pages/default.aspx ... updating.

jhpoelen commented 5 years ago

Thanks @marieALaporte .

jhpoelen commented 5 years ago

@marieALaporte the job at https://build.berkeleybop.org/view/Planteome/job/extract-apsnet-diseases has been updating the data at regular intervals. Until today, GloBI was harvesting the most recent APS scrape.

jhpoelen commented 5 years ago

Updated samara to use the new endpoint. Please note that workarounds for new issues #58 and #59 were implemented. Ideally, we'd convince the American Phytopathological Society to publish their common names for diseases in a more structured format.
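
For anyone who wants to guard against the next move, here is a minimal sketch of the kind of check that would have flagged this earlier: try a list of candidate URLs and keep the first one that answers with HTTP 200. This is not samara's actual code (the log above shows samara fetching pages with jsoup); it is a standalone Python illustration, and the names `http_status` and `first_available` are hypothetical. The two URLs are the old and new locations quoted in this issue.

```python
from typing import Callable, Iterable, Optional
import urllib.error
import urllib.request

# URLs quoted in this issue: the old location (now 404) and the new one.
OLD_URL = "https://www.apsnet.org/publications/commonnames/Pages/default.aspx"
NEW_URL = "https://www.apsnet.org/edcenter/resources/commonnames/Pages/default.aspx"


def http_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code of a GET request to `url`."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urlopen raises on 4xx/5xx responses; the code is carried on the exception.
        return err.code


def first_available(
    urls: Iterable[str],
    fetch_status: Callable[[str], int] = http_status,
) -> Optional[str]:
    """Return the first URL that responds with status 200, or None if none does.

    `fetch_status` is injectable so the lookup can be exercised without
    touching the network.
    """
    for url in urls:
        try:
            if fetch_status(url) == 200:
                return url
        except OSError:
            # Connection failures and timeouts: treat as unavailable, try the next one.
            continue
    return None
```

Calling `first_available([OLD_URL, NEW_URL])` performs real requests; passing a stub such as `fetch_status={OLD_URL: 404, NEW_URL: 200}.get` exercises the same logic offline. A scheduled job could fail loudly when the result is `None` instead of silently reusing the last scrape.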