cameronneylon opened this issue 10 years ago
An additional issue with very large sets of queries is that the scraper appears to overload the OAG service, resulting in a null return that raises an uncaught ValueError.
It seems easy enough to solve this by backing off and handling the failures gracefully, but it's worth considering whether a solution can be tied to handling the polling so as to populate the full set of responses.
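A minimal sketch of the backoff idea, assuming the standard OAG lookup endpoint and that the overload shows up as an empty/unparseable response body (the URL and helper name here are illustrative, not the actual scraper code):

```python
import time
import requests

OAG_LOOKUP_URL = "http://oag.cottagelabs.com/lookup"  # assumed endpoint

def post_with_backoff(dois, max_retries=5, base_delay=2.0):
    """POST a batch of DOIs, backing off when OAG returns an empty/null body."""
    for attempt in range(max_retries):
        resp = requests.post(OAG_LOOKUP_URL, json=dois)
        try:
            return resp.json()
        except ValueError:
            # Empty or null body, presumably because the service is overloaded.
            # Back off exponentially instead of letting the ValueError propagate.
            time.sleep(base_delay * (2 ** attempt))
    raise RuntimeError("OAG returned no parseable JSON after %d retries" % max_retries)
```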
I've got code that sort of does this now. Will send a pull request in a bit. Policy/UI questions before that, though.
The Open Article Gauge updates the results for a given POST request asynchronously. The immediately returned JSON will only include objects for which a license is already cached by OAG. The delay for obtaining license information for a large set of previously unseen DOIs can be substantial (minutes to hours).
I'm not sure how this should be managed in practice, but some sort of delay or polling until the full set returns may be necessary. When fully populated, the returned JSON object should include some information for every DOI. It may be effective to suggest that the user re-run the data gathering after a delay until the full set is returned.
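One way the polling could look, as a rough sketch only: it reuses the hypothetical `post_with_backoff` helper above and assumes each result object carries an `identifier` list and a `license` list once OAG has processed it, which is an assumption about the response shape rather than a documented contract:

```python
import time

def poll_until_complete(dois, interval=60, timeout=3600):
    """Re-POST the outstanding DOIs until every one has license info or we time out."""
    deadline = time.time() + timeout
    results = {}
    pending = list(dois)
    while pending and time.time() < deadline:
        response = post_with_backoff(pending)
        for record in response.get("results", []):
            if record.get("license"):
                # Identifier structure here is assumed, not taken from OAG docs.
                doi = record["identifier"][0]["id"]
                results[doi] = record
        pending = [d for d in pending if d not in results]
        if pending:
            time.sleep(interval)
    return results, pending  # pending holds whatever never came back
```

Whether this loop lives in the scraper itself or is surfaced to the user as a "re-run later" prompt is exactly the policy/UI question mentioned above.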