Open rafguns opened 1 year ago
OTOH, we do store the fields `error` and `status_code` in table `doi_fulltext`. In vabb14-preliminary, no errors are registered, but in vabb13-preliminary there are 143, according to this query:
select *
from doi_fulltext
where error is not NULL
These include HTTP errors (401, 403, 429...) as well as, e.g., "Time out, URL or connection error" or "SSL error".
But I think this issue is still about something else: we go through all the steps, but in the end we cannot access a full-text document without encountering a technical issue. It seems very useful to register that as well. But shouldn't we then register all errors in the same `doi_error` table? And if so, do we need to rethink the `LookupResult` structure?
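To make the question a bit more concrete, here is a rough sketch of what registering every failure in one error table could look like. The column layout and the `register_error` helper are assumptions for illustration only; the real `doi_error` table may look different, and the DOI and step name are made up.

```python
import sqlite3

# Hypothetical schema: the actual doi_error table may have other columns.
SCHEMA = """
CREATE TABLE IF NOT EXISTS doi_error (
    doi TEXT NOT NULL,
    step TEXT NOT NULL,
    status_code INTEGER,
    error TEXT
)
"""


def register_error(conn, doi, step, status_code=None, error=None):
    """Store one failed attempt; status_code stays NULL for non-HTTP failures."""
    with conn:  # commits on success, rolls back on exception
        conn.execute(
            "INSERT INTO doi_error (doi, step, status_code, error) "
            "VALUES (?, ?, ?, ?)",
            (doi, step, status_code, error),
        )


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)

# An HTTP error and a connection-level error for the same (made-up) DOI.
register_error(conn, "10.1234/example", "fulltext_download", 403, "Forbidden")
register_error(conn, "10.1234/example", "fulltext_download", error="SSL error")

rows = conn.execute(
    "SELECT doi, step, status_code, error FROM doi_error ORDER BY rowid"
).fetchall()
```

With a layout like this, HTTP errors and connection-level errors end up in the same place and can be told apart by whether `status_code` is NULL.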
Correction to previous comment: I think the current structure ensures that we will basically never get errors in the database. In `retrieve_fulltexts` we go through a number of steps to try to find a full text, but if one fails, we just proceed to the next one. If all fail, we return `None` and nothing gets registered. I think previously we saved the result of each step to the database(?), which explains why vabb13-preliminary.db does contain error results.
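One possible direction, as a minimal sketch rather than a description of the current code: keep the "try each step, fall through on failure" behaviour, but return an object that remembers why each step failed instead of a bare `None`. All names here (`StepError`, `RetrievalOutcome`, `retrieve_with_errors`, the step names) are hypothetical and not taken from the codebase.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class StepError:
    step: str
    message: str


@dataclass
class RetrievalOutcome:
    doi: str
    fulltext: Optional[str] = None
    errors: list[StepError] = field(default_factory=list)


def retrieve_with_errors(
    doi: str, steps: dict[str, Callable[[str], Optional[str]]]
) -> RetrievalOutcome:
    """Try each step in order; record every failure instead of dropping it."""
    outcome = RetrievalOutcome(doi=doi)
    for name, step in steps.items():
        try:
            text = step(doi)
        except Exception as exc:  # broad on purpose: any failure is recorded
            outcome.errors.append(StepError(name, str(exc)))
            continue
        if text is not None:
            outcome.fulltext = text
            break
        outcome.errors.append(StepError(name, "no result"))
    return outcome


# Two stand-in steps that both fail, to show what gets recorded.
def failing_step(doi):
    raise TimeoutError("Time out, URL or connection error")


def empty_step(doi):
    return None


outcome = retrieve_with_errors(
    "10.1234/example", {"publisher": failing_step, "repository": empty_step}
)
```

Here `outcome.fulltext` stays `None`, but `outcome.errors` still holds one entry per failed step, so the caller has something to write to the database even when nothing was found.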
At the moment, DOIs for which we don't find a result are not stored in the database. This has two downsides: