Closed by jannisborn 11 months ago
Still downloading but looks like this now:
>>> from paperscraper.get_dumps import biorxiv, medrxiv, chemrxiv
WARNING:paperscraper.load_dumps: No dump found for biorxiv. Skipping entry.
WARNING:paperscraper.load_dumps: No dump found for chemrxiv. Skipping entry.
WARNING:paperscraper.load_dumps: No dump found for medrxiv. Skipping entry.
WARNING:paperscraper.load_dumps: No dumps found for either biorxiv or medrxiv. Consider using paperscraper.get_dumps.* to fetch the dumps.
>>> medrxiv()
5101it [03:59, 22.87it/s]ERROR:paperscraper.xrxiv.xrxiv_api:Connection error: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response')). Retrying (1/10)
26101it [24:57, 5.00it/s]
Close #34
When scraping biorxiv/medrxiv, occasional connection errors occur, as described in #34. With this PR, we handle such errors more gracefully and attempt up to `max_retries` retries to download the same batch of papers.

Version bump to 0.2.8.
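
For readers unfamiliar with the pattern, the retry behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `call_with_retries`, `fetch_batch`, and `delay` are hypothetical names, the default of 10 is only inferred from the `Retrying (1/10)` log line, and the built-in `ConnectionError` stands in for whatever exception the HTTP client actually raises (with `requests`, that would be `requests.exceptions.ConnectionError`, which is a different class).

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retries(
    fetch_batch: Callable[[], T],  # hypothetical: downloads one batch of papers
    max_retries: int = 10,         # assumed default, inferred from the log above
    delay: float = 0.0,            # hypothetical pause between attempts, in seconds
) -> T:
    """Call fetch_batch, retrying the same batch on connection errors."""
    attempt = 0
    while True:
        try:
            return fetch_batch()
        except ConnectionError as exc:
            attempt += 1
            if attempt > max_retries:
                # All retries are used up: surface the original error.
                raise
            print(f"Connection error: {exc}. Retrying ({attempt}/{max_retries})")
            time.sleep(delay)
```

Because the same callable is invoked again after a failure, the batch that was interrupted is requested again rather than skipped, so a transient disconnect does not leave a gap in the dump.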