lukasschwab / arxiv.py

Python wrapper for the arXiv API
MIT License
1.11k stars, 123 forks

what's the normal time for downloading a paper? #41

Closed shizhediao closed 4 years ago

shizhediao commented 4 years ago

Hi, thanks for your great work! I was wondering what the normal time for downloading a paper is. I would like to download as many papers as possible for some research, perhaps 10K to 100K of them. Right now each download takes me 10 seconds, so is it possible to speed this up? Thanks very much!

lukasschwab commented 4 years ago

Hmm, this depends on how you're finding the papers to download.

Probably the most useful: if you just want as many papers as possible, arXiv offers bulk access to tarfiles of PDFs and source files via S3: https://arxiv.org/help/bulk_data_s3
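To put the scale in perspective, here is a rough back-of-the-envelope sketch. It assumes the 10 seconds per paper reported above stays constant and that downloads run strictly one after another; `estimated_total` is a hypothetical helper written for this illustration, not part of arxiv.py.

```python
from datetime import timedelta

# Figure reported earlier in this thread; real timings will vary.
SECONDS_PER_PAPER = 10


def estimated_total(num_papers, seconds_per_paper=SECONDS_PER_PAPER):
    """Estimate wall-clock time for strictly sequential downloads."""
    return timedelta(seconds=num_papers * seconds_per_paper)


for n in (10_000, 100_000):
    print(f"{n} papers: {estimated_total(n)}")
# 10000 papers: 1 day, 3:46:40
# 100000 papers: 11 days, 13:46:40
```

At that rate the lower end of the range takes over a day and the upper end more than eleven days, which is why per-paper downloads are a poor fit for a collection of this size.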

I'll close this issue for the time being; feel free to reopen it if this doesn't answer your question!

shizhediao commented 4 years ago

Thanks for your reply! In my experiment, the call to arxiv.download alone takes 10 seconds. Thanks for pointing out the bulk access; I'll take a look. Thanks very much!
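For anyone wanting to reproduce this kind of measurement, a minimal timing sketch might look like the following. The `timed` helper and the `fake_download` stand-in are hypothetical names introduced for illustration; in practice you would pass your own download call in place of the stand-in, since the exact arxiv.py call depends on the version you have installed.

```python
import time


def timed(fn, *args, **kwargs):
    """Call fn(*args, **kwargs) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed


def fake_download(paper_id):
    """Stand-in for a real download call (assumption: replace with your own)."""
    time.sleep(0.01)  # simulate a little network latency
    return f"{paper_id}.pdf"


path, seconds = timed(fake_download, "example-id")
print(f"downloaded {path} in {seconds:.3f} s")
```

Timing only the download call, as done here, separates the transfer time from the time spent searching for or listing papers.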