Closed Santosh-Gupta closed 5 years ago
I made all the changes except writing tests for the start parameter. I took a look at this
https://github.com/Santosh-Gupta/arxiv.py/blob/master/tests/test_search.py
and I am guessing I just add the start parameter wherever there is a max_result parameter?
But I am not sure how to handle lines 71-73:
    for k, v in parse_qsl(url.split("?")[1]):
        if k == "max_results":
            max_result = int(v)
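For what it's worth, here is a minimal sketch of how that loop could pick up both values. It assumes the start offset shows up in the query string under the key "start", next to "max_results". The helper name paging_params, the default values, and the example URL are made up for illustration and are not taken from the test file.

```python
from urllib.parse import parse_qsl, urlsplit


def paging_params(url):
    """Return (start, max_results) read from the query string of url.

    Hypothetical helper: the defaults (0 and None) are assumptions,
    used when a key is missing from the URL.
    """
    start, max_results = 0, None
    # urlsplit(...).query is the part after "?", or "" if there is none.
    for k, v in parse_qsl(urlsplit(url).query):
        if k == "start":
            start = int(v)
        elif k == "max_results":
            max_results = int(v)
    return start, max_results


# Illustrative URL only.
example_url = "http://export.arxiv.org/api/query?search_query=cat:cs.LG&start=20&max_results=10"
print(paging_params(example_url))
```

A mocked response could then slice its fake entries with these two numbers instead of max_result alone.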
I added time_sleep as a parameter for query, because sometimes it skips results, so I am guessing time_sleep = 3 may be too short. I am experimenting with time_sleep = 5.
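To make the idea concrete, here is a rough sketch of paging with a start offset and a pause between requests. It is not the library's actual code: fetch_page is a hypothetical callable standing in for one API request, and page_size and the injectable sleep argument are assumptions added so the example runs without a network.

```python
import time


def fetch_all(fetch_page, start=0, max_results=100, page_size=10,
              time_sleep=3, sleep=time.sleep):
    """Collect up to max_results items, beginning at offset start.

    fetch_page(offset, limit) is a hypothetical callable returning a list.
    """
    results = []
    offset = start
    end = start + max_results
    while offset < end:
        limit = min(page_size, end - offset)
        page = fetch_page(offset, limit)
        if not page:
            # An empty page ends the loop; with a flaky API this can stop early.
            break
        results.extend(page)
        offset += limit
        if offset < end:
            sleep(time_sleep)  # pause between consecutive requests
    return results


# Stand-in for the API: 25 fake records served by slicing a list.
fake_records = list(range(25))
print(fetch_all(lambda o, n: fake_records[o:o + n],
                start=5, max_results=12, page_size=5, time_sleep=0))
```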
Edit:
Even raising the sleep time to 10 brings in more results. I am experimenting with a high volume of results though, around 80,000.
Edit:
It looks like the API results are just inconsistent. I'm not sure if time_sleep has an effect. I think the only reliable way is to run the query a few times, switching between descending and ascending order, and appending results that are not already present.
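A small sketch of that merge step, assuming each result is a dict with a unique "id" field (an assumption about the result shape, as is the helper name merge_runs):

```python
def merge_runs(runs, key=lambda r: r["id"]):
    """Merge several result lists, keeping the first copy of each record.

    runs is an iterable of lists; key extracts a unique identifier.
    The "id" field used by default is an assumed record shape.
    """
    seen = {}
    for run in runs:
        for record in run:
            # setdefault only stores the record if its key is new.
            seen.setdefault(key(record), record)
    # Dicts preserve insertion order, so first-seen order is kept.
    return list(seen.values())


ascending = [{"id": "a"}, {"id": "b"}]
descending = [{"id": "c"}, {"id": "b"}]
print([r["id"] for r in merge_runs([ascending, descending])])
```

This only fills gaps that at least one run happened to return, so it narrows the inconsistency rather than guaranteeing a complete set.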
Thanks for the extra work here! I'm going to do some work here, including reverting the changes to the time_sleep logic (seeing as you concluded it doesn't make a consistent difference), then merge and roll a new release.
Cheers!
added variable start
Description
Breaking changes
List any changes that break the API usage supported on master.
Relevant issues
List GitHub issues relevant to this change.
Checklist
python setup.py test
README.md