lukasschwab / arxiv.py

Python wrapper for the arXiv API
MIT License
1.07k stars 120 forks source link

max_results: float vs. int #123

Closed radoshi closed 11 months ago

radoshi commented 11 months ago

Hello! Thanks for the neat library.

The max_results parameter on the Search object makes it appear that you want a float (to deal with inf). However, passing in an actual float, results in an error.

Here it is, exercised as a test: https://gist.github.com/radoshi/a43862d438dc0d838d10b89ff0e0a564

I'm curious why you chose the float(inf) approach vs just hardcoding to max_results=300000 https://gist.github.com/radoshi/a9e3d0d9b947459c4ff96a46f6e35d55

Easy PR, happy to open it. Thanks again for the lib.

lukasschwab commented 11 months ago

You're right, it should be an integer. I think I was looking for some MAX_INT-ish constant and couldn't find one; ints are apparently unbounded in Python 3.

I'd gladly merge a PR!

The library still needs an unbounded default. Treating negative-max_results queries as unbounded queries is probably the most conventional.

arxiv.Search(
  query: str = "test",
  max_results: int = -1 # Equivalent to the old float("inf") behavior.
)

Good ticket — thanks for raising it, Lukas

radoshi commented 11 months ago

Cool. I'll get something to you. I think -1 will create a different issue, likely in this line: page_size = min(self.page_size, search.max_results - offset)

We can continue to the conversation on the PR.

radoshi commented 11 months ago

PR: https://github.com/lukasschwab/arxiv.py/pull/124 Ready for your review.

lukasschwab commented 11 months ago

Closed by #138.