Closed cnglen closed 6 years ago
Should be able to test/merge tonight.
I run into errors testing 22e24aa55cd626e246b650c14498f956a6db44de with the query string you provide; it seems the quote argument isn't supported by urlencode(). I also can't find any mention of the parameter in the documentation.
I'm running urllib3 version 1.23 in both Python 2 and Python 3.
$ pip freeze | grep urllib
urllib3==1.23
Python 2.7.15:
>>> import arxiv
>>> arxiv.query(search_query="au:del_maestro+AND+ti:checkerboard")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "arxiv/arxiv.py", line 31, in query
    "sortOrder": sort_order}, quote='+')
TypeError: urlencode() got an unexpected keyword argument 'quote'
Python 3.7.0:
>>> import arxiv
>>> arxiv.query(search_query="au:del_maestro+AND+ti:checkerboard")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/schwabl/Desktop/arxiv.py/arxiv/arxiv.py", line 31, in query
    "sortOrder": sort_order}, quote='+')
TypeError: urlencode() got an unexpected keyword argument 'quote'
So, some follow-up questions. What do the following commands print on your machine?
$ python --version
$ pip freeze | grep urllib
or $ pip3 freeze | grep urllib
More fundamentally, there's a question of how query strings should function.
The default urllib behavior, via quote_plus(), is to convert each space in the query string to +. Your query is already possible in the existing library:
>>> import arxiv
>>> arxiv.query(search_query="au:del_maestro AND ti:checkerboard")
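To make the default behavior concrete, here is a minimal sketch using only the Python 3 standard library. It does not call arxiv.query() itself; it just shows what urlencode() does to the two forms of the query string. The variable names are mine, for illustration only.

```python
from urllib.parse import urlencode

# Query written with spaces: urlencode() (via quote_plus) turns each space into "+".
with_spaces = urlencode({"search_query": "au:del_maestro AND ti:checkerboard"})

# Query written with literal "+": each "+" is percent-encoded as "%2B",
# so the server no longer sees it as a separator between terms.
with_plusses = urlencode({"search_query": "au:del_maestro+AND+ti:checkerboard"})

print(with_spaces)   # search_query=au%3Adel_maestro+AND+ti%3Acheckerboard
print(with_plusses)  # search_query=au%3Adel_maestro%2BAND%2Bti%3Acheckerboard
```

The first form is the one that ends up with + between the terms on the wire, which is why passing spaces already gives the result the PR is after.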
There are conceivably cases in which a user might want to first URL-encode and then modify their query string before passing it to arxiv.query(). This could be accommodated by defining a wrapper for urllib.parse.quote(s, safe="+"):
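from urllib.parse import quote  # Python 3 only
def query_with_plusses(string, safe="", encoding=None, errors=None):
    # Like quote(), but always leaves "+" unescaped.
    return quote(string, safe=safe + "+", encoding=encoding, errors=errors)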
My concern is that this would produce issues for those expecting the standard urllib behavior: for example, those who want to include a literal + in part of their query, e.g. if they're searching for a paper entitled Odds+Ends. This should not be misinterpreted as a URL-encoded space by the server, so the character should be percent-encoded (as %2B).
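The ambiguity can be sketched with a round trip through the standard library. This assumes the server decodes query strings the way unquote_plus() does (treating + as a space); I haven't verified how the arXiv server actually decodes them, so take it as an illustration of the risk rather than a description of the API.

```python
from urllib.parse import quote, quote_plus, unquote_plus

title = "Odds+Ends"

standard = quote_plus(title)        # standard behavior: "+" becomes "%2B"
plus_safe = quote(title, safe="+")  # "+" marked safe: left untouched

# Decode both as a form-style decoder would (assumed server behavior).
decoded_standard = unquote_plus(standard)
decoded_plus_safe = unquote_plus(plus_safe)

print(standard, "->", decoded_standard)    # Odds%2BEnds -> Odds+Ends
print(plus_safe, "->", decoded_plus_safe)  # Odds+Ends -> Odds Ends
```

With + marked safe, the title comes back as two words, which is exactly the misinterpretation described above.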
Additionally, Python 2's urllib.urlencode() doesn't support specifying an alternative to quote_plus() (Python 3.5+ accepts a quote_via argument for this; Python 2 has no equivalent). It'd take significant redundant code to implement this behavior for both Python 2 and Python 3.
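For reference, this is roughly what the Python 3 half would look like, using the quote_via parameter of urllib.parse.urlencode() (available from Python 3.5). The helper name quote_keep_plus is hypothetical, and nothing here works on Python 2, which is the crux of the problem.

```python
from urllib.parse import quote, urlencode

def quote_keep_plus(string, safe="", encoding=None, errors=None):
    """Hypothetical quoting function: like quote(), but never escapes "+"."""
    return quote(string, safe=safe + "+", encoding=encoding, errors=errors)

params = {"search_query": "au:del_maestro+AND+ti:checkerboard", "start": 0}

# urlencode() calls quote_via(string, safe, encoding, errors) for every key and value.
encoded = urlencode(params, quote_via=quote_keep_plus)

print(encoded)  # search_query=au%3Adel_maestro+AND+ti%3Acheckerboard&start=0
```

Note that with quote() as the base, any spaces in the query would come out as %20 rather than +, so the two input styles would no longer be interchangeable.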
Because the functionality in question is already available in arxiv 0.2.3 by using spaces in the query string, and because a working enhancement is likely to interfere with expected query behavior and/or impact Python 2 support, I'm going to close the PR.
If there's a cleaner solution I'm missing (perhaps by refactoring the query() function so that the request and the URL-encoding are separate), feel free to push those changes. I'll gladly reopen and review! 😃
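One possible shape for such a refactor, purely as a sketch: split the work into an encoding step and a URL-building step, so a caller who wants custom encoding can skip the first. The function names, the reduced parameter set, and the API_ROOT value are my assumptions for illustration; this is not the library's current code, and it makes no network request.

```python
from urllib.parse import urlencode

API_ROOT = "http://export.arxiv.org/api/query"  # assumed endpoint, for illustration

def encode_query(search_query, start=0, max_results=10):
    """Hypothetical helper: URL-encode the parameters with standard urllib behavior."""
    return urlencode({
        "search_query": search_query,
        "start": start,
        "max_results": max_results,
    })

def build_url(encoded_query):
    """Hypothetical helper: attach an already-encoded query string to the API root."""
    return API_ROOT + "?" + encoded_query

# Default path: the library encodes for the caller.
url_default = build_url(encode_query("au:del_maestro AND ti:checkerboard"))

# Custom path: the caller supplies their own pre-encoded string.
url_custom = build_url("search_query=au:del_maestro+AND+ti:checkerboard")

print(url_default)
print(url_custom)
```

The default path keeps today's behavior, while the custom path leaves responsibility for correct encoding entirely with the caller.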
Thanks for your interest in arxiv! Let me know if I can clarify anything.
Maybe this is the best way (as you suggested):
arxiv.query(search_query="au:del_maestro AND ti:checkerboard")
Thanks.
enable + in query="au:del_maestro+AND+ti:checkerboard"