python / cpython

The Python programming language
https://www.python.org
Other
63.15k stars 30.24k forks source link

docs: urllib.request.Request not accepting iterables data type #80245

Closed 97ea1439-d2fd-4aa2-a88a-d9d5b34e68d0 closed 5 years ago

97ea1439-d2fd-4aa2-a88a-d9d5b34e68d0 commented 5 years ago
BPO 36064
Nosy @orsenthil, @vadmium, @matrixise, @willingc, @tirkarthi
PRs
  • python/cpython#11976
  • python/cpython#11990
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields: ```python assignee = None closed_at = created_at = labels = ['easy', '3.8', 'type-feature', '3.7', 'docs'] title = 'docs: urllib.request.Request not accepting iterables data type' updated_at = user = 'https://bugs.python.org/sylye' ``` bugs.python.org fields: ```python activity = actor = 'mdk' assignee = 'docs@python' closed = True closed_date = closer = 'mdk' components = ['Documentation'] creation = creator = 'sylye' dependencies = [] files = [] hgrepos = [] issue_num = 36064 keywords = ['patch', 'easy'] message_count = 17.0 messages = ['336198', '336199', '336200', '336202', '336203', '336208', '336209', '336210', '336213', '336214', '336215', '336216', '336217', '336290', '339092', '339093', '339096'] nosy_count = 7.0 nosy_names = ['orsenthil', 'docs@python', 'martin.panter', 'matrixise', 'willingc', 'xtreak', 'sylye'] pr_nums = ['11976', '11990'] priority = 'normal' resolution = 'fixed' stage = 'resolved' status = 'closed' superseder = None type = 'enhancement' url = 'https://bugs.python.org/issue36064' versions = ['Python 3.6', 'Python 3.7', 'Python 3.8'] ```

    97ea1439-d2fd-4aa2-a88a-d9d5b34e68d0 commented 5 years ago

    I found out in the docs 3.6, in the class urllib.request.Request, for the input of 'data' data types, it says :

    "The supported object types include bytes, file-like objects, and iterables."

    But after testing it with data type dict for the 'data' input, I got error of:

    "can't concat str to bytes"

    It seems the docs should't say the 'data' data types support iterables.

    There more detail discussion is at : https://stackoverflow.com/questions/54802272/some-fundamental-concept-used-in-python-docs

    Hope this helps, thanks !

    matrixise commented 5 years ago

    Thank you for your issue but could you provide an example file?

    Have a nice day.

    For the new contributors, this issue is marked as easy.

    matrixise commented 5 years ago

    Hi Lye,

    I didn't read your link, sorry you don't need to provide an example. We can work with this example.

    matrixise commented 5 years ago

    Hi Lye,

    I am not sure it's a bug because I don't have any problem with your script.

    Here is your script.

    import urllib
    import urllib.request
    
    my_url = "https://api.foo.com"
    my_headers = { "Content-Type" : "application/x-www-form-urlencoded" }
    my_data = {
            "client_id" : "ppp",
            "client_secret" : "000",
            "grant_type" : "client_credentials" }
    
    req = urllib.request.Request(url=my_url, data=my_data, headers=my_headers)
    response = urllib.request.urlopen(req)
    html = response.read()
    print(html)

    and here is the result with the version 3.6 of Python:

    > docker run --rm -v $PWD:/src -it python:3.6 python /src/demo.py
    Traceback (most recent call last):
      File "/usr/local/lib/python3.6/urllib/request.py", line 1318, in do_open
        encode_chunked=req.has_header('Transfer-encoding'))
      ...
      File "/usr/local/lib/python3.6/socket.py", line 713, in create_connection
        sock.connect(sa)
    ConnectionRefusedError: [Errno 111] Connection refused
    
    During handling of the above exception, another exception occurred:
    
    Traceback (most recent call last):
      File "/src/demo.py", line 12, in <module>
        response = urllib.request.urlopen(req)
      ...
      File "/usr/local/lib/python3.6/urllib/request.py", line 1320, in do_open
        raise URLError(err)
    urllib.error.URLError: <urlopen error [Errno 111] Connection refused>

    I don't get your error.

    and the version of this docker image is 3.6.8

    docker run --rm -v $PWD:/src -it python:3.6 python --version Python 3.6.8

    matrixise commented 5 years ago

    and I get the same result with the python:3.6.7 image docker

    > docker run --rm -v $PWD:/src -it python:3.6.7 python /src/demo.py
    Traceback (most recent call last):
      File "/usr/local/lib/python3.6/urllib/request.py", line 1318, in do_open
        encode_chunked=req.has_header('Transfer-encoding'))
      ...
      File "/usr/local/lib/python3.6/socket.py", line 713, in create_connection
        sock.connect(sa)
    ConnectionRefusedError: [Errno 111] Connection refused
    
    During handling of the above exception, another exception occurred:
    
    Traceback (most recent call last):
      File "/src/demo.py", line 12, in <module>
        response = urllib.request.urlopen(req)
      ...
      File "/usr/local/lib/python3.6/urllib/request.py", line 1320, in do_open
        raise URLError(err)
    urllib.error.URLError: <urlopen error [Errno 111] Connection refused>

    I am really sorry but I can't reproduce your issue.

    What is your Linux distribution or OS?

    Thank you

    97ea1439-d2fd-4aa2-a88a-d9d5b34e68d0 commented 5 years ago

    Hi @matrixise,

    Sorry that the example in that stackoverflow site is a dummy API server. I tested my code in a real API server so I can produce the error. I have re-tested my code with that dummy site and yes you won't able to get that error. To reproduce that error, you may try a valid API server, like 'https://httpbin.org/post' . I just tested again by replacing :

    my_url = "https://api.foo.com"

    with

    my_url = "https://httpbin.org/post"

    and able to get the same error again. Apology for not using a real API server in my example :)

    97ea1439-d2fd-4aa2-a88a-d9d5b34e68d0 commented 5 years ago

    And I tested in both :

    Centos 6.10 4.14.77-70.59.amzn1.x86_64

    tirkarthi commented 5 years ago

    I guess key should be bytes instead of a dict with string key. Maybe the docs could be improved about the case where key in case of dict should be bytes.

    >>> from urllib.request import urlopen, Request
    >>> r = Request("http://httpbin.org/post", data={b'a': 'a'})
    >>> resp = urlopen(r)
    >>> print(resp.read())
    b'{\n  "args": {}, \n  "data": "", \n  "files": {}, \n  "form": {\n    "a": ""\n  }, \n  "headers": {\n    "Accept-Encoding": "identity", \n    "Content-Length": "1", \n    "Content-Type": "application/x-www-form-urlencoded", \n    "Host": "httpbin.org", \n    "User-Agent": "Python-urllib/3.8"\n  }, \n  "json": null, \n  "origin": "123.201.136.26, 123.201.136.26", \n  "url": "https://httpbin.org/post"\n}\n'

    See https://bugs.python.org/issue25439 for an open patch to ensure this is type checked.

    >>> from urllib.request import urlopen, Request
    >>> r = Request("http://httpbin.org/post", data={'a': 'a'})
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/Users/karthikeyansingaravelan/stuff/python/cpython/Lib/urllib/request.py", line 332, in __init__
        self.data = data
      File "/Users/karthikeyansingaravelan/stuff/python/cpython/Lib/urllib/request.py", line 370, in data
        raise TypeError("Key should be type of bytes in POST data")
    TypeError: Key should be type of bytes in POST data

    Current code is not checking the type makes it pass during construction and then throw an error later like below :

    >>> from urllib.request import urlopen, Request
    >>> r = Request("http://httpbin.org/post", data={'a': 'a'})
    >>> urlopen(r)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/urllib/request.py", line 222, in urlopen
        return opener.open(url, data, timeout)
      File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/urllib/request.py", line 525, in open
        response = self._open(req, data)
      File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/urllib/request.py", line 543, in _open
        '_open', req)
      File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/urllib/request.py", line 503, in _call_chain
        result = func(*args)
      File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/urllib/request.py", line 1345, in http_open
        return self.do_open(http.client.HTTPConnection, req)
      File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/urllib/request.py", line 1317, in do_open
        encode_chunked=req.has_header('Transfer-encoding'))
      File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/http/client.py", line 1229, in request
        self._send_request(method, url, body, headers, encode_chunked)
      File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/http/client.py", line 1275, in _send_request
        self.endheaders(body, encode_chunked=encode_chunked)
      File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/http/client.py", line 1224, in endheaders
        self._send_output(message_body, encode_chunked=encode_chunked)
      File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/http/client.py", line 1054, in _send_output
        + b'\r\n'
    TypeError: can't concat str to bytes
    matrixise commented 5 years ago

    ok, I confirm this issue, now, with httpbin.org I get the same issue.

    and it's just when we try to send the stream to the server.

    matrixise commented 5 years ago

    @xtreak

    ok because it's not related to the documentation, I have a fix for this issue but I am not sure about the title for this PR but what do you suggest?

    "Fix the encoding in _send_output when sending an iterable object via urllib.request.Request" ?

    matrixise commented 5 years ago

    @xtreak

    I have just pushed a PR, if you think it's an issue related to the documentation, you can ignore my PR and in this case I will close it.

    now, it's not related to the doc, could you review it and give me your feedback?

    Thank you,

    tirkarthi commented 5 years ago

    Stéphane, I don't know much about urllib to review this. The reason I said it's better document this is that there was an extended discussion on bpo-25439 along with a PR about type checking breaking some behavior and ambiguity in what is allowed currently. This PR seems to encode str to bytes implicitly and it needs to be documented as such that this str is encoded to bytes under the hood. I am not sure of backporting it as a fix to stable releases. I think it's better to raise an error instead of doing something implicitly. I am adding Martin and Senthil to the issue who might have a better context.

    matrixise commented 5 years ago

    Thank you @xtreak for adding Martin and Senthil.

    vadmium commented 5 years ago

    I agree the documentation is insufficient. It should have said if “data” is iterated, it must yield bytes-like objects.

    I agree it is unwise to yet more special cases for the uploaded data types. When Lye passed the dictionary of three keys and values, I suspect they weren’t expecting it to upload just the key names using one HTTP chunk for each key. See bpo-23740 about the mess of data types currently supported.

    willingc commented 5 years ago

    New changeset 9e30fbac019d9c0a31d444a711e845c7e883caf5 by Carol Willing (Julien Palard) in branch 'master': bpo-36064: Clarify allowed data types for urllib.request.Request. (GH-11990) https://github.com/python/cpython/commit/9e30fbac019d9c0a31d444a711e845c7e883caf5

    willingc commented 5 years ago

    I've merged Julien Palard's doc PR. Is this sufficient to close this issue?

    tirkarthi commented 5 years ago

    I would propose closing it since docs were made improved and if type checking needs to be enforced then it can be discussed in the linked open issues that have patches.