aio-libs / aiohttp

Asynchronous HTTP client/server framework for asyncio and Python
https://docs.aiohttp.org
Other
15.09k stars 2.01k forks source link

Client sends body after being redirected #6764

Closed luni3359 closed 1 month ago

luni3359 commented 2 years ago

Describe the bug

aiohttp keeps sending the body every time it's redirected.

To Reproduce

  1. Send body data via a GET request
  2. Site acknowledges data and redirects the client
  3. After redirection, aiohttp sends the body again

Expected behavior

aiohttp shouldn't send the body after being redirected. At the very least requests doesn't behave in this way and avoids sending the body a second time.

Logs/tracebacks

Minimum reproducible code:

import asyncio
import json
import aiohttp

headers = {'Content-Type': 'application/json'}
data = {
    "tags": "rating:general chibi",
    "random": True
}

async def main():
    async with aiohttp.ClientSession() as session:
        async with session.get("https://danbooru.donmai.us/posts.json", data=json.dumps(data), headers=headers) as response:
            response_payload = await response.read()
            print(response.status)
            print(response_payload)

asyncio.run(main())
Traceback (most recent call last):
  File "/home/luni3359/.local/share/pyenv/versions/3.10.4/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/luni3359/.local/share/pyenv/versions/3.10.4/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/home/luni3359/.vscode/extensions/ms-python.python-2022.6.2/pythonFiles/lib/python/debugpy/__main__.py", line 45, in <module>
    cli.main()
  File "/home/luni3359/.vscode/extensions/ms-python.python-2022.6.2/pythonFiles/lib/python/debugpy/../debugpy/server/cli.py", line 444, in main
    run()
  File "/home/luni3359/.vscode/extensions/ms-python.python-2022.6.2/pythonFiles/lib/python/debugpy/../debugpy/server/cli.py", line 285, in run_file
    runpy.run_path(target_as_str, run_name=compat.force_str("__main__"))
  File "/home/luni3359/.local/share/pyenv/versions/3.10.4/lib/python3.10/runpy.py", line 269, in run_path
    return _run_module_code(code, init_globals, run_name,
  File "/home/luni3359/.local/share/pyenv/versions/3.10.4/lib/python3.10/runpy.py", line 96, in _run_module_code
    _run_code(code, mod_globals, init_globals,
  File "/home/luni3359/.local/share/pyenv/versions/3.10.4/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/home/luni3359/Projects/Python/koa-bot/tests/booru_search_async.py", line 19, in <module>
    asyncio.run(main())
  File "/home/luni3359/.local/share/pyenv/versions/3.10.4/lib/python3.10/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/home/luni3359/.local/share/pyenv/versions/3.10.4/lib/python3.10/asyncio/base_events.py", line 646, in run_until_complete
    return future.result()
  File "/home/luni3359/Projects/Python/koa-bot/tests/booru_search_async.py", line 14, in main
    async with session.get("https://danbooru.donmai.us/posts.json", data=json.dumps(data), headers=headers) as response:
  File "/home/luni3359/.local/share/pyenv/versions/koa-bot/lib/python3.10/site-packages/aiohttp/client.py", line 1138, in __aenter__
    self._resp = await self._coro
  File "/home/luni3359/.local/share/pyenv/versions/koa-bot/lib/python3.10/site-packages/aiohttp/client.py", line 585, in _request
    raise TooManyRedirects(
aiohttp.client_exceptions.TooManyRedirects: 0, message='', url=URL('https://danbooru.donmai.us/posts.json')

Python Version

$ python --version
Python 3.10.4

aiohttp Version

$ python -m pip show aiohttp
Name: aiohttp
Version: 3.8.1
Summary: Async http client/server framework (asyncio)
Home-page: https://github.com/aio-libs/aiohttp
Author: 
Author-email: 
License: Apache 2
Location: /home/luni3359/.local/share/pyenv/versions/3.10.4/envs/koa-bot/lib/python3.10/site-packages
Requires: aiosignal, async-timeout, attrs, charset-normalizer, frozenlist, multidict, yarl
Required-by: asyncprawcore, discord.py, PixivPy-Async

multidict Version

$ python -m pip show multidict
Name: multidict
Version: 6.0.2
Summary: multidict implementation
Home-page: https://github.com/aio-libs/multidict
Author: Andrew Svetlov
Author-email: andrew.svetlov@gmail.com
License: Apache 2
Location: /home/luni3359/.local/share/pyenv/versions/3.10.4/envs/koa-bot/lib/python3.10/site-packages
Requires: 
Required-by: aiohttp, yarl

yarl Version

$ python -m pip show yarl
Name: yarl
Version: 1.7.2
Summary: Yet another URL library
Home-page: https://github.com/aio-libs/yarl/
Author: Andrew Svetlov
Author-email: andrew.svetlov@gmail.com
License: Apache 2
Location: /home/luni3359/.local/share/pyenv/versions/3.10.4/envs/koa-bot/lib/python3.10/site-packages
Requires: idna, multidict
Required-by: aiohttp, asyncprawcore

OS

             `..---+/---..`                 luni3359@souryuu
         `---.``   ``   `.---.`             ----------------
      .--.`        ``        `-:-.          OS: KDE neon 20.04 (User Edition) [x86_64]
    `:/:     `.----//----.`     :/-         Host: 81Y6 Lenovo Legion 5 15IMH05H
   .:.    `---`          `--.`    .:`       Kernel: 5.13.0-44-generic
  .:`   `--`                .:-    `:.      Uptime: 1 day, 13 hours, 56 mins
 `/    `:.      `.-::-.`      -:`   `/`     Packages: 2834 (dpkg), 15 (flatpak), 6 (snap)
 /.    /.     `:++++++++:`     .:    .:     Shell: bash 5.0.17
`/    .:     `+++++++++++/      /`   `+`    Resolution: 1920x1080 @ 120Hz
/+`   --     .++++++++++++`     :.   .+:    DE: KDE Plasma 5.24.5
`/    .:     `+++++++++++/      /`   `+`    WM: KWin (X11)
 /`    /.     `:++++++++:`     .:    .:     WM Theme: Breeze
 ./    `:.      `.:::-.`      -:`   `/`     Theme: Breeze (Dark) [Plasma], Breeze [GTK3]
  .:`   `--`                .:-    `:.      Icons: breeze-dark [Plasma], breeze-dark [GTK2/3/4]
   .:.    `---`          `--.`    .:`       Font: Noto Sans (10pt) [Plasma], Noto Sans (10pt) ]
    `:/:     `.----//----.`     :/-         Cursor: breeze (24px)
      .-:.`        ``        `-:-.          Terminal: konsole
         `---.``   ``   `.---.`             CPU: Intel Core i5-10300H (8) @ 2.5GHz
             `..---+/---..`                 GPU 1: Intel UHD Graphics
                                            GPU 2: Nvidia GeForce GTX 1660 Ti Mobile
                                            Memory: 11430MiB / 15870MiB (72%)

Related component

Client

Additional context

As far as I understand aiohttp shouldn't send the body after being redirected.

On danbooru/danbooru#5185 it was found out that aiohttp was behaving differently and was not being consistent with the behavior of other clients.

I'm not very well-versed in this topic so I apologize beforehand if I messed up the terminology but I can see the discrepancy. Please let me know if there's anything else that needs to be included in the issue.

Code of Conduct

Dreamsorcerer commented 2 years ago

I can't see any clear rules for this in the HTTP specifications.

I suspect that at the very least, a redirect that is meant to retain the same method (e.g. 307) should resend the payload. Maybe the codes which allow changing to GET should drop the payload (e.g 303). Which brings up the question of what redirect code they are using?

fredgido commented 2 years ago

its using 302 redirect. redirects are keeping cookies and auth tokens, you can see that is a huge security risk

MaximZemskov commented 2 years ago

The same issue actually with auth when redirecting to different origin. During the redirect, the authorization is reset in the https://github.com/aio-libs/aiohttp/blob/master/aiohttp/client.py#L601, but if the auth parameter is passed to the ClientSession, then the authorization will be reassigned in https://github.com/aio-libs/aiohttp/blob/master/aiohttp/client.py#L440

wqx-1214 commented 1 year ago

???????????

Dreamsorcerer commented 1 month ago

The current behaviour looks like it's correct. The body is removed from a POST request on 301/302/303. It is also removed from a GET request on 303. The RFC only says the content headers should be removed, but presumably it would make sense to remove the content as well at that point. It also only says when the method is changed to GET, so if we're already using GET it's not clear that it should be removed anyway.

https://www.rfc-editor.org/rfc/rfc9110.html#section-15.4-6.5.1

The comment about auth headers may have changed. I see code for removing auth headers currently when the origin changes.

MaximZemskov commented 1 month ago

Auth headers are reset in this line

if url.origin() != redirect_origin:
    auth = None
    headers.pop(hdrs.AUTHORIZATION, None)

But they assigned again in this line if the ClientSession was created with the auth parameter

if auth is None:
    auth = self._default_auth
Dreamsorcerer commented 1 month ago

Right, we should get a test to reproduce that.

Dreamsorcerer commented 1 month ago

Wait, but that's an auth in the ClientSession. If you've set auth for the entire session, why would you expect it to not be sent on a request?

MaximZemskov commented 1 month ago

I do expect it to be sent on request, but shouldn't it be removed in the case of a redirect to a different origin? If not, it might be worth adding a note about this in the documentation

Dreamsorcerer commented 1 month ago

My expectation is that any request to any origin will have the auth included, as it's global for the entire session. Feel free to make a PR with a doc change.