django / daphne

Django Channels HTTP/WebSocket server
BSD 3-Clause "New" or "Revised" License

Requests with Transfer-Encoding: chunked have no content #476

Closed tfeldmann closed 6 months ago

tfeldmann commented 1 year ago

I guess I'm a bit confused by now. According to the ASGI spec, the server should handle (dechunk?) incoming chunked requests.
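
For context, "dechunking" means stripping the size lines of the chunked transfer coding so that only the payload bytes remain. A minimal sketch of that decoding step follows. It is not taken from Daphne or waitress; the function name dechunk is made up for illustration, and chunk extensions and trailers are simply ignored:

```python
def dechunk(raw: bytes) -> bytes:
    """Decode an HTTP/1.1 chunked body (hypothetical helper, no trailer support)."""
    body = bytearray()
    pos = 0
    while True:
        line_end = raw.index(b"\r\n", pos)
        # The size line is hexadecimal; anything after ';' is a chunk extension.
        size = int(raw[pos:line_end].split(b";", 1)[0], 16)
        if size == 0:
            break  # last chunk; any trailers after it are ignored here
        start = line_end + 2
        body += raw[start:start + size]
        pos = start + size + 2  # skip the CRLF that terminates the chunk data
    return bytes(body)


# Two chunks of six bytes each, followed by the zero-length last chunk.
wire = b"6\r\nHello \r\n6\r\nWorld!\r\n0\r\n\r\n"
print(dechunk(wire))
```

An application behind a server that dechunks should only ever see the decoded payload, never the size lines.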

So I've written a small django app with this view:

import logging

from django.http import HttpResponse
from django.views.decorators.csrf import csrf_exempt

@csrf_exempt
def index(request):
    print("META", request.META)
    print("GET", request.GET)
    print("POST", request.POST)
    print("FILES", request.FILES)
    if request.method == "POST":
        logging.info(request.__dict__)
        return HttpResponse("END_OK\n")
    else:
        return HttpResponse("ok")

I start daphne with daphne myapp.asgi:application -p 8080 and execute a curl request with:

curl \
    -H "User-Agent: MyAgent ()" \
    -H "Transfer-Encoding: chunked" \
    -F  "file=@/Users/tf/Pictures/gravatar.jpeg" \
    http://localhost:8080/sync

Results in:

META {'REQUEST_METHOD': 'POST', 'QUERY_STRING': '', 'SCRIPT_NAME': '', 'PATH_INFO': '/sync/', 'wsgi.multithread': True, 'wsgi.multiprocess': True, 'REMOTE_ADDR': '127.0.0.1', 'REMOTE_HOST': '127.0.0.1', 'REMOTE_PORT': 65074, 'SERVER_NAME': '127.0.0.1', 'SERVER_PORT': '8080', 'HTTP_HOST': 'localhost:8080', 'HTTP_ACCEPT': '*/*', 'HTTP_USER_AGENT': 'MyAgent ()', 'HTTP_TRANSFER_ENCODING': 'chunked', 'CONTENT_TYPE': 'multipart/form-data; boundary=------------------------410f03697d6bfef5'}
GET <QueryDict: {}>
POST <QueryDict: {}>
FILES <MultiValueDict: {}>

Doing the same in waitress (which dechunks the request) results in:

META {'REMOTE_ADDR': '127.0.0.1', 'REMOTE_HOST': '127.0.0.1', 'REMOTE_PORT': '65077', 'REQUEST_METHOD': 'POST', 'SERVER_PORT': '8080', 'SERVER_NAME': 'waitress.invalid', 'SERVER_SOFTWARE': 'waitress', 'SERVER_PROTOCOL': 'HTTP/1.1', 'SCRIPT_NAME': '', 'PATH_INFO': '/sync/', 'REQUEST_URI': '/sync/', 'QUERY_STRING': '', 'wsgi.url_scheme': 'http', 'wsgi.version': (1, 0), 'wsgi.errors': <_io.TextIOWrapper name='<stderr>' mode='w' encoding='utf-8'>, 'wsgi.multithread': True, 'wsgi.multiprocess': False, 'wsgi.run_once': False, 'wsgi.input': <_io.BufferedRandom name=9>, 'wsgi.file_wrapper': <class 'waitress.buffers.ReadOnlyFileBasedBuffer'>, 'wsgi.input_terminated': True, 'HTTP_HOST': 'localhost:8080', 'HTTP_ACCEPT': '*/*', 'HTTP_USER_AGENT': 'MyAgent ()', 'CONTENT_TYPE': 'multipart/form-data; boundary=------------------------428cf6133f9ea7bf', 'CONTENT_LENGTH': '573892', 'waitress.client_disconnected': <bound method HTTPChannel.check_client_disconnected of <waitress.channel.HTTPChannel connected 127.0.0.1:65077 at 0x10448e7d0>>}
GET <QueryDict: {}>
POST <QueryDict: {}>
FILES <MultiValueDict: {'file': [<InMemoryUploadedFile: gravatar.jpeg (image/jpeg)>]}>

What am I doing wrong here?

carltongibson commented 1 year ago

Can you check that the request.body is actually set correctly (prior to file parsing)?

tfeldmann commented 1 year ago

The data seems to be there:

[snip: binary JPEG data omitted]
...\x94\x01\xff\xd9\r\n--------------------------b40de7729abcfbb1--\r\n'
META {'REQUEST_METHOD': 'POST', 'QUERY_STRING': '', 'SCRIPT_NAME': '', 'PATH_INFO': '/sync/', 'wsgi.multithread': True, 'wsgi.multiprocess': True, 'REMOTE_ADDR': '127.0.0.1', 'REMOTE_HOST': '127.0.0.1', 'REMOTE_PORT': 49204, 'SERVER_NAME': '127.0.0.1', 'SERVER_PORT': '8080', 'HTTP_HOST': 'localhost:8080', 'HTTP_ACCEPT': '*/*', 'HTTP_USER_AGENT': 'SmartLink ()', 'HTTP_TRANSFER_ENCODING': 'chunked', 'CONTENT_TYPE': 'multipart/form-data; boundary=------------------------b40de7729abcfbb1'}
GET <QueryDict: {}>
POST <QueryDict: {}>
FILES <MultiValueDict: {}>

To get the body output above, I added this line to the view:

def index(request):
    print("body", request.body)

carltongibson commented 1 year ago

OK, so the weirdness is why the data isn't getting out of Django's HttpRequest._load_post_and_files() then (I think).

tfeldmann commented 1 year ago

Yes, probably. I tried it with something more manageable:

body b'--------------------------2c8edae761a97569\r\nContent-Disposition: form-data; name="file"; filename="Hello.txt"\r\nContent-Type: text/plain\r\n\r\nHello World!\n\r\n--------------------------2c8edae761a97569--\r\n'
META {'REQUEST_METHOD': 'POST', 'QUERY_STRING': '', 'SCRIPT_NAME': '', 'PATH_INFO': '/sync/', 'wsgi.multithread': True, 'wsgi.multiprocess': True, 'REMOTE_ADDR': '127.0.0.1', 'REMOTE_HOST': '127.0.0.1', 'REMOTE_PORT': 49220, 'SERVER_NAME': '127.0.0.1', 'SERVER_PORT': '8080', 'HTTP_HOST': 'localhost:8080', 'HTTP_ACCEPT': '*/*', 'HTTP_USER_AGENT': 'SmartLink ()', 'HTTP_TRANSFER_ENCODING': 'chunked', 'CONTENT_TYPE': 'multipart/form-data; boundary=------------------------2c8edae761a97569'}
GET <QueryDict: {}>
POST <QueryDict: {}>
FILES <MultiValueDict: {}>

I'm on Django 4.2.1 and Daphne 4.0.0 by the way.

tfeldmann commented 1 year ago

Do you think this is a django issue? Should I raise it there?

carltongibson commented 1 year ago

First, can you reduce it to say exactly what's going on? (It'd be closed as needsinfo without a reproduction.)

tfeldmann commented 11 months ago

I cannot say what exactly is going on. The above code is enough to reproduce the issue, but I'm happy to provide a full project for this if needed.

carltongibson commented 11 months ago

Hi @tfeldmann -- thanks for the bump.

No need for a project... the script looks fine.

I haven't had a chance to look at this in a debugger, but will try and do so over the next period.

carltongibson commented 11 months ago

Of course, waitress is a WSGI server. Q: do we get the same issue with hypercorn/uvicorn?

tfeldmann commented 11 months ago

I found this in the spec (https://asgi.readthedocs.io/en/latest/specs/www.html#http-connection-scope):

Servers are responsible for handling inbound and outbound chunked transfer encodings. A request with a chunked encoded body should be automatically de-chunked by the server and presented to the application as plain body bytes; a response that is given to the server with no Content-Length may be chunked as the server sees fit.

I don't know how hypercorn / uvicorn handle this.

carltongibson commented 11 months ago

Yes... 🤔 The interesting one was that the body makes it (as far as we're aware).

(I need time to sit down with the debugger to be able to say more, but that's the question I'm asking first.)

agronick commented 6 months ago

I think this is a Django bug and I opened a ticket here. Per the HTTP spec, a request with Transfer-Encoding: chunked should not have a Content-Length, but Django uses only the Content-Length to determine whether uploaded files should be processed. There is a conditional in Django that gives up on uploaded files if the content length is 0. You can write a middleware that spoofs the content length to 1 if the transfer encoding is chunked, e.g.:

        # .get() avoids a KeyError on requests without a Transfer-Encoding header.
        if request.headers.get("Transfer-Encoding", "").lower() == "chunked" and "CONTENT_LENGTH" not in request.META:
            request.META["CONTENT_LENGTH"] = "1"

and everything works correctly. You'll also want to set your FILE_UPLOAD_HANDLERS to just "django.core.files.uploadhandler.TemporaryFileUploadHandler", because the MemoryFileUploadHandler only looks at the content-length.
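
For anyone wanting to try this, here is a minimal sketch of the workaround as a complete middleware class. It assumes Django's usual middleware calling convention (a callable built from get_response); the class name ChunkedContentLengthMiddleware is hypothetical, and this is only a stopgap, not a fix in Django or Daphne:

```python
class ChunkedContentLengthMiddleware:
    """Hypothetical workaround: give chunked requests a non-zero CONTENT_LENGTH."""

    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        # Header may be absent, so fall back to an empty string.
        transfer_encoding = request.headers.get("Transfer-Encoding", "")
        if transfer_encoding.lower() == "chunked" and "CONTENT_LENGTH" not in request.META:
            # The value is a placeholder, not the real size; any non-zero value
            # keeps the multipart parser from treating the body as empty.
            request.META["CONTENT_LENGTH"] = "1"
        return self.get_response(request)
```

The class would need to be listed in MIDDLEWARE under whatever dotted path you place it at, together with the FILE_UPLOAD_HANDLERS change described above. Because the spoofed length is not the real one, anything that relies on CONTENT_LENGTH for size checks will see a misleading value.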

nessita commented 6 months ago

Thank you @agronick for posting here, I was in the middle of my investigation to triage your Django ticket. I also found the (I think) related issue #371.

carltongibson commented 6 months ago

Closing this in favour of the Django ticket. Happy to reopen if there's something else Daphne needs to do here.

zoobab commented 4 months ago

@agronick do you have a patch example that works for you? Is there a patch to django code yet?