bradwood closed this issue 5 years ago
GitMate.io thinks the contributor most likely able to help you is @asvetlov.
Possibly related issues are https://github.com/aio-libs/aiohttp/issues/2711 (No content), https://github.com/aio-libs/aiohttp/issues/2062 (Content-Length header), https://github.com/aio-libs/aiohttp/issues/2183 ('None' in HTTP headers), https://github.com/aio-libs/aiohttp/issues/813 (Why uppercase HTTP headers?), and https://github.com/aio-libs/aiohttp/issues/14 (HttpResponse doesn't parse response body without Content-Length header and Connection: close).
It's not a chunk-encoded body but a multipart/form-data encoded form.
Please use MultipartReader(resp.headers, resp.content) to extract the form data.
It's not form data. It's a large XML payload.
Check resp.headers.
Your log looks like a multipart message with a large XML payload inside.
Sorry, I'm confused.
I want the body, not the headers. Essentially, I want to be able to loop over body chunks to write out the data file, without headers.
Here is the code (a test using aresponses) that is mocking the server, if that helps:
from pathlib import Path

import pytest
from aiohttp import MultipartWriter

# LOGGER, Listing and isolated_filesystem are defined elsewhere (not shown).


@pytest.mark.asyncio
async def test_listing_fetch(aresponses):
    # custom handler to respond with chunks
    async def my_handler(request):
        LOGGER.debug('in handler')
        my_boundary = 'boundary'
        xmlfile_path = Path(__file__).resolve().parent.joinpath('6729.xml')
        LOGGER.debug(f'xml file path = {xmlfile_path}')
        resp = aresponses.Response(status=200,
                                   reason='OK',
                                   )
        resp.enable_chunked_encoding()
        await resp.prepare(request)
        xmlfile = open(xmlfile_path, 'rb')
        LOGGER.debug('opened xml file for serving')
        with MultipartWriter('application/xml', boundary=my_boundary) as mpwriter:
            mpwriter.append(xmlfile)
            LOGGER.debug('appended chunk')
            await mpwriter.write(resp, close_boundary=False)
            LOGGER.debug('wrote chunk')
        xmlfile.close()
        return resp

    aresponses.add('foo.com', '/feed/6715', 'get', response=my_handler)
    with isolated_filesystem():
        l = Listing('http://foo.com/feed/6715')
        await l.fetch()
        assert l._path.joinpath(l._filename).is_file()
Please read about multipart encoding first: https://en.wikipedia.org/wiki/MIME#Multipart_messages
Your mocked server is invalid: application/xml is for the entire XML content, not for multiparts.
P.S. The thing you call a chunk is actually a part of a multipart message. The word chunk is used for another concept, at least in the HTTP protocol.
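To make the distinction concrete, here is a small standard-library sketch (not aiohttp code) of what a multipart body looks like on the wire. The boundary name and the XML payload are made-up placeholders. It illustrates why reading such a body as raw bytes yields boundary lines and per-part headers, while a multipart parser yields only the payload.

```python
from email.parser import BytesParser

# Hypothetical payload standing in for the large XML file.
xml_payload = b'<?xml version="1.0" encoding="UTF-8"?><tv></tv>'

# A multipart body: each part carries its own headers, framed by boundary lines.
raw_body = (
    b"--boundary\r\n"
    b"Content-Type: application/xml\r\n"
    b"\r\n"
    + xml_payload
    + b"\r\n--boundary--\r\n"
)

# Reading the body as plain bytes keeps the framing in place.
print(raw_body.startswith(b"--boundary"))

# A multipart parser needs the outer Content-Type header to find the boundary.
outer_headers = b'Content-Type: multipart/mixed; boundary="boundary"\r\n\r\n'
message = BytesParser().parsebytes(outer_headers + raw_body)
parts = message.get_payload()  # a list of parts for multipart messages
extracted = parts[0].get_payload(decode=True).strip()
print(extracted == xml_payload)
```

Chunked transfer encoding, by contrast, is framing at the HTTP layer and is normally decoded before application code sees the body.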
So how to make a server then that emulates support for Range headers?
Here is a HEAD request on the server I'm trying to emulate:
HTTP/1.1 200 OK
Accept-Ranges: bytes
Connection: keep-alive
Content-Encoding: gzip
Content-Type: application/xml
Date: Sun, 07 Oct 2018 23:11:56 GMT
ETag: "f8889f-577999e0b6f7d-gzip"
Last-Modified: Sun, 07 Oct 2018 01:42:28 GMT
Server: nginx/1.11.10
Vary: Accept-Encoding
How can I make aiohttp behave like that? If it's in the docs, then maybe I missed it, or got confused between multipart and "streaming".
Thanks for your help.
It should respond like this when a Range: request is given:
$ curl http://www.xmltv.co.uk/feed/6715 -i -H "Range: bytes=0-1023"
HTTP/1.1 206 Partial Content
Server: nginx/1.11.10
Date: Mon, 08 Oct 2018 07:08:53 GMT
Content-Type: application/xml
Content-Length: 1024
Connection: keep-alive
Last-Modified: Mon, 08 Oct 2018 01:42:20 GMT
ETag: "f9f199-577adbb5d510e"
Accept-Ranges: bytes
Vary: Accept-Encoding
Content-Range: bytes 0-1023/16380313
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE tv SYSTEM "xmltv.dtd">
<tv generator-info-name="xmltv.co.uk" source-info-name="xmltv.co.uk">
<channel id="003b31fb0fd63bd8fd171c7d7a1d0249">
<display-name>GEO News</display-name>
</channel>
<channel id="0092ad6b181b813d9e2ceed1cfbf5bf1">
<display-name>Notts TV</display-name>
</channel>
<channel id="00da025711e82cf319cb488d5988c099">
<display-name>Dunya News</display-name>
</channel>
Is this type of server supported in aiohttp? Using Multipart* objects? Or Stream*? I have been digging through the docs for this but it's not clear.
The latest response is neither streaming nor multipart.
It is just a regular response with a truncated body: web.Response(status=206, headers={<fill them in yourself>}, body=xml_bytes[:1000]).
I'm closing the issue because it is not about aiohttp bugs/improvements but about teaching @bradwood the HTTP protocol.
Please use another site for it. Maybe StackOverflow fits better.
I don't need to be taught about the HTTP protocol on this forum, @asvetlov. I am perfectly capable of reading Wikipedia and RFCs just like you.
I am asking about aiohttp support for this. Does it support it, or not? Please refer me to the documentation, if so, or tell me that it doesn't.
Did you not read this?
Is this type of server supported in aiohttp? using Multipart objects? or Stream? I have been digging through the docs for this but it's not clear.
FWIW, while I may have made a mistake in interpretation earlier, I don't appreciate your comment about teaching me HTTP. There is no need for rudeness.
I've been reading your responses to many people on this forum - you are extremely rude with many of them. You like to tell them to read Wikipedia, instead of actually being helpful. It's condescending and unhelpful. In many cases, these questions are a result of poorly documented examples of how aiohttp implements, or doesn't, a particular feature, not the protocol itself.
Look, don't get me wrong, I appreciate your contribution to the community, but it would be much better if (a) the docs were improved so that answers could be found without raising tickets and (b) you were less dismissive and insulting to people who have legitimate questions about the codebase, not the protocol.
aiohttp request supports request.range property to help Range HTTP header parsing. It supports ranged requests in static file serving. The library doesn't provide a magic helper for returning a ranging response for arbitrary data -- a user should construct this response manually.
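A minimal sketch of constructing such a response by hand, using only the standard library so the arithmetic is easy to follow. It assumes the requested range arrives as a slice, which is how aiohttp exposes it (the property is named request.http_range in the aiohttp API). The helper name build_ranged_response is hypothetical and not part of aiohttp.

```python
from typing import Dict, Tuple


def build_ranged_response(body: bytes, rng: slice) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, payload) for a byte-range request.

    Hypothetical helper, not part of aiohttp. `rng` is expected to look like
    the slice produced by aiohttp's `request.http_range`.
    """
    total = len(body)
    headers = {"Accept-Ranges": "bytes", "Content-Type": "application/xml"}

    # No Range header at all: send the whole body with a plain 200.
    if rng.start is None and rng.stop is None:
        headers["Content-Length"] = str(total)
        return 200, headers, body

    # Resolve open-ended and negative (suffix) ranges against the body length.
    start, stop, _ = rng.indices(total)
    if start >= stop:
        # Nothing satisfiable: answer 416 and report the total size.
        headers["Content-Range"] = f"bytes */{total}"
        return 416, headers, b""

    payload = body[start:stop]
    # Content-Range uses an inclusive end position, hence stop - 1.
    headers["Content-Range"] = f"bytes {start}-{stop - 1}/{total}"
    headers["Content-Length"] = str(len(payload))
    return 206, headers, payload


status, headers, payload = build_ranged_response(b"x" * 4096, slice(0, 1024, 1))
print(status, headers["Content-Range"], len(payload))
```

In an aiohttp handler the three values could then be handed to web.Response(status=status, headers=headers, body=payload).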
The main mission of the GitHub tracker is the development of aiohttp, not aiohttp usage. For example, CPython itself forbids questions about Python usage in its bug tracker and python-dev mailing list. Should we enable the same policy for aiohttp? I don't know, but this tracker is not a place for general questions. It is not a forum or questions-and-answers resource.
We have a different understanding of rudeness. Pointing to a helpful resource for further reading is a good response in my view. RTFM and so forth. If it is not enough for you -- that's fine. Please use another site like stackoverflow.com for asking usage questions.
The documentation is never perfect. It can always be improved. Please make Pull Request(s) for documentation improvement. I would appreciate it very much.
- aiohttp request supports request.range property to help Range HTTP header parsing. It supports ranged requests in static file serving. The library doesn't provide a magic helper for returning a ranging response for arbitrary data -- a user should construct this response manually.
Ok great -- this is helpful - I will do that. I thought there might be a higher level API that did this, as is the case for Streams and Multipart -- so not an unreasonable question IMHO.
- The main mission of the GitHub tracker is the development of aiohttp, not aiohttp usage. For example, CPython itself forbids questions about Python usage in its bug tracker and python-dev mailing list. Should we enable the same policy for aiohttp? I don't know, but this tracker is not a place for general questions. It is not a forum or questions-and-answers resource.
Ok, well initially I thought it was a bug rather than a usage query, and I'd argue that the way in which something can or cannot be used is part of its development agenda. If you make something that is difficult to use, or understand, then surely it's a (usability) bug?
- We have a different understanding of rudeness. Pointing to a helpful resource for further reading is a good response in my view. RTFM and so forth. If it is not enough for you -- that's fine. Please use another site like stackoverflow.com for asking usage questions.
Ok -- fair enough... I think the Robustness Principle should apply here too... I honestly thought this was a bug/weakness in the API which was a legitimate query. While I don't know every HTTP RFC by heart, I do think I know enough about it to ask relevant questions about aiohttp's implementation of bits of it. So being told that you are not going to "teach someone HTTP" is a pretty blunt response to a legitimate query.
- The documentation is never perfect. It can always be improved. Please make Pull Request(s) for documentation improvement. I would appreciate it very much.
When time permits, and I've got a working example for this topic, I'll try to do exactly that.
Sorry for my attitude and thanks for understanding.
Long story short
I am doing a chunked download from a Range-enabled HTTP server and it appears to be including HTTP headers, rather than just the body.
Am I using the library incorrectly, or is this a bug?
How do I get only chunks of the body, excluding the HTTP headers and the --boundary delimiter? Thanks!
Expected behaviour
I expected that only pieces of the HTTP body would be returned when calling resp.content.read(chunk_size).
Actual behaviour
The chunks come down correctly, but the headers and boundary delimiters are present in the resulting file.
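For context, a ranged download against a server like this amounts to requesting consecutive byte windows and appending each response body to the output file. The sketch below only computes the Range header values; the function name and the sizes are illustrative assumptions, and the HTTP calls themselves are left out.

```python
from typing import Iterator


def range_headers(total_size: int, chunk_size: int) -> Iterator[str]:
    """Yield 'bytes=start-end' values that together cover total_size bytes.

    Illustrative helper; byte ranges in HTTP are inclusive on both ends.
    """
    for start in range(0, total_size, chunk_size):
        end = min(start + chunk_size, total_size) - 1
        yield f"bytes={start}-{end}"


# Example: a 2500-byte resource fetched in 1024-byte windows.
windows = list(range_headers(2500, 1024))
print(windows)
```

Each value would be sent as the Range request header, and a server that honours it answers 206 Partial Content with just that slice of the body.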
Steps to reproduce
Here is the code in question:
and here is the head of the resulting file:
Your environment