Open thehesiod opened 6 years ago
looks like in my example the lines were separated by \n's, and not \r\n
I've traced this back to the fact that the default linesep for the email module in python is '\n': https://github.com/python/cpython/blob/master/Lib/email/_policybase.py#L163 used by https://github.com/google/google-api-python-client/blob/master/googleapiclient/http.py#L1328 since policy=None. So the question is whether aiohttp should check for '\n' + boundary or '\r\n' + boundary, as it currently does.
in particular the issue is on this line: https://github.com/aio-libs/aiohttp/blob/master/aiohttp/multipart.py#L324
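The linesep default mentioned above can be reproduced with the standard library alone. This is only a minimal sketch: the boundary name `batch_boundary`, the payload, and the helpers `build_message` and `render` are made up for illustration and are not taken from google-api-python-client.

```python
from email.generator import Generator
from email.mime.multipart import MIMEMultipart
from email.mime.nonmultipart import MIMENonMultipart
from email.policy import compat32
from io import StringIO


def build_message():
    # Hypothetical two-level message, loosely shaped like a batch request.
    message = MIMEMultipart("mixed", boundary="batch_boundary")
    part = MIMENonMultipart("application", "http")
    part.set_payload("GET /items/1 HTTP/1.1\n\n")
    message.attach(part)
    return message


def render(message, policy=None):
    # With policy=None the generator falls back to the message's own policy
    # (compat32 for these classes), whose linesep is "\n".
    buffer = StringIO()
    Generator(buffer, mangle_from_=False, policy=policy).flatten(message, unixfrom=False)
    return buffer.getvalue()


lf_body = render(build_message())
crlf_body = render(build_message(), policy=compat32.clone(linesep="\r\n"))

print(repr(lf_body))
print(repr(crlf_body))
```

With the default policy the boundary lines are preceded by a bare `\n`, so a reader that searches for `\r\n` + boundary would not find them; passing a policy cloned with `linesep="\r\n"` produces CRLF output.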
https://tools.ietf.org/html/rfc7578 explicitly states CRLF as a separator. We could also support LF, though I'm not sure. Guys, opinions?
Only as an option that has to be explicitly enabled by the user.
From the Python docs about linesep:
It defaults to ``\n`` because that is the most useful value for Python application code (other library packages expect ``\n`` separated lines). ``linesep=\r\n`` can be used to generate output with RFC-compliant line separators.
I think we should stay with RFC by default.
Well it's working with Google servers. We should check how other servers behave.
On Oct 5, 2017 2:13 AM, "Alexander Shorin" notifications@github.com wrote:
Only as an option that has to be explicitly enabled by the user.
From the Python docs about linesep:
It defaults to \n because that is the most useful value for Python application code (other library packages expect \n separated lines). linesep=\r\n can be used to generate output with RFC-compliant line separators.
I think we should stay with RFC by default.
I don't think this check will change everything. We must follow the specifications in all matters. That's the only way to make sure we're doing everything right.
However, we have to admit that life is... complicated: legacy still exists, and major providers still have to preserve backward compatibility with ancient clients, etc. For such cases, if they are really worth supporting (and if they cannot fix their code because of the same backward-compatibility constraints), some fix may have to be implemented, but as an opt-in feature.
Not even all browsers correctly support the multipart spec, so I think there is some room for customization here.
The issue is still waiting for a hero.
We need to accept a separator argument in both the multipart reader and writer (\r\n by default).
The change is not very hard but requires some amount of work.
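As a rough sketch of what a writer with a separator argument could look like, assuming a standalone helper rather than aiohttp's actual writer class: `encode_multipart` and its parameters are hypothetical names, and only the `\r\n` default comes from the comment above.

```python
from typing import Dict, Iterable, Tuple


def encode_multipart(
    parts: Iterable[Tuple[Dict[str, str], bytes]],
    boundary: str,
    separator: bytes = b"\r\n",
) -> bytes:
    # The writer knows its line-break style up front; CRLF is the default.
    if separator not in (b"\r\n", b"\n"):
        raise ValueError("separator must be CRLF or LF")
    delimiter = b"--" + boundary.encode("ascii")
    chunks = []
    for headers, payload in parts:
        chunks.append(delimiter + separator)
        for name, value in headers.items():
            chunks.append(f"{name}: {value}".encode("utf-8") + separator)
        chunks.append(separator)  # blank line between headers and payload
        chunks.append(payload + separator)
    chunks.append(delimiter + b"--" + separator)  # closing delimiter
    return b"".join(chunks)


body = encode_multipart([({"Content-Type": "text/plain"}, b"hi")], "b")
print(body)
```

Passing `separator=b"\n"` would produce the LF-only form that the `email` module emits by default.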
btw, I'd like to be able to optionally support both styles, so perhaps a boolean instead, to specify whether both should be allowed?
Do you mean a heuristic for detecting which style the reader should use, \r\n or just \n?
Writer should know the style on construction.
yes, à la splitlines. There should be a mode where the server can accept either style; that's what the Google servers do, for example. I bet Apache does as well.
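A reader mode that accepts either style could be sketched as follows. This is not aiohttp's reader: `split_multipart` and its `accept_bare_lf` flag are hypothetical, and the sketch works on a complete in-memory body instead of a stream.

```python
import re
from typing import List


def split_multipart(body: bytes, boundary: bytes, accept_bare_lf: bool = False) -> List[bytes]:
    # Strict mode only recognises CRLF around delimiters; the tolerant mode
    # also accepts a bare LF, similar in spirit to str.splitlines().
    newline = rb"\r?\n" if accept_bare_lf else rb"\r\n"
    delimiter = re.compile(
        rb"(?:\A|" + newline + rb")--" + re.escape(boundary)
        + rb"(--)?[ \t]*(?=" + newline + rb"|\Z)"
    )
    leading_newline = re.compile(newline)
    parts = []
    start = None
    for match in delimiter.finditer(body):
        if start is not None:
            chunk = body[start:match.start()]
            # Drop the line break that terminated the previous delimiter line.
            line_break = leading_newline.match(chunk)
            if line_break:
                chunk = chunk[line_break.end():]
            parts.append(chunk)
        if match.group(1):  # closing delimiter "--boundary--"
            break
        start = match.end()
    return parts


crlf_body = b"--b\r\nH: v\r\n\r\none\r\n--b\r\n\r\ntwo\r\n--b--\r\n"
lf_body = crlf_body.replace(b"\r\n", b"\n")
print(split_multipart(lf_body, b"b"))
print(split_multipart(lf_body, b"b", accept_bare_lf=True))
```

In strict mode the LF-only body yields no parts at all, which matches the symptom reported in this issue; with `accept_bare_lf=True` both bodies split into the same two parts.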
@thehesiod would you work on implementation? The first version can be done without autodetection.
ya I'll give it a shot
Cool!
https://www.w3.org/Protocols/rfc2616/rfc2616-sec19.html#sec19.3 (but it has been obsoleted by RFC 7230 since June 2014):
19.3 Tolerant Applications ... The line terminator for message-header fields is the sequence CRLF. However, we recommend that applications, when parsing such headers, recognize a single LF as a line terminator and ignore the leading CR. ...
(https://stackoverflow.com/a/5757349/595220)
https://tools.ietf.org/html/rfc7231#section-3.1.1.3:
3.1.1.3. Canonicalization and Text Defaults ... An HTTP sender MAY generate, and a recipient MUST be able to parse, line breaks in text media that consist of CRLF, bare CR, or bare LF.... This flexibility regarding line breaks applies only to text within a representation that has been assigned a "text" media type; it does not apply to "multipart" types or HTTP elements outside the payload body (e.g., header fields). ... 3.1.1.4. Multipart Types ... The message body is itself a protocol element; a sender MUST generate only CRLF to represent line breaks between body parts.
I'm trying to mock the google API batch endpoint which takes a multipart request. It ends up sending a request like this:
with headers:
And with a handler like the following:
I get the exception:
seems like it's not correctly determining the boundary