Overall, the body of a "multipart" entity may be specified as
follows:
     dash-boundary := "--" boundary
                      ; boundary taken from the value of
                      ; boundary parameter of the
                      ; Content-Type field.

     multipart-body := [preamble CRLF]
                       dash-boundary transport-padding CRLF
                       body-part *encapsulation
                       close-delimiter transport-padding
                       [CRLF epilogue]

     transport-padding := *LWSP-char
                          ; Composers MUST NOT generate
                          ; non-zero length transport
                          ; padding, but receivers MUST
                          ; be able to handle padding
                          ; added by message transports.

     encapsulation := delimiter transport-padding
                      CRLF body-part

     delimiter := CRLF dash-boundary

     close-delimiter := delimiter "--"

     preamble := discard-text

     epilogue := discard-text

     discard-text := *(*text CRLF) *text
                     ; May be ignored or discarded.

     body-part := MIME-part-headers [CRLF *OCTET]
                  ; Lines in a body-part must not start
                  ; with the specified dash-boundary and
                  ; the delimiter must not appear anywhere
                  ; in the body part.  Note that the
                  ; semantics of a body-part differ from
                  ; the semantics of a message, as
                  ; described in the text.

     OCTET := <any 0-255 octet value>
As per this spec, the simplest multipart would look like this:
Only one CRLF is required at the end of the body, not two. In fact, Google App Engine internally posts data that contains only one CRLF when a form field is left empty (the example below uses the data it generates).
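To make the grammar concrete, here is a minimal sketch of a splitter that follows the RFC 2046 productions and accepts a body-part consisting of headers only. `split_multipart` is a hypothetical helper written for this illustration; it is not part of requests_toolbelt, and it ignores details such as header decoding.

```python
from typing import List, Tuple


def split_multipart(body: bytes, boundary: bytes) -> List[Tuple[bytes, bytes]]:
    """Return (raw_headers, content) pairs, one per body-part (illustrative only)."""
    delimiter = b"\r\n--" + boundary
    # Prepend CRLF so the opening dash-boundary has the same shape as a delimiter.
    body = b"\r\n" + body
    # Everything after the close-delimiter is epilogue and may be discarded.
    head, found, _epilogue = body.partition(delimiter + b"--")
    if not found:
        raise ValueError("close-delimiter not found")
    # The first chunk is the preamble, which may be discarded as well.
    chunks = head.split(delimiter)[1:]
    parts = []
    for chunk in chunks:
        # A boundary line is followed by optional transport padding, then CRLF.
        padding, found, part = chunk.partition(b"\r\n")
        if not found or padding.strip(b" \t"):
            raise ValueError("malformed boundary line")
        if part.startswith(b"\r\n"):
            # No headers at all: the body-part is just CRLF *OCTET.
            headers, content = b"", part[2:]
        else:
            headers, found, content = part.partition(b"\r\n\r\n")
            if not found and headers.endswith(b"\r\n"):
                # Header-only part: the optional [CRLF *OCTET] is absent.
                headers = headers[:-2]
        parts.append((headers, content))
    return parts


# The App Engine style payload from this report, split across lines for readability.
data = (
    b'--foo\r\n'
    b'Content-Type: text/plain; charset="UTF-8"\r\n'
    b'Content-Disposition: form-data; name=empty\r\n'
    b'\r\n--foo\r\n'
    b'Content-Type: text/plain; charset="UTF-8"\r\n'
    b'Content-Disposition: form-data; name=text\r\n'
    b'\r\n'
    b'Some Text'
    b'\r\n--foo--'
)

parts = split_multipart(data, b"foo")
for headers, content in parts:
    print(repr(content))
```

With this reading of the grammar, the empty field becomes a part with headers and an empty content instead of an error.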
Steps to reproduce:
from requests_toolbelt.multipart import decoder
data = b'--foo\r\nContent-Type: text/plain; charset="UTF-8"\r\nContent-Disposition: form-data; name=empty\r\n\r\n--foo\r\nContent-Type: text/plain; charset="UTF-8"\r\nContent-Disposition: form-data; name=text\r\n\r\nSome Text\r\n--foo--'
decoder.MultipartDecoder(data, 'multipart/form-data; boundary="foo"')
output:
Traceback (most recent call last):
  File "/Users/christophe/toolbelt.py", line 4, in <module>
    decoder.MultipartDecoder(data, 'multipart/form-data; boundary="foo"')
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/requests_toolbelt/multipart/decoder.py", line 111, in __init__
    self._parse_body(content)
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/requests_toolbelt/multipart/decoder.py", line 150, in _parse_body
    self.parts = tuple(body_part(x) for x in parts if test_part(x))
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/requests_toolbelt/multipart/decoder.py", line 150, in <genexpr>
    self.parts = tuple(body_part(x) for x in parts if test_part(x))
                       ^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/requests_toolbelt/multipart/decoder.py", line 141, in body_part
    return BodyPart(fixed, self.encoding)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/requests_toolbelt/multipart/decoder.py", line 63, in __init__
    raise ImproperBodyPartContentException(
requests_toolbelt.multipart.decoder.ImproperBodyPartContentException: content does not contain CR-LF-CR-LF
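The exception message suggests that the decoder expects a CR-LF-CR-LF sequence inside every part. The sketch below only uses plain bytes operations to show which part of the sample data lacks that sequence. How the decoder actually splits the body is an assumption here (splitting on CRLF plus the dash-boundary), inferred from the error message rather than taken from the library's documentation.

```python
# Same payload as in the reproduction, split across lines for readability.
data = (
    b'--foo\r\n'
    b'Content-Type: text/plain; charset="UTF-8"\r\n'
    b'Content-Disposition: form-data; name=empty\r\n'
    b'\r\n--foo\r\n'
    b'Content-Type: text/plain; charset="UTF-8"\r\n'
    b'Content-Disposition: form-data; name=text\r\n'
    b'\r\n'
    b'Some Text'
    b'\r\n--foo--'
)

dash_boundary = b"--foo"

# Assumption: parts are separated by CRLF + dash-boundary. A CRLF is prepended
# so the opening boundary matches the same pattern.
chunks = (b"\r\n" + data).split(b"\r\n" + dash_boundary)

# chunks[0] is the empty preamble; chunks[-1] is the "--" of the close-delimiter.
candidate_parts = chunks[1:-1]

# Check each part for the header/content separator the error message mentions.
has_blank_line = [b"\r\n\r\n" in part for part in candidate_parts]
print(has_blank_line)  # [False, True]
```

The first part (the empty field) has no CR-LF-CR-LF because one of the two CRLFs belongs to the delimiter that follows it.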
For comparison, here is the same data processed with cgi:
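As a further point of comparison, the same payload can be fed to the standard library's `email` parser. This is a sketch and a substitution: it uses `email` rather than `cgi` (which was removed from the standard library in Python 3.13), and wrapping the payload in a synthetic Content-Type header is an assumption made here so that the parser can find the boundary.

```python
import email

# Same payload as in the reproduction, split across lines for readability.
data = (
    b'--foo\r\n'
    b'Content-Type: text/plain; charset="UTF-8"\r\n'
    b'Content-Disposition: form-data; name=empty\r\n'
    b'\r\n--foo\r\n'
    b'Content-Type: text/plain; charset="UTF-8"\r\n'
    b'Content-Disposition: form-data; name=text\r\n'
    b'\r\n'
    b'Some Text'
    b'\r\n--foo--'
)

# The email parser reads the boundary from a message-level Content-Type header,
# so one is prepended to the raw multipart body.
message = email.message_from_bytes(
    b'Content-Type: multipart/form-data; boundary="foo"\r\n\r\n' + data
)

# Map each form field name to its (undecoded) payload.
fields = {
    part.get_param("name", header="content-disposition"): part.get_payload()
    for part in message.get_payload()
}
print(fields)
```

Both fields are returned, with the empty field mapped to an empty string.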
This error is raised even though RFC 2046 does not require the body to end with two CRLF sequences; the grammar quoted above is taken from https://www.rfc-editor.org/rfc/rfc2046.html#section-5.1.1.