Closed ldko closed 3 years ago
Fixed in #599.
Describe the bug
When indexing a WARC file with records containing Content-Type: multipart/form-data (missing "boundary", such as in multipart/form-data; boundary=----WebKitFormBoundaryrdRXu11VSoXKFFBV), the indexing fails at:
https://github.com/webrecorder/pywb/blob/7b51101b040628ce6ceddb7bd79440b03c0081d4/pywb/warcserver/inputrequest.py#L262
with
ValueError: Invalid boundary in multipart form: b''
Steps to reproduce the bug
Download this sample WARC (created with Brozzler) that contains records with Content-Type: multipart/form-data. Try to index the WARC.
Expected behavior
The indexing process should not choke on the inadequate Content-Type header.
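One way to get the expected behavior might be to check for a usable boundary before attempting any multipart parsing, and to skip form parsing for that record when none is present. The following is only a minimal sketch of that idea using the standard library. It is not the change made in #599, and `get_multipart_boundary` is a hypothetical helper name, not part of pywb.

```python
from email.message import Message


def get_multipart_boundary(content_type):
    """Return the boundary of a multipart Content-Type value, or None.

    Hypothetical helper for illustration only; not a pywb function.
    """
    if not content_type:
        return None

    # Let the stdlib header parser split the media type and its parameters.
    msg = Message()
    msg['Content-Type'] = content_type

    if msg.get_content_maintype() != 'multipart':
        return None

    # get_boundary() returns None when the parameter is absent; an empty
    # value (e.g. "boundary=") is also treated as missing here.
    return msg.get_boundary() or None


# The header from the failing records has no boundary, so a caller could
# skip multipart parsing for that record instead of raising.
print(get_multipart_boundary('multipart/form-data'))
print(get_multipart_boundary(
    'multipart/form-data; boundary=----WebKitFormBoundaryrdRXu11VSoXKFFBV'))
```

A caller that receives None could fall back to indexing the record without the POST query fields, so a single malformed header would not abort indexing of the whole WARC.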
Environment