The bug
I ran into a problem last year: when I tried to create or synchronize a challenge containing a large file (e.g. a forensics challenge with a 15 GB disk image), the entire file was loaded into memory before the request started.
This causes crashes, since my computer only has 16 GB of RAM.
The cause
Although the requests module supports body streaming when you pass a file pointer to the data parameter, it is not capable of streaming multipart form-data.
When the requests module prepares the headers, it tries to calculate the Content-Length. To do so, it builds the entire body in memory first.
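Buffering is not inherent to form-data: the Content-Length can be derived from the file size plus the framing bytes, and the body can be produced chunk by chunk. The following is a minimal standard-library sketch of that idea for a single file field. The name `multipart_stream` is hypothetical, the content type is an assumption, and this is not how requests or requests-toolbelt are implemented.

```python
import io
import os
import uuid


def multipart_stream(field, filename, fileobj, chunk_size=64 * 1024):
    """Return (content_type, content_length, chunk_iterator) for one file field.

    `fileobj` must be seekable and opened in binary mode.
    """
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")

    # Measure the file by seeking, so its content is never read just to size it
    start = fileobj.tell()
    size = fileobj.seek(0, os.SEEK_END) - start
    fileobj.seek(start)

    def chunks():
        yield head
        while True:
            chunk = fileobj.read(chunk_size)
            if not chunk:
                break
            yield chunk
        yield tail

    content_type = f"multipart/form-data; boundary={boundary}"
    return content_type, len(head) + size + len(tail), chunks()


# Demo with an in-memory stand-in for a large disk image
content_type, length, body = multipart_stream("file", "disk.img", io.BytesIO(b"\0" * 200_000))
sent = sum(len(chunk) for chunk in body)  # a client would write each chunk to the socket
print(length == sent)
```

At most one chunk of the file is held in memory at a time, while the length announced up front still matches the bytes actually produced.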
The fix
One solution would be to switch to another HTTP client that is capable of streaming form-data.
To modify as little code as possible, I chose instead to delegate the body encoding to the MultipartEncoder from the requests-toolbelt module. This requires a few modifications to the API class, since the MultipartEncoder takes parameters differently from requests.
As a result, files must be sent with a filename hint:
# Before
api.post("/api/v1/files", files=[
    ("file", open("./file.ova", "rb"))
], data={ ... })
# After
api.post("/api/v1/files", files={
    "file": ("file.ova", open("./file.ova", "rb"))
}, data={ ... })
# To send multiple files under the key "file", pass a list (or tuple) of pairs instead of a dict
api.post("/api/v1/files", files=[
    ("file", ("file.ova", open("./file.ova", "rb"))),
    ("file", ("description.txt", open("./description.txt", "rb")))
], data={ ... })
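For reference, here is a rough sketch of how such a call could be translated for the MultipartEncoder. The helper names `to_encoder_fields` and `post_multipart` are hypothetical and are not the actual changes made to the API class. The sketch assumes `data` is a flat dict and `files` is either a dict or a list of `(key, (filename, fileobj))` pairs, as in the examples above.

```python
def to_encoder_fields(data=None, files=None):
    """Merge `data` and `files` into the (name, value) list MultipartEncoder accepts."""
    # MultipartEncoder expects text or bytes values, so scalars are stringified
    fields = [(key, str(value)) for key, value in (data or {}).items()]
    # A list of pairs allows several files under the same key
    pairs = files.items() if isinstance(files, dict) else (files or [])
    for key, (filename, fileobj) in pairs:
        fields.append((key, (filename, fileobj)))
    return fields


def post_multipart(session, url, data=None, files=None):
    """POST a form-data body built by requests-toolbelt (assumed to be installed)."""
    # Imported here so the helper above has no third-party dependency
    from requests_toolbelt import MultipartEncoder

    encoder = MultipartEncoder(fields=to_encoder_fields(data, files))
    # The encoder is file-like, so the body is read piece by piece while sending
    return session.post(url, data=encoder, headers={"Content-Type": encoder.content_type})
```

The Content-Type header has to come from the encoder, because it carries the boundary that the encoder generated.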