theelous3 / asks

Async requests-like httplib for python.
MIT License

Use a bytearray object instead of bytes for reading response (-> quadratic to linear complexity for large responses) #169

Closed palkeo closed 4 years ago

palkeo commented 4 years ago

I noticed that when reading large responses (without using streaming), asks could spend half an hour of CPU time stuck in a loop without making any visible progress.

After profiling, it seems the time is spent in _catch_response, where the code concatenates bytes() objects repeatedly. Because bytes are immutable, Python has to make a new copy of the whole response every time a bit more data is appended. As the response grows, each copy gets larger, so the total work is quadratic in the response size and eats all the CPU.

Storing the response in a mutable bytearray() type completely removed that bottleneck and made everything instantaneous.
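
To illustrate the difference, here is a minimal sketch of the two accumulation patterns. It is not the actual asks code: the function names `join_with_bytes` and `join_with_bytearray` and the chunk sizes are made up for the example, and the timings it prints will vary by machine.

```python
import time


def join_with_bytes(chunks):
    """Accumulate chunks into an immutable bytes object (hypothetical helper)."""
    body = b""
    for chunk in chunks:
        # Each concatenation allocates a new bytes object and copies
        # everything received so far, so total work grows quadratically.
        body = body + chunk
    return body


def join_with_bytearray(chunks):
    """Accumulate chunks into a mutable bytearray (hypothetical helper)."""
    body = bytearray()
    for chunk in chunks:
        # In-place extension; the buffer is over-allocated as it grows,
        # so appending is amortized linear in the total size.
        body += chunk
    return bytes(body)


if __name__ == "__main__":
    # Assumed workload: 1000 chunks of 1 KiB, standing in for socket reads.
    chunks = [b"x" * 1024] * 1000
    for fn in (join_with_bytes, join_with_bytearray):
        start = time.perf_counter()
        result = fn(chunks)
        elapsed = time.perf_counter() - start
        print(f"{fn.__name__}: {len(result)} bytes in {elapsed:.4f}s")
```

Both functions return the same data; only the cost of building it differs, and the gap widens as the number of chunks grows.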

I didn't test this, as the test suite seems to fail completely (in curio, "TypeError: spawn() got an unexpected keyword argument 'report_crash'"). How should I test it?

theelous3 commented 4 years ago

Tests seem to be passing fine on travis anyway: https://travis-ci.org/github/theelous3/asks/builds/717218391

Nice catch :)