pocoproject / poco

The POCO C++ Libraries are powerful cross-platform C++ libraries for building network- and internet-based applications that run on desktop, server, mobile, IoT, and embedded systems.
https://pocoproject.org

Poco Multipart parsing is 10x slower than its Boost/Beast or restinio equivalent #4118

Open datinje opened 1 year ago

datinje commented 1 year ago

This is not necessarily an issue (that is up to you, since there is a second way to implement this with Poco, as described below), but is there a reason Poco checks each element of a REST multipart stream? The consequence is a 10x performance drop versus Boost/Beast or restinio.

Details: We have found one bottleneck in the built-in Poco::Net::MultipartReader: it keeps calling a method that basically does a 'read_new_line' to consume all bytes in the payload of every part. This is slow because every character is compared to '\r', '\n' and '\0'. For a 500-byte JSON part plus a 13 MB binary image part, the performance drop is 10x versus Beast.

Actual code from poco:

As a note, in our test we always provide the Content-Length of the part in the headers after the boundary. This allows a more efficient algorithm (see the attached activity diagram). Besides, clients adhere more closely to the standard when doing so. [image: activity diagram]

Fixing this is very easy thanks to the std::stream API that Poco provides, and in our test the result was similar speed for Poco and Beast.

After this change, performance in our test is on par with the Boost/Beast implementation.
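The code from the original post is not reproduced above, so the following is only a minimal sketch of the idea it describes: when the part headers carry a Content-Length, the body can be pulled from the stream in one bulk read instead of being inspected character by character. The helper name `readPartBody` and the example function are hypothetical and are not part of Poco; the sketch relies only on `std::istream`.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper (not a Poco API): reads exactly `contentLength` bytes of
// a part body with a single bulk read, without looking at individual bytes.
// Returns false if the stream ended before the whole body was available.
bool readPartBody(std::istream& in, std::size_t contentLength, std::string& body)
{
    body.resize(contentLength);
    std::size_t got = 0;
    if (contentLength > 0)
    {
        in.read(&body[0], static_cast<std::streamsize>(contentLength));
        got = static_cast<std::size_t>(in.gcount());
    }
    body.resize(got);  // keep only what was actually read
    return got == contentLength;
}

// Usage sketch: a binary payload containing CR, LF and NUL bytes is copied
// intact, because no byte is interpreted while reading.
std::string exampleBulkRead()
{
    const std::string payload("ab\r\n\0cd", 7);
    std::istringstream in(payload + "\r\n--boundary--\r\n");
    std::string body;
    const bool ok = readPartBody(in, payload.size(), body);
    assert(ok);
    (void)ok;
    return body;
}
```

After the call, the stream is positioned at the CRLF that precedes the next boundary line, so the caller still has to consume and verify that delimiter.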

So can you explain

datinje commented 1 year ago

Sorry, finishing: so can you explain the reason for this element-by-element check?

obiltschnig commented 1 year ago

At least in emails nobody seems to send a Content-Length in the part headers, so we have to read line-by-line in order to detect the boundary. However, it makes sense to add an optimization for the case a Content-Length is provided.
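As a rough illustration of how the two cases could coexist (this is a hypothetical sketch, not Poco's implementation), a reader can take the bulk-read path when a Content-Length is known and fall back to a line-by-line boundary scan otherwise. The function name `readPart` and the convention that a negative length means "no Content-Length header" are assumptions made for this example.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical sketch: returns the body of one multipart part.
// contentLength < 0 means the part headers carried no Content-Length.
std::string readPart(std::istream& in, const std::string& boundary, long long contentLength)
{
    std::string body;
    if (contentLength >= 0)
    {
        // Fast path: the length is known, so read the body in one call.
        body.resize(static_cast<std::size_t>(contentLength));
        std::size_t got = 0;
        if (contentLength > 0)
        {
            in.read(&body[0], static_cast<std::streamsize>(contentLength));
            got = static_cast<std::size_t>(in.gcount());
        }
        body.resize(got);
        return body;
    }

    // General path: scan line by line until a delimiter line is found.
    const std::string delimiter = "--" + boundary;
    std::string line;
    bool first = true;
    while (std::getline(in, line))  // splits on '\n'; a trailing '\r' stays in `line`
    {
        std::string trimmed = line;
        if (!trimmed.empty() && trimmed.back() == '\r') trimmed.pop_back();
        if (trimmed == delimiter || trimmed == delimiter + "--") break;
        if (!first) body += '\n';  // restore the newline that getline removed
        body += line;
        first = false;
    }
    // The CRLF directly before the delimiter belongs to the delimiter, not the body.
    if (!body.empty() && body.back() == '\r') body.pop_back();
    return body;
}
```

Both paths return the same bytes for a well-formed part; the difference is that the general path has to examine every line, which is the cost the report above measures.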

datinje commented 1 year ago

So I understand the reason for this check was to make the code more general, and your improvement is going to improve performance by 10x for this specific case. Thanks a lot! A big thank-you to my colleague Thibault Pierre, who found this in his tests, and to @obiltschnig for the quick reaction!

aleks-f commented 1 year ago

@datinje do you intend to send this as a PR? 1.13.0 is scheduled for Monday.

matejk commented 2 weeks ago

@datinje, can you create a PR for this improvement, please?

Poco 1.14 is going to be released soon and the PR could get included there.