Open datinje opened 1 year ago
sorry finishing : so can you explain the reason of this element by element check ?
At least in emails nobody seems to send a Content-Length
in the part headers, so we have to read line-by-line in order to detect the boundary. However, it makes sense to add an optimization for the case a Content-Length
is provided.
So I understand the reason of this check was to for the code to be more general. And Your improvement is going to improve perf by 10x for this specific case. Thanks a lot ! A big thank to my colleague Thibault Pierre who found that out in his tests and to @obiltschnig for the quick reaction !
@datinje do you intend to send this as PR? 1.13.0 is scheduled for Monday
@datinje, can you create a PR for this improvement, please?
Poco 1.14 is going to be released soon and the PR could get included there.
This is not an issue (well .. up to you - since there is a second way to implement with poco as describe below) , but Is there a reason for Poco checking each element of a REST multi part stream , consequence is 10x performance drop vs Boost/Beast or restinio ?
details : We have found one bottleneck in the built-in Poco::Net::MultipartReader : it keeps calling a method that basically does a ‘read_new_line’ for consuming all bytes in the payload of every parts. This is slow because every character is compared to ‘\r’ ‘\n’ and ‘\0’. For a 500bytes Jason part + 13Mbytes binary image part , performance drop is 10x versus Beast.
Actual code from poco:
As a note , in our test, we always provide the content-length of the part in headers after the boundary. This allows a more efficient algorithm, see the activity diagram attached. Besides clients adhere more to the standard if doing so.
Fixing this is very easy, thanks to the std::stream API that poco provides, and our test resulted in similar speed for poco and beast.
After this change perf in our test is on par with Boost/Beast implementation .
So can you explain