defnull / multipart

A fast multipart/form-data parser for Python
https://multipart.readthedocs.io/
MIT License

New push based (non-blocking) parser #52

Closed by defnull 2 months ago

defnull commented 3 months ago

This PR introduces a new PushMultipartParser that avoids any form of (blocking) IO, which allows it to be used in async contexts. It is also significantly faster (2x to 10x) and less susceptible to certain worst-case inputs.

The old (blocking) MultipartParser API now uses this new PushMultipartParser internally and benefits from all improvements.

See https://sans-io.readthedocs.io/ on why this type of parser is preferable and please test and benchmark this new implementation if you have the time.
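
To illustrate the general idea of a push-based (sans-IO) parser and a blocking wrapper built on top of it, here is a minimal sketch. It is not the code from this PR: `ToyPushParser` and `parse_blocking` are hypothetical names, and the toy only splits a byte stream into CRLF-terminated lines instead of parsing multipart/form-data. The point is that the parser never reads from a file or socket itself; the caller pushes chunks in and decides how the bytes are obtained.

```python
import io


class ToyPushParser:
    """Hypothetical push-style parser that splits input into CRLF-terminated lines.

    It performs no IO: the caller hands over chunks via parse().
    """

    def __init__(self):
        self._buffer = bytearray()
        self.closed = False

    def parse(self, chunk):
        """Accept one chunk and yield every line that is now complete."""
        if self.closed:
            raise ValueError("parser is closed")
        self._buffer += chunk
        while True:
            end = self._buffer.find(b"\r\n")
            if end < 0:
                break  # incomplete line: wait for the next chunk
            line = bytes(self._buffer[:end])
            del self._buffer[:end + 2]  # drop the line and its CRLF
            yield line

    def close(self):
        """Signal end of input; fail if a partial line is left over."""
        self.closed = True
        if self._buffer:
            raise ValueError("unexpected end of input")


def parse_blocking(stream, chunk_size=4):
    """Blocking wrapper: does the reading itself and delegates all parsing."""
    parser = ToyPushParser()
    lines = []
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        lines.extend(parser.parse(chunk))
    parser.close()
    return lines


result = parse_blocking(io.BytesIO(b"alpha\r\nbeta\r\n"))
```

Because the parsing logic is independent of how bytes arrive, the same toy parser could be fed from an async reader without changes; only the small wrapper would differ.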

Still missing:

defnull commented 3 months ago

Hey @cjwatson, care to give this a try and a code review? I'm planning a release with this new parser, and since the old API was not changed at all and remains backwards compatible, it can be a minor release.

theelous3 commented 3 months ago

Hey nice!

I have an old fork of this repo to accomplish the same thing (and some other changes) here

It may or may not be of use in contemplating your update.

defnull commented 2 months ago

@theelous3 I vaguely remember our discussions back then. The main difference is probably that you followed the very common event-based SansIO API design more closely, while I tried to make the API more approachable and less error-prone so it can be used directly by application developers if needed. The typical event-based SansIO API is more targeted at framework developers, I think. It's quite easy to use such APIs in the wrong way (e.g. drop events), and they are more verbose than what I had in mind.
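
As a rough illustration of the two calling styles being compared, here is a toy sketch. Both classes are hypothetical and reflect neither the fork nor this PR; they split newline-terminated lines only. The event style is modelled loosely on the common "feed data, then poll for events" pattern, and the iterator style returns everything produced by a chunk from a single call.

```python
from collections import deque

NEED_DATA = object()  # sentinel: no complete event is available yet


class ToyEventParser:
    """Hypothetical event-style API: feed data, then poll for events."""

    def __init__(self):
        self._buffer = bytearray()
        self._events = deque()

    def receive_data(self, chunk):
        self._buffer += chunk
        while (end := self._buffer.find(b"\n")) >= 0:
            self._events.append(("line", bytes(self._buffer[:end])))
            del self._buffer[:end + 1]

    def next_event(self):
        return self._events.popleft() if self._events else NEED_DATA


class ToyIterParser:
    """Hypothetical iterator-style API: one call yields all results for a chunk."""

    def __init__(self):
        self._buffer = bytearray()

    def parse(self, chunk):
        self._buffer += chunk
        while (end := self._buffer.find(b"\n")) >= 0:
            line = bytes(self._buffer[:end])
            del self._buffer[:end + 1]
            yield line


# Event style: the caller must keep polling until NEED_DATA comes back.
events = ToyEventParser()
events.receive_data(b"a\nb\n")
polled = []
while (event := events.next_event()) is not NEED_DATA:
    polled.append(event[1])

# Iterator style: a single loop (or list()) over parse() covers the chunk.
iterated = list(ToyIterParser().parse(b"a\nb\n"))
```

In the event style, calling `next_event()` only once per `receive_data()` would leave the second line unread, which is one way events can be missed; the iterator style makes that particular mistake harder because the loop over `parse()` drains everything the chunk produced.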

defnull commented 2 months ago

Only thing missing for a merge and release is documentation. Test coverage is 100% by now and performance tests are very promising.