defnull / multipart

A fast multipart/form-data parser for python
MIT License
126 stars 33 forks source link

v0.2, py3 only #16

Closed theelous3 closed 5 years ago

theelous3 commented 5 years ago

The commits are fairly atomic, aside from the giant reformatting pass. Due to this, the diffs are not particularly nice to read. I'd recommend simply reading the files in their new state to review, combined with checking each commit of substance. The commits are readable enough in themselves.

I know you didn't feel there was a need to do a general formatting pass, but it really was quite unpleasant to read so I blacked the code anyway. There were lumps of 20+ lines with varying indentations and amounts of complexity all on top of each other. Same goes for some variable names like ctype. No reason to have content_type shortened to ctype. I still have no idea what nl means, so I'd recommend expanding that one yourself. New Line? Anyway - there were formatting changes. I'm not saying that a pass with black should be a requirement, but just that it was used to give the codebase a spring cleaning.

Automated tools do not see context, and will enforce pep-8 even in situations where it makes the code worse.

In this case, I think there were no cases where the code was made worse.

Things I didn't do but would like, with discussion:

defnull commented 5 years ago

The formatting changes are fine. I usually do not like big formatting patches because it destroys the ability to git blame, but this project is small enough so that does not really matter, and readability really improved in some places. The other changes also look fine. I'll run tests tomorrow and most likely pull. You might want to add a CONTRIBUTORS.md and add yourself to it, if you want :)

As for the other remarks:

theelous3 commented 5 years ago

These can be replaced with normal byte string literals now, in most places.

For the most part I did, yes. Or rather I replaced the to_bytes calls with literals and left the use of the vars in. I didn't want to touch any more than I needed to, because it does work. I'm guessing the tests would need to be changed if we go too deep with a knife here.

Regarding async, if we want to fully have this all nice and memory efficient asyncy, yes, there would be a fair enough refactor required to have it spit out parts.

For now however, I was more thinking of just removing things like save_as so that under the hood, there are no choices being made about io in what should otherwise be an io agnostic tool. Also refactoring away the tempfile stuff entirely, and instead just returning iterable parts and let the user make storage decisions about very large bodies.

Taking bytes and returning objects based on a protocol shouldn't do anything surprising like use disk, regardless of how explicitly named the func is, imo.

Anyway, I won't include anything like that in this PR. Maybe in the future :)

I don't mind about contribution. I'm in the git log anyway :D