cetra3 / mpart-async

Asynchronous Multipart Requests for Rust
Apache License 2.0
32 stars 13 forks source link

MultipartStream stuck on field #7

Closed koivunej closed 4 years ago

koivunej commented 4 years ago

Hi there,

Just tried this crate up; it would be exactly what we'd need over at rs-ipfs/rust-ipfs. Uploads tend to become stuck with high cpu usage as the part stream returns endlessly Some(Ok(b"")).

Sorry for the paste quality; was hoping to PR the fix but I am a bit lost. Test case can be appended to src/server.rs.

Overly large reproducer ``` #[test] fn zero_read() { let input = b"\ ----------------------------332056022174478975396798\r\n\ Content-Disposition: form-data; name=\"file\"\r\n\ Content-Type: application/octet-stream\r\n\ \r\n\ \r\n\ \r\n\ dolphin\n\ whale\r\n\ ----------------------------332056022174478975396798--\r\n"; let boundary = "--------------------------332056022174478975396798"; let mut read = MultipartStream::new(boundary, ByteStream::new(input)); let mut part = match block_on(read.next()).unwrap() { Ok(mf) => { assert_eq!(mf.name().unwrap(), "file"); assert_eq!(mf.content_type().unwrap(), "application/octet-stream"); mf } Err(e) => panic!("unexpected: {}", e), }; let first = block_on(part.next()).unwrap().unwrap(); // this should be b"dolphin\nwhale" (13 bytes, no trailing \n) // but it also contains the end boundary: // "\r\n\r\ndolphin\nwhale\r\n----------------------------332056022174478975396798--\r\n" print!("first: "); for b in first.as_ref() { print!("{:02x}", b); } println!(); println!("first: {:?}", String::from_utf8_lossy(first.as_ref())); // the stream is already exhausted, but this is Some(Ok(b"")) // same for third; the part is now "stuck" let second = block_on(part.next()); println!("second: {:02x?}", second); // this is empty let third = block_on(part.next()); println!("third: {:02x?}", third); // this is empty let fourth = block_on(read.next()); println!("fourth: {:02x?}", fourth.unwrap().unwrap()); // this is error ShouldPollField assert_eq!(first.len(), 8 + 5); // this is correct } ```

~Whoops, posted too early, editing this here in a sec :)~ Added.

I figure there is an issue detecting the final boundary. I've extracted the request body from wireshark so there might still be copy paste issues, but this at least demonstrates the "getting stuck" which definitely happens on https://github.com/rs-ipfs/rust-ipfs/pull/225.

koivunej commented 4 years ago

Quite sure the culprit is that https://github.com/cetra3/mpart-async/blob/master/src/server.rs#L371. If the check fails and this isn't a boundary then we should be returning the bytes up to and including idx and next time look further..

I am quite sure the client is buggy when it's sending \r\n\r\n\r\n following the multipart headers instead of \r\n\r\n but I can easily handle that based on user agent in my quite restricted case. Continuing towards PR and I'll be updating this issue until I have one.

Offtopic, I can create a separate issue but would you also be interested in a PR which removes the use of anyhow in the server.rs bounds? It's use seems quite unfortunate and it can be replaced with std::error::Error at least in the scope of this library, let caller who wants to use anyhow wrap the error in anyhow if they so choose.

koivunej commented 4 years ago

Oki, got my test case at least passing.

With my changes I am now thinking that:

The smaller buffers which I now produce aren't a great idea. To get away from that, I think the whole MultipartParser::poll_next might need to become a loop.

cetra3 commented 4 years ago

Yeah it looks like there are some interactions with having \r inside the body of the part causing some weirdness around driving the buffer.

If you have a PR in the making, happy to accept it! Otherwise I will have a go.

koivunej commented 4 years ago

Creating a PR in just a sec, cleaning it up.

cetra3 commented 4 years ago

@koivunej Thanks for the PR I have merged it & cut a release for you. Please feel free to make other changes (I.e, BytesMut and anyhow as you mentioned) and we will work through them on separate PRs.