Frommi / miniz_oxide

Rust replacement for miniz
MIT License
179 stars 49 forks source link

Support a foolproof(er) higher-level API for streaming #25

Open golddranks opened 6 years ago

golddranks commented 6 years ago

Hi, I have wrapped the streaming (ring-buffer using) API of miniz_oxide behind a higher level API here: https://github.com/golddranks/stream_zipper/blob/master/src/deflate.rs

Is there any interest in supporting such a higher level API within miniz_oxide itself?

I've experimented with a state machine-like API:

pub enum State<'a> {
    HasMoreOutput(ChunkIter<'a>),
    NeedsMoreInput(InputSink),
    Stop(FinishedStream<'a>),
}

The idea is that every function that operates on the state takes self as value, and as such, parsing the stream can proceed as a state machine alternating between states "HasMoreOutput" and "NeedsMoreInput". The states own the internal state, including the ring buffer, which makes this a higher level API as the one provided by miniz_oxide at the moment.

Here's a roughly how it would be used:

        let mut state = start_deflate_stream();

        for chunk in pure_deflate_stream.chunks(500) {
            if let State::NeedsMoreInput(sink) = state {
                state = sink.feed(chunk)?;
            };
            while let State::HasMoreOutput(out) = state {
                println!("Got {:?}", out.get());
                state = out.next()?;
            }
        }

Another experiment I did was to have just InputSink, and then passing a callback that gets the uncompressed output. This is a kind of an internal iterator. Again, here's an example

        let mut state = start_deflate_stream();

        for chunk in pure_deflate_stream.chunks(50) {
            if let State::NeedsMoreInput(sink) = state {
                state = sink.inner_iter(chunk, |out| {
                    println!("Got {:?}", out);
                })?;
            };
        }

What do you think about introducing such a high level API?

golddranks commented 6 years ago

Btw. I can send a PR and work with the maintainers of this library to get it into a shape they want, I just want to know if you are open to such an idea so I won't do needless work.

Frommi commented 6 years ago

Hi. Sorry for the delayed reply. Your State is cleaner than what we have now, I really like it. I think there is a place for the safe wrapper that does not lose streaming functionality. Tomorrow I will have some time to go over it in detail and say something more concrete.

Frommi commented 6 years ago

Okay, I read the code and I think that you are onto something. There are two things that stand out for me:

So, a bit more generally:

I think the best base abstraction in this case will be this:

Then there are quality-of-life things like execution-till-next-input or iterator instead of input closure, etc.

What do you think? Does this makes sense?

golddranks commented 6 years ago

Sorry for the delayed reply. I'm on a vacation at the moment, I'll come back to this later when I have time!

golddranks commented 6 years ago

Just two quick things: I called it inner_iter because there may be multiple outputs for single input and you have to iterate through them before feeding it more input. (Imagine having chunks of one 100 kilobytes, for example!)

The reason why I'd like to avoid callback-style closure in the "basic" hi-level API is that they obstruct error handling because you can't effectively use the ? operator inside them.

Frommi commented 6 years ago

The point about the ? operator is very good. I now think that your first approach with API is best and I can't see ways to improve it. I thought about maybe somehow statically make sure that get was called before next in HasMoreOutput case, but there is no way to do it not extremely clunky.

golddranks commented 6 years ago

Do you want a PR then? I'm afraid that I'm currently quite busy, but I can try and work with it little by little?

Frommi commented 6 years ago

It's okay, I am mostly free on the weekends. Most of the work is already done anyway.

oyvindln commented 4 years ago

Hoping we can look into this again, the current rust API is a bit inconsistent and was quickly cobbled together to provide a basis for a C API mirroring miniz. It's not all that easy to use, and it would be nice to have an alternative to using it through flate2 in areas with no std-support.

The original post have one suggestion which may work for how the streaming part could look, maybe there are other suggestions, or some existing API designes out there that would work as well.