gendx / lzma-rs

An LZMA decoder written in pure Rust
MIT License
127 stars 27 forks source link

Expose decoder/encoder primitives as public #72

Closed chyyran closed 2 years ago

chyyran commented 2 years ago

I maintain a Rust implementation of MAME's Compressed Hunks of Data format, which is essentially chunks of data compressed using various compression algorithms, LZMA included. One of the quirks of this format is that because they use a very old LZMA 19.0, the encoding parameters are not saved into the output stream, and therefore every CHD decoder has to essentially mimic the defaults of LZMA 19.0.

Thankfully lzma-rs allows this but since the encode and decode modules aren't public, I had to fork and vendor lzma-rs to access those primitives like LzmaParams in my crate. Additionally I added LzmaParams::new to construct an instance manually for this purpose.

Ideally I would like to just be able to use mainline lzma-rs. Since decoder is mostly documented already, would it be feasible to expose those primitives as public without much work, possibly behind a feature flag?

gendx commented 2 years ago

Since decoder is mostly documented already, would it be feasible to expose those primitives as public without much work, possibly behind a feature flag?

It surely would be feasible! Could you send a pull request for review?

I don't think a feature flag is needed here - just make sure that the newly exposed API is documented and that the GitHub workflows pass.

chyyran commented 2 years ago

I did some internal refactoring to expose the raw decoder as a primitive where the LZMA compression parameters are encoded aas const generics. I also did a similar wrapper for LZMA2 although I'm not too sure about the usefulness of that in my fork.

I believe this is a better approach than just exposing LzmaParams and I was able to expose this API without changing the existing public API, but at the cost of having to make some relatively sizeable internal changes, particularly with DecoderState to also allow reusing the existing allocation.

If you could take a quick look to see if you're comfortable with those changes and the shape of the LzmaDecoder and Lzma2Decoder APIs I'd be happy to clean it up slightly and PR it upstream.

gendx commented 2 years ago

Some comments:

Feel free to break down your changes into multiple PRs (exposing LzmaParams, improving the reset implementation, adding some const generics, etc.) to make it easier to review.

chyyran commented 2 years ago

With that said I'd be happy to split decoupling output from DecoderState into a separate PR since that doesn't affect any public APIs and build on top of that.