luizirber / niffler

Simple and transparent support for compressed files.
Apache License 2.0

Allow user to choose which backend library is used #24

Closed: natir closed this issue 2 years ago

natir commented 4 years ago

Users may want to control which backend library is used.

A solution could be to create many features, with names like {compression format name}-{crate backend name}.

For example, if the features gzip and gzip-flate2 are enabled, niffler uses the flate2 crate to handle gzip data. We would need to define a default backend for each compression format.

This system would make the code more complex and greatly increase the number of features, and there would probably be redundant code between features.

But it's the only system I can think of that allows this functionality.

This issue is linked to #21.

luizirber commented 4 years ago

Something to keep in mind: we need to define features in a way that doesn't make them conflict with each other. Sometimes people define features to select a specific backend (tokio or async-std, for example), but that usually fails with cargo test --all-features because the choice was defined as an "exclusive feature", not an "inclusive feature".

More info: https://doc.rust-lang.org/cargo/reference/features.html#rules

A Cargo subcommand that helps with CI and testing all feature combinations: https://github.com/frewsxcv/cargo-all-features/

natir commented 4 years ago

We can define a priority order between backends: if the user activates both the gzip-flate2 and gzip-wasm-flate features, we use the flate2 backend.

What the right backend order should be is a very good question.
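A minimal sketch of that priority rule, to show that it stays compilable when both features are enabled (as cargo test --all-features would do). The GzipBackend enum and both function names are hypothetical and not part of niffler, and the order chosen here (flate2 first) is only an assumption for illustration:

```rust
/// Backends that could decode gzip. The variants are illustrative only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum GzipBackend {
    Flate2,
    WasmFlate,
}

/// Pure priority rule, kept separate from `cfg!` so it can be unit-tested.
/// Returns `None` when no gzip backend feature is enabled.
pub fn pick_gzip_backend(flate2: bool, wasm_flate: bool) -> Option<GzipBackend> {
    if flate2 {
        // Highest priority: wins even if other backends are also enabled.
        Some(GzipBackend::Flate2)
    } else if wasm_flate {
        Some(GzipBackend::WasmFlate)
    } else {
        None
    }
}

/// Resolve the backend from the Cargo features enabled at build time.
/// The feature names follow the `{format}-{backend}` scheme from this thread;
/// the lint is silenced in case they are not declared in Cargo.toml.
#[allow(unexpected_cfgs)]
pub fn gzip_backend() -> Option<GzipBackend> {
    pick_gzip_backend(
        cfg!(feature = "gzip-flate2"),
        cfg!(feature = "gzip-wasm-flate"),
    )
}
```

Because enabling an extra feature only changes which branch is taken, the features stay additive and never produce a compile error when combined.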

luizirber commented 4 years ago

Another situation where exposing backends as features is useful: https://github.com/onecodex/needletail/pull/45

natir commented 4 years ago

The more I think about it, the more I think that the feature system might not be the best one.

The feature system that I proposed above may be efficient, but it would also be very complicated to implement. niffler would also always impose the version of the backend library it uses, and a user who wants to add another backend would have to wait for niffler to integrate it.

I thought of a system where niffler would call function pointers. It would allow users to write their own function to handle the compression they want. But it would require either a global niffler variable that stores which function to call for each format, or an API that looks like this:

get_reader(stream, Option<function_for_gzip>, Option<function_for_bzip>, Option<function_for_lzma>, …)

But it would also have a small performance impact because there would be an additional pointer indirection.
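A rough sketch of what such a function-pointer API could look like, using only the standard library. This is not niffler's API: the Decoder alias, this two-format get_reader signature, and stand_in_decoder are all hypothetical, and a real decoder would wrap the stream with an actual gzip or bzip2 crate.

```rust
use std::io::{self, Cursor, Read};

/// A user-supplied decoder: wraps a compressed stream in a decompressing reader.
pub type Decoder = fn(Box<dyn Read>) -> io::Result<Box<dyn Read>>;

/// Hypothetical `get_reader`: detects the format from the magic number and
/// dispatches to the decoder the caller registered for that format.
pub fn get_reader(
    mut stream: Box<dyn Read>,
    gzip: Option<Decoder>,
    bzip2: Option<Decoder>,
) -> io::Result<Box<dyn Read>> {
    // Read up to three bytes to look for a magic number.
    let mut magic = [0u8; 3];
    let mut filled = 0;
    while filled < magic.len() {
        let n = stream.read(&mut magic[filled..])?;
        if n == 0 {
            break; // the stream is shorter than the magic number
        }
        filled += n;
    }

    // Put the consumed bytes back in front of the rest of the stream.
    let head = magic[..filled].to_vec();
    let stream: Box<dyn Read> = Box::new(Cursor::new(head).chain(stream));

    let decoder = match &magic[..filled] {
        [0x1f, 0x8b, ..] => gzip,          // gzip magic number
        [b'B', b'Z', b'h'] => bzip2,       // bzip2 magic number
        _ => return Ok(stream),            // not recognised: return the raw bytes
    };

    match decoder {
        // This call through a function pointer is the extra indirection.
        Some(decode) => decode(stream),
        None => Err(io::Error::new(
            io::ErrorKind::Unsupported,
            "no decoder registered for this format",
        )),
    }
}

/// Stand-in decoder for illustration only: it ignores its input and
/// yields fixed bytes instead of really decompressing anything.
pub fn stand_in_decoder(_compressed: Box<dyn Read>) -> io::Result<Box<dyn Read>> {
    Ok(Box::new(Cursor::new(b"decoded".to_vec())))
}
```

A caller would pass its own functions, for example get_reader(stream, Some(stand_in_decoder), None), and the argument list grows by one Option per supported format, which is the API cost mentioned above.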

I've also thought about replacing the API with a macro system: users could write their own macros to decide how to handle the files themselves. I need to learn more about the Rust macro system before writing a draft of this idea.

luizirber commented 4 years ago

Another idea: recommend "advanced usage" of niffler if the user doesn't agree with our choice of dependencies/backends?

Since niffler::sniff is exposed, we could add to the docs something like:

If you want to use another crate for decompression, you can still use niffler::sniff to detect the format and then use the library you want. For example:

let stream = ...; // Anything supporting `io::Read`
let (reader, format) = niffler::sniff(Box::new(stream))?;
if let niffler::compression::Format::Gzip = format {
    // parse `reader` with your favorite Gzip crate
}

natir commented 4 years ago

This solution is simple, and we don't need to write more code. I like it!