dropbox / rust-brotli

Brotli compressor and decompressor written in rust that optionally avoids the stdlib
https://dropbox.tech/infrastructure/-broccoli--syncing-faster-by-syncing-less
BSD 3-Clause "New" or "Revised" License
812 stars 83 forks source link

Convert to bool BasicHashComputer::use_dictionary and proper case #171

Open nyurik opened 6 months ago

nyurik commented 6 months ago

Since this is a breaking change, refactoring the naming as well (perhaps should rename other trait methods for consistency?)

nyurik commented 4 months ago

hi, is this ok to merge?

danielrh commented 4 months ago

I don't understand why this change alters USE_DICTIONARY but not the other 3 in the trait. I think we should either adjust all of them or pause until we can.... Also can we make this trait private--does it need to be public?

nyurik commented 4 months ago

@danielrh I agree that naming should be consistent with the rest of the trait (and actually everywhere else in the crate too). Also, I don't want to do breaking changes too often (which result in the new major releases which doesn't get updated too easily).

I propose we:

What do you think?

danielrh commented 4 months ago

I think its' fine to merge the repos if that makes it easier. We need to keep the crates separate... otherwise you could end up with massive code duplication consider this scenario

Crate A requires Brotli Decompression only Crate B requires Brotli Decompression and Brotli Compression

if crate C requires both Crate A and Crate B, then there will be 2 copies of the decompressor under your plan. That means two copies of the dictionary, etc, and straining of the cache. I know several users of this crate that would effectively fork and stick to the old versions if we tried to do that. Lets keep the arrangement as is, but if we want to move things into an atomic repo we can...

The APIs were designed to reflect the public C brotli headers--so we may be able to remove fewer of them than we hope

nyurik commented 4 months ago

@danielrh unless I am mistaken, if both A and B creates use the same version of brotli, there will be only one "code" instance. That's why cargo tries to pick dep version that matches the requirements of all crates. Only when it fails, it would add multiple dependency versions - e.g. brotli v5 and v6 being used by A and B respectively, thus ending up with two duplicated code bases (although there is a chance linker may fix some of that).

Also, the feature flags is "additive" - the dependency will compile with the sum of all features enabled by all crates. So if we make encoder available as a feature (on by default) --- if A uses no-defaults (minimal decoder), and B uses default feature (decoder+encoder), Brotli will compile with both decoder+encoder features -- but it will compile only once, and will be used by both A and B.

P.S. tracking this in https://github.com/dropbox/rust-brotli/issues/209

danielrh commented 4 months ago

Yes you are correct--I still would like to keep the crates distinct...I think it helps people consider including the decompressor as its own thing. If you want to build up a set of commands to merge the repos I can run it and merge that much---while still keeping them separated logically as crates.

nyurik commented 4 months ago

@danielrh it seems there is no easy way to merge the two repos because of something really unexpected (to me) -- both repos started as forks of one, and both had lots of changes done to the same files, having some of them duplicated. I think the easiest would be to simply copy all files in the decompressor repo into /decompressor dir, and just add a comment that the older history of those files are in a different repo.

There might be some other way to do this and preserve the history, but I can't think of one... maybe stockoverflow will know