dropbox / rust-brotli

Brotli compressor and decompressor written in rust that optionally avoids the stdlib
https://dropbox.tech/infrastructure/-broccoli--syncing-faster-by-syncing-less
BSD 3-Clause "New" or "Revised" License
818 stars 83 forks source link

Prevent exposing internal APIs as public #209

Open nyurik opened 5 months ago

nyurik commented 5 months ago

Currently this crate exposes every internal implementation detail as public API, making it almost impossible to do proper releases -- any internal change is seen as a public API change - thus breaking semver guarantees, forcing brotli crate to be published with a major version change. This fragments the community because cargo is unable to simply pick the later version of the crate for all usages in the same project.

To solve this, I would like to go through all of the APIs, and mark ALL of them as internal (crate-level), and then re-enable just the ones we explicitly want to publish. I am not sure if there is a tool for this, so I might have to go to the biggest users of the crate and see what APIs they use.

image

nyurik commented 5 months ago

I have tried to replace all pub with pub(crate) to see what would break, so that brotli crate can have minimum public API. It seems there are several problematic areas with different solutions.

Re-exporting brotli_decompressor

The dropbox/rust-brotli-decompressor crate is a separate git repo. Some decompressor API gets pub re-exported, and likely has broader surface than needed. Re-exporting traits/structs cannot limit which functions in an impl get re-exported - so we end up with a full API. I propose to move brotli-decompressor code into this repo as the first step, possibly following by joining two crates into one. The benefits:

Using internal API in bin tests and utils

Many internal APIs seem to be available to the bin/* executables, e.g. concat::BroCatli and brotli::enc::threading::AnyBoxConstructor in bin/utils.rs. I am not certain if some of these have legitimate end-user usage, but it seems a large number of them is mostly used for research and integration testing. As such, they should not be exported as pub, but instead rely on tricks like this to re-export all internal API to integration tests when compiled for tests. To solve this, we need to refactor all bin/* files - making sure to only keep the real end-user API. In the mean time, we may try to use a similar trick to create feature-gated "internal api" re-exports.

#[cfg(test)]
mod test_helpers {
    pub use super::internal::*;
}