Closed GnomedDev closed 6 months ago
Thanks for opening an issue! Yes, binary size is a concern for this library, but only with LTO (because it heavily relies on the actual encoding being known at compile time). Are you measuring bloat with full LTO enabled? If yes, could you provide me reproduction steps? There wasn't any heavy benchmarks in that regard and also no CI to check for regressions, so might be a valid issue :-/
The reproduction I have is that project, but I don't have enough energy to look into this much further. It uses thin lto, but it's probably a bad idea to rely on LTO in the first place. My suggestion would be to feature-lock each encoding to allow dependents to only enable what they need, avoiding the need to rely on LTO.
I'm back from vacation and could take a look. However I'm not able to reproduce. I tried cargo bloat --release
in your repo but I only see:
0.0% 0.1% 16.1KiB data_encoding data_encoding::Encoding::encode_mut
Could you provide me with exact instructions to reproduce (including commit hash). I have an idea that could easily fix the issue but I want to confirm it actually fixes your problem first.
Thanks!
Thanks! I was able to reproduce on your project with cargo bloat -p route-weaver-router --release
. I'll take a look when I get time. I'm quite busy (and a bit sick) at the moment. I'll ping the issue when I have something.
Sorry for the long delay. I tried to #[inline(always)]
the API but the dead code is still not eliminated, so I went with the manual solution of providing features. In the route-weaver-router
reproduction case, this means changing the dependency from:
[dependencies]
data-encoding = "2.5"
to:
[dependencies.data-encoding]
version = "2.6" # not released yet
default-features = false
features = ["std", "base16"]
The code size measured with the text column of cargo size -p route-weaver-router --release
reduces from 2029548 to 1965156 bytes (64392 less). The data-encoding functions measured with cargo bloat -p route-weaver-router --release --crates -n50 | grep data_encoding
reduce from 72.2KiB to 12.6KiB (59.6KiB less).
It is probably possible to do better but it would require quite some refactoring. Would this change be enough for you?
The change is in #101 and I'll keep it open until I decide whether it should be a breaking change or not. It is technically one for those who disable default features.
Actually, I'll probably postpone submitting that PR until I experiment with another idea that would bring much more benefit. I'm just afraid that it might make the API a bit less usable by having many type parameters (a little bit like RustCrypto does). Because if I'll have to make a major version bump, then let's try to make the most out of it.
Ok, my experiment using StaticEncoding<B: Static<usize>>
(and making Encoding
a dyn object) seems to provide similar benefits, but doesn't need features. It might still be a breaking change because I'll have to change the constants from const
to static
. This also would need more work, so not sure how long it will take.
I don't think so. The current design assumes inlining and dead-code elimination to work cross-crate. This seems to work rather well with LTO but not otherwise. The new design I have in mind (I didn't fully test it though) doesn't rely on inlining by adding one layer of indirection (which can be optimized by inlining) and simplifies dead-code elimination by introducing a symbol for each combination, so the linker can always do it regardless of LTO.
Ok, I got some time to look into this again. I've created the v3-preview
branch for now. It will eventually be merged in the main
branch. It essentially provides a v3-preview
feature that provides a v3_preview
module. This module would eventually be the 3.0.0 of data-encoding
(but for now is unstable and uses a higher MSRV).
I did the following change:
diff --git a/common/Cargo.toml b/common/Cargo.toml
-data-encoding = "2.5"
+data-encoding = { git = "https://github.com/ia0/data-encoding.git", branch = "v3-preview", features = ["v3-preview"] }
diff --git a/common/src/noise.rs b/common/src/noise.rs
-use data_encoding::HEXLOWER_PERMISSIVE;
+use data_encoding::v3_preview::HEXLOWER_PERMISSIVE;
For the following measurements:
cargo size | cargo bloat | |
---|---|---|
before | 2093644 | 72.1KiB |
after | 2017828 | 219B |
diff | 75816 | 73611 |
Where "cargo size" is cargo size -p route-weaver-router --release
and "cargo bloat" is cargo bloat -p route-weaver-router --release --crates -n100 | grep data_encoding
.
So this looks pretty good so far. I'll have to finish adding all features to the v3_preview
module and see if I want to add other things. I'll ping here when the v3-preview
branch makes it to main
, then I'll wait a month or so and do a release, at which point I'll probably close this issue.
This is now merged in main
. It will eventually be part of 3.0.0 (tracked by #106) but will probably be released as a "preview" module too (to catch possible usability issues).
Hello, I've just ran cargo-bloat on my project as I'm working on reducing compile times and binary sizes can be a proxy for compile times, and found that
data_encoding::Encoding::encode_mut
is taking up76.9kb
, the third largest function in my entire project (that pulls reqwest, tungstenite, etc).If code size isn't a concern for this project, I'm entirely fine with this being closed, but thought that you might want a heads up that this library is somewhat code-size heavy.