ipetkov / crane

A Nix library for building cargo projects. Never build twice thanks to incremental artifact caching.
https://crane.dev
MIT License
900 stars 84 forks source link

`dontStrip = true` increases the closure size by leaving references to build utils #428

Open dpc opened 11 months ago

dpc commented 11 months ago

From the original report

Sorta unrelated and not really a crane issue imo, but I did notice that with dontStrip = true binaries tend to leave behind references to build-time dependencies like gcc and rust-std which significantly increases the closure size of the output. This is especially common with statically built C dependencies (via cargo). You can work around this issue with separateDebugInfo = true, but it's not as convenient for debugging.

by @simonzkl

I think it's worth promoting to a separate issue.

ipetkov commented 11 months ago

I believe nixpkgs has a removeReferencesTo (forget if it's a hook or a package) which can be used to remove arbitrary references to things

The only assumptions we currently make is removing references to the sources we've vendored (which should be mostly safe since Rust projects usually don't rely on being able to find their own sources to things). It would be a lot riskier to try to remove references to something like gcc automatically (in case something is being dynamically linked to it?)...

Definitely wouldn't mind documenting this though!

szlend commented 11 months ago

Conceivably you could removeReferencesTo everything in nativeBuildInputs -- buildInputs. It would be nice if it could be done in a single pass as vendored deps, as doing this over and over again is slow. I don't think you'd want to do this by default though. So if there was a way to specify extra deps you wanted to remove in the removeReferencesToVendoredSources hook, that would be nice. Though you'd probably want to call it something else.

dpc commented 11 months ago

We only need to call it on the resulting binaries no? At least with use-zstd the ./target itself does not hold on to any outside references.

szlend commented 11 months ago

Yeah, just on resulting binaries I think. But removeReferencesTo can only remove a single reference, so if you wanna get rid of all nativeBuildInputs (plus probably propagated ones), it will probably have to do a lot of passes + darwin signing each time. I believe this was already an issue in https://github.com/ipetkov/crane/issues/161, that's why there's a custom implementation in crane.

ipetkov commented 11 months ago

Couple of thoughts:


Regarding efficient removeReferencesTo: I definitely agree it would be beneficial to have an optimized version which can do a bulk removal. The version in nixpkgs is pretty slow (it will remove one input at a time all single-threaded). Whether this is done in crane or upstream (for broader benefit) is worth discussing I suppose...


Regarding behavior when dontStrip = true: IMO if the caller sets this they should opt into exactly what they want to happen to the final result. The default stripping will take care of removing debug info and symbols and what not (which I suspect indirectly removes references to things like gcc and rust-std since they probably show up in debug info). But if someone decides that they know better and want control over what happens, I don't think that crane should still try to strip references to things, esp native/build inputs.

(e.g. How would we actually know that a build input isn't actually a dylib we need at runtime? Questions like that are really easy to answer at the context of a specific project, but really hard for us to make assumptions about here, which is why I lean towards having the actual project opt into what happens if dontStrip = true; is set)

szlend commented 11 months ago

Yeah I definitely agree that crane shouldn't do anything here by default. My thinking was that we could extend the removeVendoredSources functionality to remove other manually specified references as well. I know you can already do this, but not in a single pass. Though upstreaming the optimized solution to nixpkgs probably makes more sense now that you mention it.