memorysafety / zlib-rs

A safer zlib
zlib License
122 stars 10 forks source link

Please use cargo-c to produce the cdylib #174

Open lu-zero opened 2 weeks ago

lu-zero commented 2 weeks ago

I can provide you with a PR, cdylibs do require some arch specific logic and cargo-c streamlines the whole process and provides also a pkg-config file and a platform-correct way to install the package.

folkertdev commented 2 weeks ago

That's very interesting. I do want to keep the ability of being able to build just with cargo itself but can set cargo-c up quickly that would be helpful.

some notes

lu-zero commented 2 weeks ago

That's very interesting. I do want to keep the ability of being able to build just with cargo itself but can set cargo-c up quickly that would be helpful.

some notes

* "Do not specify the crate-type, cargo-c will add the correct library target by itself." clashes with still being able to use standard cargo. Will anything fail if we continue to specify `cdylib` as the crate type?

Nothing should fail, but you can use cargo-c directly on the libz-rs-sys crate. One of the use-case of cargo-c is to run it directly on the main crate and use the capi feature to automatically add the C API, but nothing prevents it to be used on a dedicated crate.

* I don't think we need to generate the `.h` file: the whole point is that the standard zlib.h file should work. In that case, can we skip the `cbindgen.toml` configuration? (we'll verify our API compatibility in other ways).

Yes, you can pass a pre-made header file.

* is there a comprehensive list of the sort of detail that `cargo-c` takes care of? is https://dev.to/luzero/building-crates-so-they-look-like-c-abi-libraries-1ibn still up to date? (plenty of things changed, e.g. about symbol mangling since that post)

The README should cover everything and should be up to date.

Maybe it is time to write another blogpost :)

folkertdev commented 2 weeks ago

cool, that mostly looks good, however, in our case, we do need the extra crate that generates the cdylib because it configures different features (notably it uses malloc/free instead of including the global rust allocator for allocation). That crate is also a convenient place to put documentation for how to build the dynamic library.

I can also just do that myself though, this setup is pretty simple. thanks for that!

Finally, from a supply chain perspective, how can we be sure that cargo c doesn't mess with the contents of our dynamic library (e.g. includes some nefarious code)? I have no reason to believe that it does, the rustls-ffi project using it boosts my confidence, but is this something you've thought about?

lu-zero commented 2 weeks ago

Depends what is the scenario you have in mind and how you'd verify.

If you wonder if you can compare the libraries produced with the output of cargo, you can (by passing the right linker flags). cargo-c mainly streamlines what cargo can do with some fair amount of effort, so what you can use to ensure you aren't using a compromised cargo can be used to ensure the cargo-c isn't compromised likewise.

In theory you can build the libraries with cargo-c and cargo, disasm them and diff the outputs and confirm what's added is only the soversion and such, as long the code built is reproducible.

The code is relatively straightforward, so people willing to inspect it and check the differences between releases can do that quite quickly :)

If you have suggestions in this regards I'm all ears :)

folkertdev commented 2 weeks ago

hmm yeah I'll have to discuss this with some colleagues.

In any case, I think a step in the right direction would be to actually split the core of the logic from the parsing of the input, much like https://github.com/lu-zero/cdylib-link-lines/blob/master/src/lib.rs already tries to do (although I'd want just theshared_object_link_args part, and then do any IO myself).

Lots of code here https://github.com/lu-zero/cargo-c/blob/master/src/build.rs (which I believe is the majority of the logic) actually deals with parsing inputs, providing defaults if they don't exist and erroring out when invalid.

Having a pure (as in no side-effects like reading environment variables, writing to files or executing shell commands) library is much easier to review than lots of IO and validation mixed with the logic. It would then also properly be possible to write your own build.rs with that easily-verified side-effect free library.

sdroege commented 2 weeks ago

cargo-c is not very different from the code you'll find inside cargo itself, or rustc's linking logic for that matter. Like every build system it's a mess of environment variables, shared mutable state and side effects, workarounds for strange platform behaviour and toolchains, etc. but the point is that it's all isolated in a single place instead of you and everybody else having to re-invent it from scratch every time.

Ideally what cargo-c does should be part of cargo (e.g. cargo can't create proper C shared libraries by itself because it's missing versioning information, and also can't install them unlike executables) but the cargo project considered this out of scope for now and that's why cargo-c exists on top of cargo to put all these annoying details in a single place.

So your choices here are IMHO

FWIW, we're successfully using cargo-c since quite a while in GStreamer, and e.g. librsvg is also using it (to produce a C-style library of something that was originally written in C), my port of ebur128 uses it for a drop-in replacement of a C library (and like in your case it uses the original header instead of generating one via cbindgen).

Many Rust projects that provide some kind of C library but don't use cargo-c are often hard to use because they're missing one or another of the required steps on top of cargo's defaults on some platform or another.