Open DemiMarie opened 3 years ago
We have considered Rust already, but we have decided to use C++ (see the dnf-5-devel branch in libdnf).
@dmach would it be possible to document the reason for this? IMO, there are significant wins to using Rust, especially when it comes to RPM parsing. The RPM parser in librpm has a history of issues, and I have written a parser in Rust to avoid them.
@DemiMarie honestly, you just suggested rewriting the whole project code base from scratch in a different language, and for description you've provided a single conditional sentence. Most of the major considerations should be obvious and not need to be reiterated here, but let's discuss one major point:
Rust AFAIK (my knowledge is certainly incomplete and several years old) has issues with dynamic linking (I don't have time to research the current state of affairs) and libdnf is a system library. What interface exactly would we provide (Rust / C / C++), and how would the binaries using it link against it? How would we generate the language bindings for which we now use SWIG?
This is of course just a theoretical discussion, as the TL;DR for the rest of the reasons is: It's just not feasible.
> @DemiMarie honestly, you just suggested rewriting the whole project code base from scratch in a different language, and for description you've provided a single conditional sentence. Most of the major considerations should be obvious and not need to be reiterated here, but let's discuss one major point:
@lukash It seems like my question came across as “please rewrite it in Rust”, which wasn’t the intent (and would be very rude to request). That was my fault; I should have phrased the question better. The reason I ask is that I have some libraries (not yet open source, but will be soon) for OpenPGP signature and RPM package validation, and which I believe would be useful in libdnf. This provides some defense-in-depth against future vulnerabilities in GPG and RPM.
These libraries are written in Rust, and can easily be made to expose a C API and ABI for use by libdnf. libdnf would continue to be written in C and C++, and would link to these libraries as if they were written in C. That these libraries are written in Rust is an implementation detail ― they can be linked against and called in the same way as any C library can. I would be willing to do the work to integrate them into libdnf, but I would prefer to only start if there is a chance that the resulting PR would be accepted.
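As a rough illustration of the "Rust behind a C API" approach described above, here is a minimal sketch. The function names are hypothetical and are not taken from the actual libraries; the only format fact it relies on is that an RPM file starts with the four lead magic bytes `ED AB EE DB`. The safe function holds the logic, and a thin `extern "C"` wrapper exposes it to C or C++ callers.

```rust
use std::slice;

/// Magic bytes at the start of an RPM package file (the "lead").
const RPM_LEAD_MAGIC: [u8; 4] = [0xED, 0xAB, 0xEE, 0xDB];

/// Safe core: the actual check lives in ordinary safe Rust.
pub fn has_rpm_magic(data: &[u8]) -> bool {
    data.starts_with(&RPM_LEAD_MAGIC)
}

/// C-callable wrapper (hypothetical name). A matching C prototype would be:
///
///     int32_t rpm_sketch_check_magic(const uint8_t *data, size_t len);
///
/// Returns 0 if the magic matches, 1 if it does not, -1 on a null pointer.
///
/// # Safety
/// `data` must be null or point to at least `len` readable bytes.
// Toolchains older than Rust 1.82 spell this attribute `#[no_mangle]`.
#[unsafe(no_mangle)]
pub unsafe extern "C" fn rpm_sketch_check_magic(data: *const u8, len: usize) -> i32 {
    if data.is_null() {
        return -1;
    }
    // The only unsafe step: trusting the caller's pointer/length pair.
    let bytes = unsafe { slice::from_raw_parts(data, len) };
    if has_rpm_magic(bytes) { 0 } else { 1 }
}
```

Built as a `cdylib` or `staticlib`, a crate like this is linked the same way as a C library, and the caller only ever sees the C prototype.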
@DemiMarie we consider moving signature verification code to an external library a good idea. On the other hand, we don't want to maintain the library in RHEL (and we would have to, because we'd be the only users of that library). Once the library is available, we could ask the Red Hat teams that work on crypto libraries if they wouldn't maintain it for us.
I don't know if the library could replace gnupg (which executes gnupg2 binaries - and nobody likes that), but if that's the case, more projects could be interested in it and convincing crypto people to support it for us might be easier.
Reopening and changing subject of the ticket.
Just for the record: I've discussed this issue with @pmatilai and @dmnks from the RPM team and they said they would prefer a different language than Rust in order to keep the bootstrap tree reasonably small. Implementing this in Lua which is integrated with RPM already might make more sense from this perspective.
> Just for the record: I've discussed this issue with @pmatilai and @dmnks from the RPM team and they said they would prefer a different language than Rust in order to keep the bootstrap tree reasonably small. Implementing this in Lua which is integrated with RPM already might make more sense from this perspective.
That’s understandable, if a bit unfortunate from my perspective. Lua is quite good at logic, but plain Lua (as opposed to LuaJIT) is not very good when it comes to processing massive amounts of binary data.
It would probably make sense to make it a C++ library, as that has a relatively wide usability and allows for a decent degree of reuse.
C++ isn’t memory safe, though. See Chromium’s “doom zone”: parsing untrusted input in a memory-unsafe language with high privilege is a bad idea. Rust avoids this by virtue of being memory safe.
C++ can be written in a similar form to Rust, it's just considerably harder if you're not using those features in the compiler (as Google does not for various reasons). One of the reasons why the DNF 5 project is using C++17 with extremely strict compiler flags is to increase the quality of the codebase as a whole. Also, in general "memory safe" is not well-defined or a useful phrase to use here, especially when Rust crates can still have buffer overflow CVEs.
When you're trying to get people to write "secure code", telling people to rewrite in Rust is not only not productive, it damages your credibility by making it sound like you're being lazy about solving the problem. I know that's not the case with you, but it's really easy for people to perceive it that way.
Alright, I'll bite :grin: let's discuss.
@DemiMarie:
> C++ isn’t memory safe, though. See Chromium’s “doom zone”: parsing untrusted input in a memory-unsafe language with high privilege is a bad idea. Rust avoids this by virtue of being memory safe.
Right. But you have to consider the whole picture and evaluate pros and cons. The fact that the library carries the burden of having the Rust compiler ecosystem as a build dep, as well as the added complexity of the C/C++ API wrapper etc. quite outweighs the benefits in a lot of contexts, for a library that (I assume) is meant to be lightweight and low-level. Especially since you can ensure very reasonable safety with recent C++ standards and clean, well-written code.
FWIW I'd be very happy if Rust made its way into low-level system libraries and tools. Just not sure it's the time and place here and now. But, if you can make it happen, more power to you :slightly_smiling_face:
@Conan-Kudo:
> C++ can be written in a similar form to Rust, it's just considerably harder if you're not using those features in the compiler (as Google does not for various reasons).
True to an extent (as I've mentioned above), but the distinction is Rust guarantees memory safety by design, while in C++, if you adhere to a lot of rules and are very clean with your code, you have a decent certainty, but that's all.
> ... Also, in general "memory safe" is not well-defined or a useful phrase to use here, especially when Rust crates can still have buffer overflow CVEs.
I don't know the actual exact definition of "memory safety" we're talking about here, but I'd say Rust's design does define it, i.e. the ways in which it is memory safe (and it covers most, if not all, common memory handling issues that can occur in C/C++). The CVE is likely unfair, it is from what I can see a bug in Rust itself (or perhaps an `unsafe` code block in the crate; I didn't look into the details, but my understanding is Rust's design does prevent these errors under normal circumstances).
TL;DR while theoretically Rust is the better choice, we need to consider the practical aspects. A library can be as cool and safe as it gets, but if there are practical problems with its adoption, it can very easily turn a success into a failure.
> Alright, I'll bite :grin: let's discuss.
> @DemiMarie:
> > C++ isn’t memory safe, though. See Chromium’s “doom zone”: parsing untrusted input in a memory-unsafe language with high privilege is a bad idea. Rust avoids this by virtue of being memory safe.
> Right. But you have to consider the whole picture and evaluate pros and cons. The fact that the library carries the burden of having the Rust compiler ecosystem as a build dep, as well as the added complexity of the C/C++ API wrapper etc. quite outweighs the benefits in a lot of contexts, for a library that (I assume) is meant to be lightweight and low-level. Especially since you can ensure very reasonable safety with recent C++ standards and clean, well-written code.
> FWIW I'd be very happy if Rust made its way into low-level system libraries and tools. Just not sure it's the time and place here and now. But, if you can make it happen, more power to you :slightly_smiling_face:
Personally, I believe it is the time and place. But if using Rust will lead to the library not being used, then C++ is a better choice.
> @Conan-Kudo:
> > C++ can be written in a similar form to Rust, it's just considerably harder if you're not using those features in the compiler (as Google does not for various reasons).
> True to an extent (as I've mentioned above), but the distinction is Rust guarantees memory safety by design, while in C++, if you adhere to a lot of rules and are very clean with your code, you have a decent certainty, but that's all.
Exactly. In Rust terms, all C++ code is `unsafe`, while only a small fraction of a typical Rust codebase will be `unsafe`.
> > ... Also, in general "memory safe" is not well-defined or a useful phrase to use here, especially when Rust crates can still have buffer overflow CVEs.
> I don't know the actual exact definition of "memory safety" we're talking about here, but I'd say Rust's design does define it, i.e. the ways in which it is memory safe (and it covers most, if not all, common memory handling issues that can occur in C/C++). The CVE is likely unfair, it is from what I can see a bug in Rust itself (or perhaps an `unsafe` code block in the crate; I didn't look into the details, but my understanding is Rust's design does prevent these errors under normal circumstances).
I believe it was due to `unsafe` code in the crate. Saying “Rust code cannot have buffer overflows” is indeed not true, and that is a counterexample. The key advantage of Rust is that safe Rust code cannot corrupt memory (modulo compiler bugs), so the amount of code that one needs to audit is far, far smaller.
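To make the "safe Rust cannot corrupt memory" point concrete, here is a small sketch of parsing untrusted bytes with no `unsafe` at all. The record layout (a 4-byte big-endian length followed by that many payload bytes) is a made-up example for illustration, not the actual RPM format. A hostile length field produces `None` rather than an out-of-bounds read.

```rust
/// Reads a big-endian u32 at `offset`, or returns None if the read
/// would run past the end of `buf` (or the offset arithmetic overflows).
pub fn read_u32_be(buf: &[u8], offset: usize) -> Option<u32> {
    let end = offset.checked_add(4)?;
    // `get` is bounds-checked: a bad range yields None, never a wild read.
    let b = buf.get(offset..end)?;
    Some(u32::from_be_bytes([b[0], b[1], b[2], b[3]]))
}

/// Returns the payload of a hypothetical record laid out as
/// [4-byte big-endian length][payload], or None if the declared
/// length does not fit inside `buf`.
pub fn read_length_prefixed(buf: &[u8]) -> Option<&[u8]> {
    let len = read_u32_be(buf, 0)? as usize;
    let end = 4usize.checked_add(len)?;
    buf.get(4..end)
}
```

Even the direct indexing `b[0]` would panic rather than read out of bounds if the slice were too short, so a logic bug here is at worst a crash or a wrong answer, not memory corruption.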
> TL;DR while theoretically Rust is the better choice, we need to consider the practical aspects. A library can be as cool and safe as it gets, but if there are practical problems with its adoption, it can very easily turn a success into a failure.
I absolutely agree. A clean C++ version that gets used is better than a Rust version that does not.
Another alternative would be to consider EverParse, a toolkit for generating formally verified parsers. That said, it has so many dependencies that even I would have a hard time considering it, unless we were just willing to vendor the generated C code.
@lukash Thank you for your extremely well-reasoned comments.
Update: The code is now public: https://github.com/QubesOS/RPM-Oxide. It is not used in production yet, but the plan is to use it to sanitize RPMs before they are installed in the QubesOS dom0.
Cool. The idea makes sense. I'd be interested in trying to ship this in rpm-ostree today, though it'd need to be something like an optional/experimental flag. We'd need to also ensure that this parser is kept up to date as RPM changes, i.e. having it work would need to be gating somewhere.
I'd also like to look at having an openpgp sanitizer in front of ostree's GPG verification. This might be something like "gpgmers".
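For a sense of what a sanitizer in front of GPG verification could check, here is a rough sketch, not the design of any existing tool. It assumes the input should be exactly one OpenPGP signature packet and relies only on the packet header layout from RFC 4880 (old and new format headers, signature tag 2). The function names are hypothetical, and rejecting indeterminate and partial body lengths is a design choice of this sketch.

```rust
/// Packet tag of an OpenPGP signature packet (RFC 4880, section 4.3).
const TAG_SIGNATURE: u8 = 2;

/// Parses a packet header, returning (tag, header length, body length).
/// Indeterminate and partial body lengths are rejected on purpose.
fn parse_packet_header(buf: &[u8]) -> Option<(u8, usize, usize)> {
    let first = *buf.first()?;
    if first & 0x80 == 0 {
        return None; // bit 7 must always be set
    }
    if first & 0x40 != 0 {
        // New format: tag in bits 5-0, variable-length length field.
        let tag = first & 0x3F;
        let o1 = *buf.get(1)? as usize;
        match o1 {
            0..=191 => Some((tag, 2, o1)),
            192..=223 => {
                let o2 = *buf.get(2)? as usize;
                Some((tag, 3, ((o1 - 192) << 8) + o2 + 192))
            }
            255 => {
                let b = buf.get(2..6)?;
                let len = u32::from_be_bytes([b[0], b[1], b[2], b[3]]);
                Some((tag, 6, len as usize))
            }
            _ => None, // 224..=254: partial body length
        }
    } else {
        // Old format: tag in bits 5-2, length type in bits 1-0.
        let tag = (first >> 2) & 0x0F;
        match first & 0x03 {
            0 => Some((tag, 2, *buf.get(1)? as usize)),
            1 => {
                let b = buf.get(1..3)?;
                Some((tag, 3, u16::from_be_bytes([b[0], b[1]]) as usize))
            }
            2 => {
                let b = buf.get(1..5)?;
                let len = u32::from_be_bytes([b[0], b[1], b[2], b[3]]);
                Some((tag, 5, len as usize))
            }
            _ => None, // 3: indeterminate length
        }
    }
}

/// True only if `buf` is exactly one signature packet with no trailing data.
pub fn is_single_signature_packet(buf: &[u8]) -> bool {
    match parse_packet_header(buf) {
        Some((tag, header_len, body_len)) => {
            tag == TAG_SIGNATURE && header_len.checked_add(body_len) == Some(buf.len())
        }
        None => false,
    }
}
```

A real sanitizer would also have to validate the signature packet body (version, algorithms, subpackets); this only shows the framing check that would run before anything is handed to GPG.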
> Cool. The idea makes sense. I'd be interested in trying to ship this in rpm-ostree today, though it'd need to be something like an optional/experimental flag. We'd need to also ensure that this parser is kept up to date as RPM changes, i.e. having it work would need to be gating somewhere.
How would you like to integrate it?
> I'd also like to look at having an openpgp sanitizer in front of ostree's GPG verification. This might be something like "gpgmers".
Would it be possible to use the RPM keyring for that?
Using Rust could significantly reduce attack surface.