tcdi / plrust

A Rust procedural language handler for PostgreSQL
PostgreSQL License
1.09k stars 32 forks source link

Being stuck on Rust 1.61.0 is unacceptable #127

Closed workingjubilee closed 1 year ago

workingjubilee commented 1 year ago

We are rapidly approaching (really, have passed already several times over) the point where it does not make sense to continue pinning a version, even for the convenience of anyone's security processes.

  1. rustc 1.61.0 is not supported from a security perspective, as it is not the latest version. If there are any showstopping security bugs, the Rust Project will not fix them. So there will never be a rustc 1.61.1. The way the Rust Project tends to spell "security patch" is "next train" and thus the patch release for rustc 1.61.0 is actually 1.62.0.
  2. By not upgrading, we are passing on critical fixes to rustc's handling of the aarch64 call ABI, and are essentially hoping and praying that, if deployed on aarch64, we don't hit it.
  3. Some versions of rustc introduce relatively major features and programming conveniences to stable usage. 1.65 is particularly notable for introducing the initial version of generic associated types. These are very useful for complex ownership lifetime situations, the classic one being an iterator that yields elements with a generic lifetime, instead of all the elements being bound by a single lifetime decided by the iterator. While a somewhat abstract-sounding feature, complex lifetime situations unfortunately happen all the time in FFI because, by definition, you have entities that can have "two owners" conceptually and are thus bound by two lifetimes. This tends to result in situations that, in current-day Rust, are often solved by simply brute-forcing something using unsafe code, instead of modeling the actual lifetimes involved. By holding back the version pinned for PL/Rust, this is causing upstream annoyance in PGX. I had hoped to be able to use these to handle some issues with generating the SQL entity graph introduced by the new approach which would allow removing various special cases. It also probably limits the soundness of improving allocations.
  4. Same as 2, but more specifically with core::ffi::CStr. It allows removing dependencies (core_cstr), so it reduces the amount of code that anyone has to worry about in the supply chain, but by pinning to 1.61 we are losing that opportunity.
  5. Same as 2, but more specifically with std::thread::scope. It allows removing a large dependency in the build system for pgx-pg-sys, the rayon dependency which brings in about a dozen other crates, so it significantly reduces the amount of code that anyone has to worry about in the supply chain, but by pinning to 1.61 we are losing that opportunity.
  6. tcdi/plrust#118 will not actually be fully solved with anything less than rustc 1.63.0, as #[link_section] is not independently detected as unsafe until then, only #[no_mangle] but that is, again, still an unsupported version in security terms, so realistically, moving from an unsupported version to an unsupported version is not really acceptable.
  7. Compile times have been raised as a concern, and by pinning we are passing on significant compile time improvements. If it takes 24 hours to compile a "hello world" in PL/Rust on a direly under-provisioned machine, then that's still from 43 minutes to 2 hours and 24 minutes in savings. And due to the issues like needing to build rayon, the savings may be larger.
  8. It's just tedious to work on fixing things upstream for PGX and then have to go to PL/Rust and reinstall cargo pgx for the specific pinned rustc version. These pointless rebuilds slow down every single patch I might write for PL/Rust. It can easily consume an hour or more on pure faffing about when switching back and forth, partly assisted by sometimes forgetting and only hitting the secondary issues caused by not doing so, thus having to debug it, and only then remembering to rebuild and reinstall, etc. This is already a test suite that takes too long to run and debug, this makes it even more unconscionable.
  9. It is very likely we are going to find another issue like tcdi/plrust#118, or want to try to actively find others that use similar ideas but don't rely on that specific attribute, that are not yet covered by rustc. That may warrant me or someone else studying the unsafe detection upstream in rustc and submitting patches for what we find, and that would only come to stable ~2 versions after it merges.
  10. It makes it impossible to develop a process for smoothly upgrading postgrestd to new Rust versions if we are not actually allowed to upgrade to new Rust versions.
  11. I can't even imagine what sorts of things we are giving up, honestly, by not having access to std::backtrace::Backtrace. It's intended to be a key component of newer, more sophisticated and better-scoped error-handling in Rust. Even if it's mostly going to be a diagnostic improvement, it's a lot to not have.
  12. This last point is a stand-in for all the concerns that I have quietly sublimated instead of actively thinking about because this has been presented as a non-negotiable requirement.
workingjubilee commented 1 year ago
  1. It's becoming really hard to avoid using more recent features unintentionally while improving the unsafe code in pgx, because a lot of the work that has been done to make our pointer abstractions better has been entirely recent. And by "recent", I mean "it was stabilized 6 months ago". While working to make pgx::datum::Array<'a, T> safer, more correct, and eventually faster, I accidentally used a library feature stabilized in Rust 1.63[^1], to which I say: lmao.
  2. Our test framework was improved using a const fn stabilization in Rust 1.63, that's https://github.com/tcdi/pgx/pull/833, which uses Mutex::new() becoming stable in const fn to avoid having to use a Lazy (from another dependency! once_cell) just to initialize with Mutex::new(Vec::new()) (which is just a Mutex around a nullptr and two 0s).
  3. In order to circumvent these two recent issues, I have pinned the dependencies for building PL/Rust, which is considered by many esteemed maintainers to be a security malpractice. This is exacerbated in Rust as Rust's usage of static linking inhibits simply replacing the underlying components, thus the only way to assure things build with security issues resolved is to keep moving support forward. The way we are using a rust-toolchain.toml is also, effectively, pinning dependencies.

[^1]: Per the tracking issue for #![feature(slice_ptr_len)], it's almost impossible to get the len of a pointer to a slice correctly without NonNull<[T]>::len() so I didn't have many options here.