Closed ericseppanen closed 3 years ago
The big obstacle here is that a bunch of these transmute
calls are against structs that are auto-generated by bindgen
, and most potential solutions would require things that bindgen can't currently do: apply derive macros (serde
or zerocopy
) and replace types (e.g. bool
-> u8
).
To repeat what I said in other places, I think we have three options:
a) Try to fix bindgen
to do the things we need
b) Post-process the bindgen
output to to the same things as (a)
c) Use the bindgen
output to give us an initial struct, which we paste and modify as needed.
Any of these would be better than what we have today (potential UB).
My opinion is that (c) looks like the least bad option:
bindgen
could do automatically."bindgen
structs right now), ability to add documentation or inherent functions in the same file, etc.I don't love or hate (b). It seems like a "split the baby" strategy that's both annoying to change and improve, but also quite restricting in that we still don't get clean source code that's easy to read, add docs/comments, add impl
s or attributes, etc.
The biggest danger of (c) would be falling out of sync with upstream postgres. But maybe we need trip-wires for those sorts of changes already?
If we had unit tests to verify that our hand-maintained structs are always in sync with the upstream postgres versions (as interpreted by bindgen), then maybe those aren't wasted effort: they help us detect upstream changes that we need to respond to anyway.
Another thoughts: structs that are deliberately non-portable (for compatibility with upstream postgres file formats or network protocols) should be clearly documented as nonportable; maybe even quarantined in a file named non_portable.rs
so we don't forget this.
I agree with what @funbringer said here.
A reduced example:
struct Foo {
x: u16,
y: u32,
}
Rust will consider those 2 padding bytes to be uninitialized memory, and anything that may observe the contents of those bytes is undefined behavior.
So If I transmute
that struct to a byte buffer, and do anything that reads that byte buffer (such as writing it to a file or a network socket), this is UB. Our current code does exactly this, so Miri is correct, and we're in trouble.
I confirmed that encode_pg_control(ControlFileData)
has UB as well. encode_pg_control()
, running under Miri:
error: Undefined Behavior: using uninitialized data, but this operation requires initialized memory
...
= help: this indicates a bug in the program: it performed an invalid operation, and caused Undefined Behavior
Is it possible to avoid garbage in struct alignment holes by initializing structure in this way:
const SIZE_OF_FOO: usize = std::mem::size_of::<Foo>();
impl Foo {
fn new(x: u16, y: u32) {
let b = [0u8; SIZE_OF_FOO];
let mut foo = unsafe { std::mem::transmute::<[u8; SIZE_OF_FOO], Foo>(b) };
foo.x = x;
foo.y = y;
foo
}
}
Is it possible to avoid garbage in struct alignment holes by initializing structure in this way
Alas, no, it breaks even in simple tests. Here's a snippet:
// rustc +nightly -C opt-level=3
#![feature(bench_black_box)]
pub struct Foo {
a: u8,
b: u32,
}
pub fn complicated_yet_naive() -> Foo {
let buffer = [0_u8; std::mem::size_of::<Foo>()];
unsafe { std::mem::transmute(buffer) }
}
pub fn return_by_mut_ref(val: &mut Foo) {
let a = std::hint::black_box(complicated_yet_naive());
*val = a; // doesn't preserve padding!!!
}
This produces the following asm:
example::complicated_yet_naive:
xor eax, eax
xor edx, edx
ret
example::return_by_mut_ref:
push rax
mov dword ptr [rsp], 0
mov byte ptr [rsp + 4], 0 ; no padding
mov rax, rsp
mov eax, dword ptr [rsp]
mov cl, byte ptr [rsp + 4]
mov dword ptr [rdi], eax
mov byte ptr [rdi + 4], cl ; looks like we've just lost our padding anyway
pop rax
ret
Proof: https://godbolt.org/z/ocEz3Txov
There's no standard or law that would force rustc to preserve paddings during moves, or copies, or whatever. Moreover, to rustc all structs are trivially copyable anyway, even if we don't derive Copy. It decides their layout, so it knows how to optimize those writes out.
If this object is copied then definitely there is no warranty that filler will be copied as well.
The thing is, we can't possibly know if it's being copied/moved or not. The compiler is free to tear, fuse and invent stores, introduce intermediate variables and do other stuff on a whim. For example, I'd expect that fn identity<T>(x: T) { x }
doesn't copy or move anything when inlined at call site.
Is it possible to avoid garbage in struct alignment holes by initializing structure in this way
Alas, no, it breaks even in simple tests. Here's a snippet:
// rustc +nightly -C opt-level=3 #![feature(bench_black_box)] pub struct Foo { a: u8, b: u32, } pub fn complicated_yet_naive() -> Foo { let buffer = [0_u8; std::mem::size_of::<Foo>()]; unsafe { std::mem::transmute(buffer) } } pub fn return_by_mut_ref(val: &mut Foo) { let a = std::hint::black_box(complicated_yet_naive()); *val = a; // doesn't preserve padding!!! }
That's different from the pattern that @knizhnik suggested. Assigning the whole struct is indeed optimized into a memcpy, which copies the padding bytes too, but assigning all the fields individually should leave them unmodified. The problem with doing this in new
is that as soon as the values leaves the function, all bets are off again. But we could use that technique just before converting it to the byte array, something like this:
pub fn encode_pg_control(controlfile: ControlFileData) -> Bytes {
let mut b = [0u8; SIZEOF_CONTROLDATA];
let src = &controlfile;
let dstptr: *mut [u8; SIZEOF_CONTROLDATA] = &mut b;
let dst: *mut ControlFileData = dstptr.cast();
unsafe {
// Assign all fields individually. That leaves the padding bytes zeroed.
(*dst).system_identifier = src.system_identifier;
(*dst).pg_control_version = src.pg_control_version;
// ...
}
// Recompute the CRC
let mut data_without_crc: [u8; OFFSETOF_CRC] = [0u8; OFFSETOF_CRC];
data_without_crc.copy_from_slice(&b[0..OFFSETOF_CRC]);
let newcrc = crc32c::crc32c(&data_without_crc);
let mut buf = BytesMut::with_capacity(PG_CONTROL_FILE_SIZE as usize);
buf.extend_from_slice(&b[0..OFFSETOF_CRC]);
buf.extend_from_slice(&newcrc.to_ne_bytes());
// Fill the rest of the control file with zeros.
buf.resize(PG_CONTROL_FILE_SIZE as usize, 0);
buf.into()
}
The problem with this is that we have to list all the fields by hand. We will have to keep the list of fields up-to-date, it's easy to miss if a field is added upstream. At least we should have an assertion that after copying each field, the struct is equal to the original one per =
. Or maybe we could write a macro to assign each field, and not have to list them by hand? That should be possible - serde does stuff like that at least - but my rust skills are not good enough to write macros.
let src = &controlfile; let dstptr: *mut [u8; SIZEOF_CONTROLDATA] = &mut b; let dst: *mut ControlFileData = dstptr.cast(); unsafe { // Assign all fields individually. That leaves the padding bytes zeroed. (*dst).system_identifier = src.system_identifier; (*dst).pg_control_version = src.pg_control_version; // ... }
(This isn't safe, BTW, because of alignment issues. That will need to be fixed if we adopt this)
I don't think it's wise to add further unsafe
tricks. This kind of code would cause many rust developers to run screaming, and we want to be able to hire those people...
Can we do this in a way that doesn't require rewriting Postgres structs in Rust by hand, and doesn't use unsafe?
Can we do this in a way that doesn't require rewriting Postgres structs in Rust by hand, and doesn't use unsafe?
Earlier I claimed the three options were:
a) Try to fix bindgen to do the things we need b) Post-process the bindgen output to to the same things as (a) c) Use the bindgen output to give us an initial struct, which we paste and modify as needed.
PR #208 implements (c). While this adds a small maintenance burden, it also has benefits (ability to add documentation, compatibility with rust-analyzer, etc.)
(a) is looking fairly intimidating. To drop the unsafe
code we would need bindgen to do the following:
derive
listbool
to u8
zerocopy
or serde
)(b) adding a post-processing step seems like an ugly workaround. The implementation would be fairly straightforward... until you get to "add explicit padding fields". Detecting implicit padding and alignment requirements seems like a fairly tricky heuristic.
I think that the only alternative to FFI is our own mapper of C structs to Rust (or may be customizing FFI - not sure how easy it can be and is it possible at all). Sorry, for been insistent, but I once again want to notice:
I am still unclear on what our future looks like, if/when one of these structs changes.
Should the rust binary only be compatible with a single version of postgres? Or are we expected to be compatible with multiple versions?
I am still unclear on what our future looks like, if/when one of these structs changes.
Should the rust binary only be compatible with a single version of postgres? Or are we expected to be compatible with multiple versions?
The page server should be compatible with multiple Postgres versions. Ideally. If it's not, we can make do, but then we'll need more logic in the management console to manage separate pools of page servers for each Postgres version and use the correct one depending on the compute node's version.
Currently, we're building our stuff over PostgreSQL v14beta and won't need to deal with other versions until about a year from now, when v15 is due. At that point I imagine we'll have postgres_14_ffi
and postgres_15_ffi
and some code to decide which one to use.
At that point I imagine we'll have postgres_14_ffi and postgres_15_ffi and some code to decide which one to use.
Are you suggesting vendor/postgres_14
and vendor/postgres_15
and running bindgen
against each?
Or can we just snapshot the output of bindgen into the postgres_14_ffi
crate? Maybe #208 is inevitable?
On 31/05/2021 20:21, Eric Seppanen wrote:
At that point I imagine we'll have postgres_14_ffi and postgres_15_ffi and some code to decide which one to use.
Are you suggesting |vendor/postgres_14| and |vendor/postgres_15| and running |bindgen| against each?
Yeah.
Or can we just snapshot the output of bindgen into the |postgres_14_ffi| crate? Maybe #208 https://github.com/zenithdb/zenith/pull/208 is inevitable?
Maybe. We would still need the full installations of each supported Postgres version to test them, though. Let's cross the bridge when we get there.
We would still need the full installations of each supported Postgres version to test them, though. Let's cross the bridge when we get there.
If only CI were affected, that would be one thing, but this would add that requirement at build time. That seems like a lot of CPU time to avoid a cut/paste of a few data structures.
We could avoid that for some builds by adding a feature to Cargo.toml for each supported version, but then we have multiple configurations to test & support.
Fortunately, most of Postgres structure we are using in Rust, are not changed during years. May be except pg_control file. It may be reasonable to handle pg_control in special way, so that single instance of pageserver can work with multiple versions of postgres.
(Not really adding any new information here, just clarifying the problem for myself)
I started to wonder why bindgen maps the Postgres 'bool' type to Rust bool, if that's not safe. PostgreSQL 'bool' is a typedef of C99 _Bool
, if _Bool is defined and sizeof(_Bool)==1
. Which is to say, on all non-exotic platforms. _Bool
is defined to have only two valid bit patterns 0x0 and 0x1, just like Rust's bool
.
The problem for us is that there is no guarantee that the input from the C side is valid, and if it's not, you get undefined behavior. Unfortunately bindgen doesn't provide any helpers for dealing with possibly invalid inputs. I think we could work around the bool problem with bindgen options, though, telling it to replace all 'bool's with 'u8'. Or by forcing bool
into unsigned char
in the C headers.
This guide calls bool a "non-robust datatype" because it has those "trap representations" or bit patterns that don't represent any valid value. Interestingly, it also lists floats as non-robust. I'm not sure why, I don't think there are any invalid bit patterns for rust floats. In principle there can be C compilers / architecture that use a non IEEE 754 compatible representation of floats, so some bit patterns might be incompatible on the C side, but we don't care about that and Postgres doesn't support such platforms anyway. And there is f32::from_bits
in the standard library, which just calls transmute, so there really shouldn't be any problem with that.
To recap, we have two slightly different problems when transmuting from bytes to rust struct, and vice versa:
When transmuting from bytes to rust datatypes, if the input contains a 'bool', the high bits of the byte representation can contain garbage. That can trigger undefined behavior in otherwise safe rust code.
When transmuting from rust struct to bytes, the padding bytes are undefined. That can leak arbitrary data from page server's memory.
The zerocopy
crate that Eric used in https://github.com/zenithdb/zenith/pull/208 performs all the safety checks we need. The question then is, can we teach bindgen to generate structs that can be used with zerocopy? There are three problems:
If we could get those options to upstream bindgen, that would be nice. If not, I guess we'll have to hand-write the structs as in PR #208. That leaves us with the problem of maintaining those hand-written copies. I hope we don't need to do this for very many postgres structs.
Why does bindgen behave the way it does? Because bindgen is mostly used only for calling functions in a C library from Rust. In that context, it's perfectly logical to assume that e.g. bool
contains a sane value.
What we are attempting (adopting the native C layout as a network or file serialization format) is not really what bindgen was meant for, because who the heck would ever want that? :)
While playing with bindgen, I realized that "explicit padding fields" plus "custom derives" is probably sufficient.
If there are bool
s in there, then the best approach would be to have some automatically-generated decoder function that returns an error if any of the bool
bytes has an invalid value. That's exactly what serde
does, so I'm starting to think that serde
is a better starting point than using zerocopy
and leaving malformed bool
fields there for someone else to deal with.
I just tried to compile zenith with rust nightly version, while working on something unrelated, and it failed in a very interesting way:
error[E0308]: mismatched types
--> postgres_ffi/src/xlog_utils.rs:335:29
|
335 | fullPageWrites: true, // TODO: get actual value of full_page_writes
| ^^^^ expected `u8`, found `bool`
For more information about this error, try `rustc --explain E0308`.
error: could not compile `postgres_ffi` due to previous error
What's going on here? The CheckPoint struct that this initializes is generated with bindgen from the Postgres sources. With nightly, it's generated as:
...
pub type bool_ = ::std::os::raw::c_uchar;
...
#[repr(C)]
#[derive(Debug, Copy, Clone)]
pub struct CheckPoint {
pub redo: XLogRecPtr,
pub ThisTimeLineID: TimeLineID,
pub PrevTimeLineID: TimeLineID,
pub fullPageWrites: bool_,
pub nextXid: FullTransactionId,
pub nextOid: Oid,
pub nextMulti: MultiXactId,
pub nextMultiOffset: MultiXactOffset,
pub oldestXid: TransactionId,
pub oldestXidDB: Oid,
pub oldestMulti: MultiXactId,
pub oldestMultiDB: Oid,
pub time: pg_time_t,
pub oldestCommitTsXid: TransactionId,
pub newestCommitTsXid: TransactionId,
pub oldestActiveXid: TransactionId,
}
Note the 'bool` as the datatype. With rust 1.52.1 toolchain, it's just plain 'bool'. Looks like the issue with 'bool' was resolved upstream? I'm not sure where and when exactly this change happened, I don't see any mention of 'bool' datatype in the bindgen source tree. But hey, I'll take it :-).
We still have the problems with padding, but this is progress.
@ericseppanen - this item was put into the done category during the milestone v0.2 planning session on 2021-07-30. I'm closing out. Please re-open if I am mistaken.
This is still unfixed; reopening.
This has been resolved in #444, since we no longer have unsafe blocks.
This is a continuation of the discussion from #6, and some related issues that came up in #201 (comment).
There is a pattern that has begun to emerge in postgres_ffi, that looks like this:
This pattern seems unwise in a few ordinary ways (manual computation of struct size, non portability due to byte order and padding, unnecessary use of
Bytes
), but it also has a critical problem:transmute
is likely to lead to UB (undefined behavior).This is not something that should be allowed. Rust UB is orders of magnitude more dangerous than UB in C or C++. The only solution is zero tolerance for UB (or any other unsoundness).
We should always be able to run our code inside Miri (which analyzes rust intermediate-representation for UB, memory unsafety, etc.) Any alarm from Miri should be treated as a critical bug.
Things that have been identified as potential UB in our code:
transmute
to struct containingbool
.transmute
to/from struct containing implicit (compiler-added) padding for alignment.transmute
to/from struct with a length that is not a multiple of its alignment (also leads to padding, when included in arrays or other structs).