rust-lang / libc

Raw bindings to platform APIs for Rust
https://docs.rs/libc
Apache License 2.0
2.12k stars 1.04k forks source link

Provide a way to const-construct a `sem_t` #2015

Open Diggsey opened 3 years ago

Diggsey commented 3 years ago

(mem::zeroed() is never going to be const, and MaybeUninit::zeroed().assume_init() is still not const)

djarek commented 3 years ago

POSIX allows semaphores to be implemented using file descriptors. Unless I'm missing something, this means that there's no portable way of doing const initialization of sem_t (even in C and C++).

Diggsey commented 3 years ago

You would still need to call sem_init before you start using the semaphore. This is just to allow declaring a static variable of the appropriate type.

djarek commented 3 years ago

What's wrong with this?

static mut sem: MaybeUninit<libc::sem_t> = MaybeUninit::uninit();

And to initialize it you use:

let ret = unsafe { libc::sem_init(sem.as_mut_ptr(), 0, 0) };

Obviously, it's not very nice to use, but I don't think that anything better can be provided, because sem_t doesn't seem to have any (public) sentinel values. So any value is garbage unless it came from sem_init call.

Diggsey commented 3 years ago

That is UB as soon as you have more than one thread accessing sem because as_mut_ptr takes a mutable borrow of self. To get this to work you actually need:

static mut sem: UnsafeCell<MaybeUninit<libc::sem_t>> = UnsafeCell::new(MaybeUninit::uninit());

(Note: static mut is still needed because the target is not Sync)

Furthermore, you cannot use the methods on MaybeUninit to get at the inner value at all, because they again take &mut self, so you have to do pointer casts or transmutes from *mut MaybeUninit<libc::sem_t> to *mut libc::sem_t on every access to the semaphore.

I don't think I need to explain why this is terrible...

maxbla commented 3 years ago

@Diggsey It sounds like you actually want a static sem_t, not a const sem_t (you don't need one for the other). If so, this is the domain of lazy_static or once_cell. Specifically,

lazy_static! {
    static ref SEMAPHORE: libc::sem_t = {
        let mut semaphore: MaybeUninit<libc::sem_t> = MaybeUninit::uninit();
        libc::sem_init(semaphore.as_mut_ptr(), /*other args*/);
        unsafe {semaphore.assume_init()}
    };
}

std::mem::zeroed() might be a better option than MaybeUninit, since you could avoid the unsafe.

Hope that helps.

Diggsey commented 3 years ago

@maxbla No, that is just a hacky workaround that won't suffice for my use-case (accessing the semaphore from a signal handler).

Only certain operations on a sem_t are guaranteed to be async-signal-safe, crates like lazy_static and once_cell are not guaranteed to be async-signal-safe, and so should not be used from a signal handler.

There's no reason why sem_t cannot be const-constructible (it is in C/C++).

maxbla commented 3 years ago

Yeah, rust doesn't generally play nicely with async signal safety. Luckily, transmute is const, so I think you can still accomplish what you're after without any additions to libc.

// AlignedSem has the same alignment and size as a sem_t
#[repr(C)]
union AlignedSem {
    array: [u8; size_of::<libc::sem_t>()],
    sem: libc::sem_t,
}

// particularly unsafe - subverts detection of use-before-initialize bugs
static mut MUTEX: libc::sem_t = unsafe {
    std::mem::transmute(AlignedSem{array: [0u8; size_of::<libc::sem_t>()]})
};

fn main() {
    unsafe {libc::sem_init(&mut MUTEX as *mut _, 0, 1)};
    // register signal handlers here
}

Rust really makes this hard - the need for the AlignedSem type is unfortunate.

Diggsey commented 3 years ago

unsafe {libc::seminit(&mut MUTEX as *mut , 0, 1)};

As mentioned, creating a mutable borrow to the static variable is UB, since it can be accessed by multiple threads.

maxbla commented 3 years ago

You would still need to call sem_init before you start using the semaphore. This is just to allow declaring a static variable of the appropriate type.

I did this

mem::zeroed() is never going to be const, and MaybeUninit::zeroed().assume_init() is still not const

I made a const, zeroed sem_t in the static

As mentioned, creating a mutable borrow to the static variable is UB, since it can be accessed by multiple threads.

As I constructed it, that mutable borrow was the first line in main. As long as you call sem_init() before any threads are spawned, UB is not possible. Hey, while you're at the start of main, make a *mut libc::sem_t that you can alias instead of &muts (which is not UB).

let mutex_ptr = unsafe {&mut MUTEX as *mut _};
// using mutex_ptr might alias *mut, but that is allowed
let res = unsafe{libc::sem_init(mutex_ptr, 0, 1)};

sem_post() is async signal safe, so you can safely clone mutex_ptr into multiple threads and sem_post() it. I don't see that being too useful, so I have a feeling you want to to sem_getvalue, which isn't async signal safe.

Alternatively, use a thread_local! to avoid aliasing except by signal handlers.

It sounds like everything you asked for is possible, just the specific examples of code given to you were not suitable for your use case. If you either wrote a minimal example of what you're doing in C or provided an example of an api libc could add that would do what you need, this issue would be actionable.

Diggsey commented 3 years ago

@maxbla there is a workaround, which I presented here: https://github.com/rust-lang/libc/issues/2015#issuecomment-753554320

The idea of storing the *mut sem_t in a separate static mut is one I had not thought of - that would also work, and would eliminate the need for an UnsafeCell at least.

That said, the fact that these workarounds are so obscure, and that it's so easy to accidentally get UB, is reason enough to add a const-constructor to sem_t.

Furthermore, libc is supposed to provide the same functionality as is available in C/C++ and in those languages the sem_t type is const-constructible.

maxbla commented 3 years ago

The idea of storing the *mut sem_t in a separate static mut is one I had not thought of - that would also work, and would eliminate the need for an UnsafeCell at least.

mutex_ptr can't be static since it isn't Sync. The best way is probably to use a static UnsafeCell<libc::sem_t> with the silly AlignedSem/transmute trick I used. For anyone reading just this comment, the final result is

// AlignedSem has the same alignment and size as a sem_t
#[repr(C)]
union AlignedSem {
    array: [u8; size_of::<libc::sem_t>()],
    sem: libc::sem_t,
}

// SAFETY: see std::mem::zeroed()
static mut MUTEX: UnsafeCell<libc::sem_t> = unsafe {
    let zeroed_sem = std::mem::transmute(AlignedSem{array: [0u8; size_of::<libc::sem_t>()]});
    UnsafeCell::new(zeroed_sem)
};

fn main() {
    unsafe {libc::sem_init(MUTEX.get(), 0, 1)};
    // register signal handlers here
}

libc is supposed to provide the same functionality as is available in C/C++ and in those languages the sem_t type is const-constructible.

libc is supposed to be a low-level binding to system types, constants, and functions. The fact that it provides the same functionality is merely a consequence of this. For example, C doesn't have a function to statically zero memory, it has a language feature - that all global and static variables are automatically initialized to zero, but also {0}, as in sem_t semaphore = {0}. By this criterion, libc should not include such a feature (but unsafe Rust should).

the fact that these workarounds are so obscure, and that it's so easy to accidentally get UB, is reason enough to add a const-constructor to sem_t

If you're writing signal handlers (in Rust or C), you need some obscure knowledge. Hopefully std::MaybeUninit::zeroed becomes const soon and the required knowledge becomes less obscure. For now, no other type in this crate has its own special special const constructor and I don't see what makes sem_t special. This seems more like the territory of nix.

Diggsey commented 3 years ago

mutex_ptr can't be static since it isn't Sync.

Luckily static mut doesn't need to be sync, and I would only need read-access to the pointer from the signal handler :)

For now, no other type in this crate has its own special special const constructor and I don't see what makes sem_t special

True, but sometimes language limitations require library-level solutions. TBH, I would be happy if all types defined by libc had a const fn new method.

maxbla commented 3 years ago

I think this should be closed in favor of https://github.com/rust-lang/rust/issues/62061

DoumanAsh commented 3 years ago

Some bit of self-ad. I have crate to wrap platform APIs for semaphores https://github.com/DoumanAsh/semka It provides const initialization as unsafe function and requires you to perform init when you need to use it.

sweihub commented 2 years ago

@DoumanAsh Thanks for your self-ad semaphore, however it's a local one not inter-process, will you add an inter-process one?

RalfJung commented 8 months ago

mem::zeroed() is never going to be const

In fact it is const since Rust 1.75. :)