Open mmasque opened 2 months ago
Thank you for reporting an issue!
What is the stack trace? Signal code?
In my opinion, if it works somewhere else, when only the architecture is changed, no segfaults should happen as well and no different behaviour should be introduced. This is an API with a well-defined interface and behaviour.
Regarding the different thread ids, regardless of the id, the implementation shouldn't do anything else except for returning an error code like -EINVAL
.
The reason I am telling all of this is that I suspect a bug might be in the pthread/kernel code, which, for some reason, has way too different branches for these two architectures.
And may I ask where did you find the information about passing 0
as the thread id when we want to set the deadline policy? Also, in your example you are setting SCHED_FIFO
...
Here's what I get from gdb. Let me know how I can provide more detail, I am not experienced in C.
(gdb) run
Starting program: /tmp/test-thread-priority
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Program received signal SIGSEGV, Segmentation fault.
0x0000ffff9a3b7c08 in pthread_setschedparam () from /lib64/libc.so.6
(gdb) backtrace full
#0 0x0000ffff9a3b7c08 in pthread_setschedparam () from /lib64/libc.so.6
No symbol table info available.
#1 0x0000aaaab1997a34 in thread_priority::unix::set_thread_priority_and_policy (native=0, priority=...,
policy=...) at thread-priority/src/unix.rs:589
params = libc::unix::linux_like::sched_param {sched_priority: 1}
fixed_priority = 1
#2 0x0000aaaab199722c in test_thread_priority::main () at src/main.rs:11
No locals.
Regarding setting thread_id = 0
: I took it from this test https://github.com/iddm/thread-priority/blob/1135f2ec40d890884b29486d3b15b567eca44a49/src/unix.rs#L879. Indeed, setting the thread id to 0 works when using a deadline policy, but breaks (with segfault on my arm device, and with an OS Error 3 on my x86 device) when using a Fifo policy. Conversely, using thread_native_id()
works for Fifo
but fails for Deadline
policy with Os Error 3. But perhaps this should be a separate issue/discussion.
There is an assertion that it is valid, meaning it is positive in this case. Where did you get the information that it should be 0
? Are you using glibc or musl or any other version of libc on your systems? And what are their versions?
Regarding the thread id being set to 0: the test linked in the previous reply sets the thread id to zero with the comment //current thread
. The documentation of sched_setattr
, the syscall used for deadline policy, says that if pid equals zero, the scheduling policy and attributes of the calling thread will be set. I do not know if setting thread id to 0
is also meant to work for pthread_setschedparam
called by Fifo
, but it should not segfault, it should return ESRCH according to its documentation.
ldd (Buildroot) 2.36
is what I get out of ldd --version
.
Regarding the thread id being set to 0: the test linked in the previous reply sets the thread id to zero with the comment
//current thread
. The documentation ofsched_setattr
, the syscall used for deadline policy, says that if pid equals zero, the scheduling policy and attributes of the calling thread will be set. I do not know if setting thread id to0
is also meant to work forpthread_setschedparam
called byFifo
, but it should not segfault, it should return ESRCH according to its documentation.
ldd (Buildroot) 2.36
is what I get out ofldd --version
.
Indeed! But I am, unfortunately, not responsible for the logic in the other libraries (in the context of this project). I might take a look at the implementation of your libc later, but cannot guarantee. The code you posted should never lead to a crash, and it doesn't lead to it in the Rust part, so the crate doesn't do anything wrong here, I think. Do you think it does?
Thanks for the documentation link about "zero", I somehow overlooked it.
I agree about the seg fault not being the responsibility of this crate indeed. Perhaps, since the value of thread_id is handled differently depending on which Policy is passed, this should be made explicit in the ThreadId struct or documentation?
Okay, I see where your confusion comes from.
For the deadline policy we use the sched_setattr
, not pthread_setschedparam
. For sched_setattr
having 0
as thread id is legal, also, please, note that the type of the thread is not pthread_t
but pid_t
.
So, if I may, you read the documentation page of one thing and applied its logic to another, completely different function with different arguments and types. :-)
Perhaps, since the value of thread_id is handled differently depending on which Policy is passed, this should be made explicit in the ThreadId struct or documentation?
Yes, we can have special checks for the policy and the passed thread id, as the code branches there for making either a syscall or a call to the pthread
library.
The following example code yields a segmentation fault on an aarch64 Linux machine. I confirmed that the segfault occurs on the
unsafe
execution oflibc::pthread_setschedparam
. The code succeeds on an x86 machine.Note also that
set_thread_priority_and_policy
has varying requirements for its ThreadId, which depend on the type of policy passed: for setting deadline on the current thread, 0 should be used as the id, whilethread_native_id()
should be used when setting a thread to the deadline policy. This should be documented or changed to avoid confusion, in my opinion.