vedderb / bldc

The VESC motor control firmware
2.25k stars 1.38k forks source link

Add one more semaphore-related function. #745

Open lolwheel opened 4 months ago

lolwheel commented 4 months ago

Add sem_drain_and_wait_timeout and time_to_ticks functions to lispBM.

The new sem_drain_and_wait_timeout effectively achieves binary semaphore semantics without relying on ch_binary_semaphore type. It's used in Float package in the other PR that I just sent out.

time_to_ticks is necessary to pass the correct timeout values to sem_wait_to and sem_drain_and_wait_timeout

vedderb commented 4 months ago

Doing a blocking wait inside a syslock sounds a bit dangerous to me. The whole thing looks a bit unsafe in general. Are you sure that you need a binary semaphore? If your control thread blocks somehow while the imu keeps giving samples something strange is going on...

lolwheel commented 4 months ago

Hey Benjamin, thanks for fast turnaround:

Doing a blocking wait inside a syslock sounds a bit dangerous to me.

Nothing dangerous here, this is exactly how chSemWaitTimeout (no S suffix) is implemented itself: https://github.com/vedderb/bldc/blob/68094e9cb7e43e4ed9f2866443f7f60366f321ee/ChibiOS_3.0.5/os/rt/src/chsem.c#L238-L240

Are you sure that you need a binary semaphore? If your control thread blocks somehow while the imu keeps giving samples something strange is going on...

Agreed with the sentiment but there is no reason not to be defensive, binary semaphore semantics will strictly make Float package more resilient.

In practice I've only seen 0 or 1 overruns total. 1 overrun I've seen was happening during start of the Float package because the callback is registered before the Float pkg thread is started.

vedderb commented 4 months ago

Agreed with the sentiment but there is no reason not to be defensive, binary semaphore semantics will strictly make Float package more resilient.

More resilient against what? To me it looks like the new suspicious function is vastly more likely to cause trouble than a possible overrun (an overrun that can't happen for no reason btw).

In practice I've only seen 0 or 1 overruns total. 1 overrun I've seen was happening during start of the Float package because the callback is registered before the Float pkg thread is started.

What about starting the thread first and then registering the callback? The thread can even start by waiting for the semaphore, so that nothing is done before new data is available if you care about the very first sample.

lolwheel commented 4 months ago

More resilient against what?

I mean it's threading, it's really hard to reason about. As an example there might be another high priority thread that hogs the CPU and because of this the Float thread doesn't get to run in time to process the IMU data. It might be the case that there are no threads like that currently in the codebase. Yet having this functionality future-proofs things.

Another thing to consider is that having a tool that semantically matches your needs is important. The need I have is for one thread to raise a flag and another one to react to the presence of this flag. It doesn't matter to me how many times the flag is raised. Reacting more than once to this flag is wrong. This is semantics of Condition variables but implemented with a semaphore.

In the end you're the owner so I'll follow your opinions on this but I kindly ask you to consider the arguments above.

To me it looks like the new suspicious function is vastly more likely to cause trouble than a possible overrun (an overrun that can't happen for no reason btw)

I'd be happy to address any specific concerns you have, if any. I'm not sure if it's possible to write unit tests against things such as locking but I could give it a shot if you can point me to an example.

What about starting the thread first and then registering the callback? The thread can even start by waiting for the semaphore, so that nothing is done before new data is available if you care about the very first sample.

Yes totally, that is the way to go I think.