pmodels / argobots

Official Argobots Repository
https://www.argobots.org

Providing an ABT_thread_sleep #29

Open mdorier opened 6 years ago

mdorier commented 6 years ago

It would be great to have an ABT_thread_sleep(double timeout_ms) function that puts the calling ABT_thread to sleep for a given amount of time (or "at least the given amount of time", since other ULTs may be running when the timeout has passed).

halimamer commented 6 years ago

Sounds interesting, so I will leave the issue open. Do you have an example where a naive implementation would cause visible inefficiency? My point is that while a naive approach, such as calling thread_yield inside a loop that breaks once the sleeping period has elapsed, might be inefficient, I can't think of a use case that would make it a performance bottleneck. If it really does cause inefficiency, we would fix it by blocking sleeping threads outside the pools and waking them up in O(1) once their time comes.
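For reference, the naive yield loop described above could look roughly like the sketch below. The names sleep_by_yielding, clock_fn and yield_fn are made up for illustration and are not part of Argobots. The clock and the yield operation are passed in as callbacks so that the loop can be exercised without the library; in a real program one would presumably pass ABT_get_wtime (which returns seconds) and a small wrapper around ABT_thread_yield().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback types, not part of Argobots. In a real program,
 * now_s could be ABT_get_wtime and yield a wrapper around ABT_thread_yield(). */
typedef double (*clock_fn)(void);
typedef void (*yield_fn)(void);

/* Sleep for at least timeout_ms by yielding until the deadline has passed.
 * Returns the number of yields, i.e. how often the caller was rescheduled
 * only to find that its sleeping period was not over yet. */
unsigned long sleep_by_yielding(double timeout_ms, clock_fn now_s, yield_fn yield)
{
    assert(now_s != NULL && yield != NULL);
    const double deadline = now_s() + timeout_ms / 1000.0;
    unsigned long yields = 0;
    while (now_s() < deadline) {
        yield();    /* give the execution stream to other ULTs */
        yields++;
    }
    return yields;
}

/* Fake clock for demonstration only: each yield advances time by 250 ms. */
static double fake_time_s = 0.0;
static double fake_now(void) { return fake_time_s; }
static void fake_yield(void) { fake_time_s += 0.25; }
```

With the fake clock, sleep_by_yielding(1000.0, fake_now, fake_yield) performs four yields before returning. The return value is only there to make the cost visible: every yield is a wakeup that did no useful work.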

mdorier commented 6 years ago

Yes, it's always possible to do an active loop, but that consumes CPU. Is there any way to do it without an active loop?

halimamer commented 6 years ago

If you yield, then that's not an active loop anymore. But I understand your concern. Waking up a thread unnecessarily just to put it back in the pool because its sleeping period is not over yet is wasteful. The way to avoid this is to put threads to sleep outside their corresponding pools, similar to how we handle mutexes and cond vars. The challenge, however, is the wakeup step. For mutexes and other synchronization objects, well-defined operations (unlock, cond_signal, ...) trigger wakeups, whereas with sleep we will need to periodically check the sleeping queue for threads to wake up. This needs some thinking as to how to implement it efficiently. My first guess is that this queue needs to be global and protected, and that the wakeup mechanism would be implemented as an event to be checked by ABTI_xstream_check_events.
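One possible shape for such a sleeping queue is a min-heap keyed by wake-up deadline. The sketch below is only an illustration under several assumptions: all names (sleepq_t, sleepq_push, sleepq_wake_expired) are invented, an int stands in for a thread handle, the capacity is fixed, and locking is left out, so a shared queue would need a mutex or spinlock around each call. The periodic check costs one comparison when no sleeper is due; each actual wakeup costs O(log n) in this variant.

```c
#include <assert.h>
#include <stddef.h>

#define SLEEPQ_CAP 64   /* hypothetical fixed capacity, keeps the sketch short */

/* One sleeping ULT: wake-up deadline plus an id standing in for a handle. */
typedef struct {
    double deadline_s;
    int    ult_id;
} sleeper_t;

/* Binary min-heap ordered by deadline; heap[0] is the next ULT to wake. */
typedef struct {
    sleeper_t heap[SLEEPQ_CAP];
    size_t    len;
} sleepq_t;

static void swap_sleepers(sleeper_t *a, sleeper_t *b)
{
    sleeper_t tmp = *a; *a = *b; *b = tmp;
}

/* Register a sleeper. Returns 0 on success, -1 if the queue is full. */
int sleepq_push(sleepq_t *q, int ult_id, double deadline_s)
{
    assert(q != NULL);
    if (q->len == SLEEPQ_CAP) return -1;
    size_t i = q->len++;
    q->heap[i].ult_id = ult_id;
    q->heap[i].deadline_s = deadline_s;
    while (i > 0) {                         /* sift up */
        size_t parent = (i - 1) / 2;
        if (q->heap[parent].deadline_s <= q->heap[i].deadline_s) break;
        swap_sleepers(&q->heap[parent], &q->heap[i]);
        i = parent;
    }
    return 0;
}

/* Remove and return the earliest sleeper; the caller ensures q->len > 0. */
static sleeper_t sleepq_pop_min(sleepq_t *q)
{
    sleeper_t min = q->heap[0];
    q->heap[0] = q->heap[--q->len];
    size_t i = 0;
    for (;;) {                              /* sift down */
        size_t l = 2 * i + 1, r = l + 1, s = i;
        if (l < q->len && q->heap[l].deadline_s < q->heap[s].deadline_s) s = l;
        if (r < q->len && q->heap[r].deadline_s < q->heap[s].deadline_s) s = r;
        if (s == i) break;
        swap_sleepers(&q->heap[s], &q->heap[i]);
        i = s;
    }
    return min;
}

/* Periodic check: store the ids of all sleepers whose deadline has passed
 * in out_ids (at most out_cap entries) and return how many were woken. */
size_t sleepq_wake_expired(sleepq_t *q, double now_s, int *out_ids, size_t out_cap)
{
    size_t n = 0;
    while (q->len > 0 && n < out_cap && q->heap[0].deadline_s <= now_s)
        out_ids[n++] = sleepq_pop_min(q).ult_id;
    return n;
}
```

In this sketch a scheduler-side event check would call sleepq_wake_expired with the current time and push the returned ULTs back into their pools. How that would hook into ABTI_xstream_check_events is not shown here.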

marcvef commented 5 years ago

I currently encounter this issue as well and I also ended up using yield with a loop. It would be great to have an ABT sleep function.

I was wondering, though, when the yield loop could become an issue. Let's say an ABT_pool is served by only one execution stream and there is a large number of ULTs in that pool running the yield loop, plus a small number of ULTs that are not. Wouldn't this mean that the scheduler is more likely to pick another ULT which is just going to yield again, and therefore delay the ULTs that are not in the yield loop? I guess my question is whether the scheduler picks ULTs randomly on yield.

dorier commented 5 years ago

The scheduler picks the ULT returned by the ABT_pool_pop() call that it makes on its own pool. So, taking a FIFO pool implementation, when the ULT in the active loop yields, it is pushed back to the end of the FIFO queue and is only scheduled again once the other ULTs in the pool have been given a chance to execute.

I think you're right that if many ULTs are in an active yield loop simulating a sleep, then the scheduler will go through potentially many of them before finding a ULT that can actually do useful work.
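To put a number on that effect, here is a small stand-alone simulation of a single execution stream over a FIFO pool. It does not use Argobots at all: fifo_pool_t, pool_push, pool_pop and wasted_wakeups are hypothetical names, and the model assumes that every ULT yields each time it is scheduled and that no sleeper's deadline expires during the run.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_CAP  1024   /* hypothetical capacity for the simulation */
#define WORKER_ID (-1)   /* marks the one ULT that has real work to do */

/* Ring-buffer FIFO of ULT ids; ids >= 0 are yield-loop sleepers. */
typedef struct {
    int    items[POOL_CAP];
    size_t head, len;
} fifo_pool_t;

static void pool_push(fifo_pool_t *p, int id)
{
    assert(p->len < POOL_CAP);
    p->items[(p->head + p->len) % POOL_CAP] = id;
    p->len++;
}

static int pool_pop(fifo_pool_t *p)
{
    assert(p->len > 0);
    int id = p->items[p->head];
    p->head = (p->head + 1) % POOL_CAP;
    p->len--;
    return id;
}

/* Fill a FIFO pool with n_sleepers yield-loop ULTs followed by one worker,
 * then schedule from it until the worker has run worker_turns times.
 * Every popped ULT yields and is pushed back at the tail. Returns how many
 * times a sleeper was scheduled only to yield again. */
unsigned long wasted_wakeups(size_t n_sleepers, unsigned worker_turns)
{
    assert(n_sleepers < POOL_CAP);
    fifo_pool_t pool = { .head = 0, .len = 0 };
    for (size_t i = 0; i < n_sleepers; i++) pool_push(&pool, (int)i);
    pool_push(&pool, WORKER_ID);

    unsigned long wasted = 0;
    unsigned turns = 0;
    while (turns < worker_turns) {
        int id = pool_pop(&pool);
        if (id == WORKER_ID) turns++;    /* useful work */
        else wasted++;                   /* sleeper woke up just to yield */
        pool_push(&pool, id);            /* yield: back to the tail */
    }
    return wasted;
}
```

Under these assumptions, wasted_wakeups(100, 3) is 300: the worker gets one turn per full pass over the pool, so each of its turns is preceded by one wakeup per sleeping ULT.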