ysbaddaden / execution_context


EventLoop: timeout select action is thread unsafe (IOCP/libevent) #26

Open ysbaddaden opened 1 month ago

ysbaddaden commented 1 month ago

There's a multithreading (MT) bug in the IOCP and libevent event loops in stdlib when resolving select timeouts.

IOCP checks for event.timeout? and fiber.timeout_select_action together, and enqueues the fiber otherwise, while libevent only checks for fiber.timeout_select_action and enqueues otherwise.

Neither enqueues the fiber when timeout_select_action is set but the atomic CAS fails.
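To make the three outcomes concrete, here is a minimal sketch of the libevent-style check described above. It is not the stdlib code: FakeFiber, FakeTimeoutAction, on_timeout and the returned symbols are hypothetical stand-ins, and it assumes time_expired? wraps the atomic CAS and returns false when the select was already resolved elsewhere.

```crystal
# Hypothetical stand-ins, not the stdlib definitions.
class FakeTimeoutAction
  # Assumption: 0 = select still pending, 1 = already resolved.
  @state = Atomic(Int32).new(0)

  # Returns true only for the caller that wins the CAS.
  def time_expired? : Bool
    _, won = @state.compare_and_set(0, 1)
    won
  end
end

class FakeFiber
  property timeout_select_action : FakeTimeoutAction?
  getter enqueued = 0

  def enqueue : Nil
    @enqueued += 1
  end
end

# Mirrors the branch structure described above for the timeout callback.
def on_timeout(fiber : FakeFiber) : Symbol
  if action = fiber.timeout_select_action
    # Plain read followed by a plain write: not atomic as a pair.
    fiber.timeout_select_action = nil
    if action.time_expired?
      fiber.enqueue
      :enqueued_after_cas
    else
      # The action was set but the CAS failed: the fiber is not enqueued.
      :skipped_cas_failed
    end
  else
    fiber.enqueue
    :enqueued_plain
  end
end
```

Note that the read and the reset of timeout_select_action are two separate, non-atomic steps in this sketch.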

Failing scenario

I believe this only works in current Crystal releases because a fiber is tied to a thread and each thread has its own event loop. The same holds when the channel sender/receiver fiber is resumed: the fibers execute concurrently, not in parallel, so this can't fail.

But break either of these two assumptions (as we do in ExecutionContext) and :boom:

ysbaddaden commented 1 month ago

Given that we're dealing with threads, we very likely need Fiber#timeout_select_action to become an Atomic(Channel::TimeoutAction?).
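A rough sketch of what that could look like, using hypothetical stand-ins (SketchTimeoutAction, SketchFiber, take_timeout_select_action) instead of the real Fiber and Channel::TimeoutAction. It only shows that an atomic swap lets at most one thread take the action; whether this is enough to fix the event loops is not settled here.

```crystal
# Hypothetical stand-in for Channel::TimeoutAction.
class SketchTimeoutAction
end

# Hypothetical stand-in for Fiber.
class SketchFiber
  # Assumption: the plain nilable property becomes an atomic nilable reference.
  @timeout_select_action = Atomic(SketchTimeoutAction?).new(nil)

  def timeout_select_action=(action : SketchTimeoutAction?) : Nil
    @timeout_select_action.set(action)
  end

  # Atomically reads and clears the action in one step,
  # so at most one caller receives a non-nil value.
  def take_timeout_select_action : SketchTimeoutAction?
    @timeout_select_action.swap(nil)
  end
end

fiber = SketchFiber.new
fiber.timeout_select_action = SketchTimeoutAction.new

first = fiber.take_timeout_select_action
second = fiber.take_timeout_select_action

puts first.nil?  # false
puts second.nil? # true
```

Using swap(nil) rather than a separate get followed by set(nil) is the point of the sketch: the read and the reset can no longer be interleaved by another thread.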