Process wakes up every second (1000 milliseconds)

pavlix commented 6 years ago

I am now testing Python's async/await features and chose Curio for my first experiments. I am running a simple example (the echo server from the Curio docs) and checking its behavior with strace: the process wakes up every second instead of sleeping until a new client connection is available. I work with async I/O in C and know the Linux kernel and its APIs, and I'm ready to help and occasionally participate in the project.

Expected behavior:

When there is no new connection available nor new data available on an existing connection, the server should not wake up.

Actual behavior:

Server wakes up every 1000 milliseconds.

Additional notes:

The server calls epoll_wait(3, [], 2, 1000) where 1000 stands for 1000 ms of timeout.
I noticed the nonblocking socket is set using FIONBIO rather than SOCK_NONBLOCK/O_NONBLOCK. Is that intentional?

Relevant lines from strace:

socket(AF_INET, SOCK_STREAM|SOCK_CLOEXEC, IPPROTO_IP) = 6                   
ioctl(6, FIONBIO, [1])                  = 0                                 
setsockopt(6, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0                         
bind(6, {sa_family=AF_INET, sin_port=htons(2500), sin_addr=inet_addr("0.0.0.0")}, 16) = 0
listen(6, 5)                            = 0                                 
accept4(6, 0x7ffccb1b3170, [16], SOCK_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
epoll_ctl(3, EPOLL_CTL_ADD, 6, {EPOLLIN, {u32=6, u64=94704028876806}}) = 0  
epoll_wait(3, [], 2, 1000)              = 0                                 
epoll_wait(3, [], 2, 1000)              = 0                                 
epoll_wait(3, [], 2, 1000)              = 0

dabeaz commented 6 years ago

Non-blocking I/O is set on sockets using the socket.setblocking(False) method. Whatever that uses to do it is what's used.
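To illustrate the point about non-blocking mode, here is a minimal sketch (not taken from Curio) that checks the file status flags after calling the standard library's setblocking(False). It assumes a POSIX system such as Linux, where the FIONBIO ioctl and fcntl with O_NONBLOCK both end up toggling the same O_NONBLOCK flag. The helper names is_nonblocking and demo are made up for this example.

```python
import fcntl
import os
import socket


def is_nonblocking(sock):
    """Return True if O_NONBLOCK is set on the socket's file descriptor."""
    flags = fcntl.fcntl(sock.fileno(), fcntl.F_GETFL)
    return bool(flags & os.O_NONBLOCK)


def demo():
    """Report the O_NONBLOCK state before and after setblocking(False)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        before = is_nonblocking(sock)
        sock.setblocking(False)  # CPython picks the syscall; strace showed FIONBIO
        after = is_nonblocking(sock)
    return before, after


if __name__ == "__main__":
    print(demo())
```

So whichever syscall strace shows, the observable result on the descriptor should be the same.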

Regarding time, the behavior you're seeing is expected. Discussion of time and timeouts can be found in the file curio/timequeue.py

I suppose some optimizations could be made to that, but it hasn't come up.

pavlix commented 6 years ago

Thanks for the information about socket.setblocking(False) and the reference to curio/timequeue.py. I read the comments, and it makes sense to me that you don't want to sort the list of all timers, especially in timer-heavy applications.

I would eventually like to propose a patch that would fix some of the corner cases:

An application with exactly zero timers (e.g. a server currently only listening for new connections to accept) should, in my opinion, use an infinite timeout instead of one second.
I understand you do not want to compute the exact time to the closest timer. On the other hand, if the closest scheduled timer is, say, six hours away (e.g. for a DHCP client's lease renewal), I would expect a better approximation of the timeout than degrading a couple of hours to one second.
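The two corner cases above could be sketched roughly as follows. This is only an illustration of the proposal, not Curio's code: selector_timeout, deadlines, and max_sleep are hypothetical names, and the linear min() scan stands in for whatever cheaper approximation a real timer queue would keep.

```python
import time


def selector_timeout(deadlines, now=None, max_sleep=None):
    """Pick the timeout to hand to selector.select().

    deadlines: iterable of absolute monotonic deadlines (seconds).
    Returns None (block indefinitely) when there are no timers at all,
    otherwise the time until the nearest deadline, optionally capped.
    """
    deadlines = list(deadlines)
    if not deadlines:
        return None  # zero timers: sleep until I/O arrives
    if now is None:
        now = time.monotonic()
    remaining = max(0.0, min(deadlines) - now)  # never negative
    if max_sleep is not None:
        remaining = min(remaining, max_sleep)  # optional coarse tick
    return remaining


if __name__ == "__main__":
    print(selector_timeout([]))                                   # no timers
    print(selector_timeout([21700.0], now=100.0, max_sleep=3600.0))  # far-off timer
```

With max_sleep left as None, a timer six hours away produces a single long sleep; with a cap, the loop still ticks, but far less often than once per second.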

What do you think?

Note: You mention the kernel mechanism in the comments section. I haven't checked what exactly the Linux kernel would do if the timers were offloaded to the kernel using epoll and timerfd.

Dobatymo commented 6 years ago

Most async io type libraries show this behavior to be able to handle other signals like ctrl-c. Try using pure asyncio. It doesn't consume resources while idle; however, it doesn't react to keyboard interrupts by default either.

njsmith commented 6 years ago

Note: You mention the kernel mechanism in the comments section. I haven't checked what exactly the Linux kernel would do if the timers were offloaded to the kernel using epoll and timerfd.

Even if you only care about Linux, each timerfd only tracks a single timer. So unless you want to create a separate timerfd for every timeout (which you don't), you'd still need an algorithm for computing the nearest timeout so you could pass it to timerfd_settime, which is what Dave's doing with his timer wheel and its once-a-second ticks. There are some places where timerfd is super useful (e.g. it can track wall-clock alarms, and set timers to wake the system from suspend), but basic timeout tracking in an event loop isn't really one of them.

Most async io type libraries show this behavior to be able to handle other signals like ctrl-c

I'm not aware of any other async libraries that show this behavior – can you give an example? It's certainly not required to handle signals like control-C. E.g., trio doesn't wake up every second, and it still handles control-C correctly on all platforms. And I believe the asyncio issue you're referring to is a Windows-specific bug that would be easy to fix, but no one's gotten around to it.

OTOH it's a known thing for some kinds of optimized timer-wheel implementations to require periodic bookkeeping, and I don't know that 1 wakeup/second is really harming anything. Though there are data structures that claim to handle timeouts efficiently without requiring any regular wakeups, e.g. http://25thandclement.com/~william/projects/timeout.c.html

dabeaz commented 6 years ago

In the big picture, I'm not so concerned by periodic ticking. For all practical purposes, 1 second might as well be infinity on modern hardware. Perhaps if Curio was running on some device involving extreme low-power requirements, I'd reconsider, but if your primary concern was power, you wouldn't be coding in Python in the first place.

The main unknown in the current implementation is the timeout value of 1 second. Certain algorithms such as distributed consensus algorithms (Raft, Paxos), often set a lot of timeouts in the range of a few hundred milliseconds. So, maybe a tick of 100ms might make more sense.

I will take a look at what Curio does now to see if any simple optimizations can be made. For example, it probably doesn't have to tick if there are no pending timeouts.

Dobatymo commented 6 years ago

I'm not aware of any other async libraries that show this behavior – can you give an example?

Sorry, my comment was about the behavior on Windows. On Windows other libraries like twisted and gevent show the same behavior (and handle ctrl-c correctly). Tornado and asyncio don't wake up, but they also don't handle ctrl-c. So at least on Windows, this seems correlated.

EDIT: I just tested trio, it really seems to be working correctly. No idle resources and ctrl-c handling works on Windows.

njsmith commented 6 years ago

Huh, you're absolutely right:

https://github.com/twisted/twisted/blob/9a7ce38d10e28dda92ecf7174856ba59096d6b83/src/twisted/internet/selectreactor.py#L33-L36

Anyway, this is getting off-topic :-). The regular wakeups do have the effect of giving Python's built-in signal handling logic a chance to run and notice that a signal has arrived; but you can also wire things up so that an arriving signal actually triggers a wakeup, and on Python 3.5+ this is extremely easy using signal.set_wakeup_fd.

dabeaz commented 6 years ago

Looking at that bit of code (from Twisted), I suspect that Curio would want the ticking for Control-C as well. Although Control-C can be handled in Curio, it prefers to do nothing special about it unless you specifically write code to handle it.

njsmith commented 6 years ago

@dabeaz You can still call signal.set_wakeup_fd to get woken up when any signal that has a Python signal handler arrives, and then once the interpreter is awake it will run the signal handler like normal. It doesn't force you to take over the actual signal handling.
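For readers unfamiliar with signal.set_wakeup_fd, a minimal sketch of the wiring might look like this. It is not how Curio or trio does it; it assumes a POSIX system, code running in the main thread, and uses SIGUSR1 purely for demonstration. The names make_signal_wakeup and demo are hypothetical.

```python
import os
import selectors
import signal
import socket


def make_signal_wakeup(selector):
    """Register a socketpair so that an arriving signal makes select() return."""
    recv_sock, send_sock = socket.socketpair()
    recv_sock.setblocking(False)
    send_sock.setblocking(False)  # the wakeup fd must be non-blocking
    old_fd = signal.set_wakeup_fd(send_sock.fileno())
    selector.register(recv_sock, selectors.EVENT_READ, data="signal-wakeup")
    return recv_sock, send_sock, old_fd


def demo():
    """Send ourselves SIGUSR1 and show that the selector wakes up for it."""
    received = []
    # A Python-level handler must be installed for the wakeup byte to be written.
    old_handler = signal.signal(
        signal.SIGUSR1, lambda signum, frame: received.append(signum)
    )
    sel = selectors.DefaultSelector()
    recv_sock, send_sock, old_fd = make_signal_wakeup(sel)
    try:
        os.kill(os.getpid(), signal.SIGUSR1)
        # The timeout is only a safety net for the demo; None would also work.
        events = sel.select(timeout=5)
        woke = [key.data for key, _ in events]
        drained = recv_sock.recv(1024)  # one byte per signal: its number
        return woke, drained, received
    finally:
        signal.set_wakeup_fd(old_fd)
        signal.signal(signal.SIGUSR1, old_handler)
        sel.close()
        recv_sock.close()
        send_sock.close()


if __name__ == "__main__":
    print(demo())
```

The ordinary Python handler still runs as usual; the wakeup socket only ensures the loop does not stay asleep in select() while a signal is pending.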

akalsi87 commented 5 years ago

The main unknown in the current implementation is the timeout value of 1 second. Certain algorithms such as distributed consensus algorithms (Raft, Paxos), often set a lot of timeouts in the range of a few hundred milliseconds. So, maybe a tick of 100ms might make more sense.

I'm trying to create a raft implementation in Python using curio as the underpinning. Is this remark stating that multiple timeouts in the 1e2 ms range would perform worse than expected? Maybe the bucketing algorithm in timequeue could offer a customization point also.

dabeaz commented 5 years ago

A scaling factor or some other customization might not be a bad idea. It's hard to say what the performance impact of the current scheme would be on something like Raft. I haven't implemented that nor do I have any performance measurements to guide it.

The main impact of sub 1sec timeouts is going to be memory. In that time range, timeouts that never expire aren't removed from the timeout queue, but are simply abandoned and discarded when the elapsed time expires. With rapid traffic, you could end up in a situation where the timeout queue has one valid timeout and several thousand invalid timeouts.
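The memory effect described above can be sketched with a generic lazy-cancellation queue. This is an assumption-laden illustration, not the actual structure in curio/timequeue.py: LazyTimeoutQueue and its methods are invented names, and a plain heap stands in for Curio's bucketing.

```python
import heapq


class LazyTimeoutQueue:
    """Timeout queue where cancelled entries are abandoned, not removed."""

    def __init__(self):
        self._heap = []      # (deadline, task_id), may contain stale entries
        self._current = {}   # task_id -> the one deadline that is still valid

    def set(self, task_id, deadline):
        self._current[task_id] = deadline
        heapq.heappush(self._heap, (deadline, task_id))

    def cancel(self, task_id):
        # Cheap cancel: forget the deadline, leave the heap entry behind.
        self._current.pop(task_id, None)

    def expired(self, now):
        """Pop everything due by `now`; return only the still-valid task ids."""
        fired = []
        while self._heap and self._heap[0][0] <= now:
            deadline, task_id = heapq.heappop(self._heap)
            if self._current.get(task_id) == deadline:
                del self._current[task_id]
                fired.append(task_id)
        return fired

    def __len__(self):
        return len(self._heap)


if __name__ == "__main__":
    q = LazyTimeoutQueue()
    for i in range(1000):          # rapid traffic: set a timeout, then cancel it
        q.set("request", 10 + i)
        q.cancel("request")
    q.set("live", 2000)
    print(len(q))                  # stale entries plus the one valid timeout
```

In this sketch the stale entries cost memory until their deadlines pass, at which point expired() discards them without firing anything.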

imrn commented 5 years ago

one valid timeout and several thousand invalid timeouts.

Does selector wake up for all of them?

akalsi87 commented 5 years ago

I don’t think so. But those have to be processed on the “next deadline”. Customising the timeouts and the buckets can space out that cost between events.

dabeaz commented 5 years ago

My reading of the source is that the selector might indeed wake up for all invalid timeouts. I'm doing a bit more digging into it. This would obviously be a source for some optimization if so.

dabeaz / curio

Process wakes up every second (1000 milliseconds) #282