Open BrucePerens opened 3 years ago
Also hangs with -Dpreview_mt
Thanks for updating this thread, Bruce.
You're welcome! At the moment, I don't know if this is:
However, the Crystal system code is still young, and it's likely to have untested bugs that will only pop up if you do something like make 50K API calls overnight.
The event loop hangs after the `epoll_wait` system call is interrupted. The Crystal code is in `HTTP::Client#get`, but I think these system calls are from `libevent`. This is what I see in `strace`:

and it just keeps doing that `epoll_wait` with an empty event structure over and over again. It looks like, because of the interrupted system call, it has dropped the event for FD 5 becoming readable. One solution might be to retry all I/Os in non-blocking mode when this happens. Or maybe it's really simple and there is a signal we can mask?

The system is Debian Testing on x86-64 Linux (GNU). This is a days-long API client run, and it runs for about 12 hours before this happens.
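For reference, here is a minimal C sketch of the "retry when interrupted" idea at the system-call level. It is not how Crystal or libevent handle this, and I don't know that it would fix this hang; `epoll_wait_retry` is a hypothetical helper name. On Linux, `epoll_wait` fails with `EINTR` when a signal handler runs during the wait, and the kernel never restarts it automatically, even for handlers installed with `SA_RESTART`.

```c
#include <errno.h>
#include <sys/epoll.h>

/*
 * Hypothetical helper (not part of Crystal or libevent): call epoll_wait
 * again whenever it fails only because a signal handler interrupted it.
 *
 * Caveat: each retry starts the full timeout_ms over again, so with a
 * finite timeout the total wait can exceed timeout_ms if signals keep
 * arriving.
 */
int epoll_wait_retry(int epfd, struct epoll_event *events,
                     int maxevents, int timeout_ms)
{
    int n;

    do {
        n = epoll_wait(epfd, events, maxevents, timeout_ms);
    } while (n == -1 && errno == EINTR);

    /* n is the number of ready descriptors, 0 on timeout,
       or -1 with errno set to something other than EINTR. */
    return n;
}
```

The signal-masking alternative would be `epoll_pwait`, which takes a signal mask that applies only for the duration of the wait. Which signal is interrupting the call here is still unknown.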
Is there anything more you would like me to do to instrument this problem?