grafana / pyroscope-rs

Pyroscope Profiler for Rust. Profile your Rust applications.
Apache License 2.0

fix spurious exit when epoll_wait is interrupted by a signal #125

Closed alindima closed 11 months ago

alindima commented 11 months ago

I discovered a hang in the pyroscope agent that is triggered when the Timer thread is interrupted by a signal. Instead of retrying the epoll_wait call, the Timer thread simply exits, and no further data is sent to the server.

This PR checks the epoll_wait error and retries the call if it gets EINTR.
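
For readers unfamiliar with the pattern, here is a minimal sketch of retrying on EINTR. It is not the code from this PR: `retry_on_eintr` is a hypothetical helper, and the closure stands in for the real `epoll_wait` call. It relies on the fact that Rust's standard library maps EINTR to `io::ErrorKind::Interrupted`.

```rust
use std::io;

/// Hypothetical helper (not part of pyroscope-rs): calls `op` repeatedly
/// until it returns anything other than an "interrupted" error.
fn retry_on_eintr<T>(mut op: impl FnMut() -> io::Result<T>) -> io::Result<T> {
    loop {
        match op() {
            // EINTR surfaces as ErrorKind::Interrupted; the call did not
            // really fail, so issue it again instead of giving up.
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            // Success or any other error is handed back to the caller.
            other => return other,
        }
    }
}
```

In a real timer thread the closure would wrap the raw `epoll_wait` call and turn a `-1` return value into `io::Error::last_os_error()`, so that only genuine failures end the loop.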

alindima commented 11 months ago

@omarabid @korniltsev can I get a review?

korniltsev commented 11 months ago

I may look into it this week. How do I reproduce it? Just SIGSTOP followed by SIGCONT?

korniltsev commented 11 months ago

Unrelated to this PR and the issue, but I noticed that timer_fd and epoll_fd are not closed when an error occurs in the epoll thread, so they seem to leak in that case.
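
One way such a leak could be avoided, sketched under assumptions: if the thread held its descriptors as `std::os::fd::OwnedFd` values rather than raw integers, they would be closed on drop, including on early error returns. The function `timer_loop` and its `wait` callback below are hypothetical stand-ins, not the crate's actual code.

```rust
use std::io;
use std::os::fd::OwnedFd;

/// Hypothetical stand-in for the epoll thread body. `wait` represents one
/// iteration of waiting on the timer; it returns Ok(false) to stop cleanly.
fn timer_loop(
    timer_fd: OwnedFd,
    epoll_fd: OwnedFd,
    mut wait: impl FnMut(&OwnedFd, &OwnedFd) -> io::Result<bool>,
) -> io::Result<()> {
    loop {
        // On error, `?` returns early. Both OwnedFd values are dropped at
        // that point, which closes the underlying descriptors.
        if !wait(&epoll_fd, &timer_fd)? {
            return Ok(());
        }
    }
}
```

Because ownership does the cleanup, no error path needs an explicit `close` call, and adding a new early return later cannot reintroduce the leak.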