daipom closed 2 weeks ago
It would be necessary to consider exclusive locks.
For some reason, the forked process couldn't receive SIGTERM, so I gave up using fork for the tests.
It may be better to state the scope of this PR explicitly: live restart for the supervisor is out of scope, and this PR focuses on server <=> worker. https://github.com/treasure-data/serverengine?tab=readme-ov-file#live-restart
Thanks for your review! This PR allows us to restart network servers without socket downtime. I fixed the description of this PR and the commit message.
I'll rebase this.
Thanks for your review!
Another process can take over UDP/TCP sockets without downtime.
This starts a new server that shares all UDP/TCP sockets with the existing server. The old process should stop without removing the file for the socket after the new process starts.
This allows us to replace both the server and the workers with new processes without socket downtime. (The existing live restart feature does not support network servers. We can restart workers without socket downtime, but there is no such way for the network server.)
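The idea of sharing a listening socket so that a successor can keep accepting connections can be sketched with Ruby's standard library alone. This is a minimal illustration and not the code in this PR: `hand_over_listener` is a hypothetical helper, and both ends live in one process for brevity. In a real takeover the two ends would be separate processes connected through the UNIX socket file. It assumes a platform with UNIX domain sockets.

```ruby
require 'socket'

# Hypothetical helper (not part of serverengine): pass a listening socket
# over a UNIX socket pair and return the received copy.
def hand_over_listener(listener)
  sender, receiver = UNIXSocket.pair
  sender.send_io(listener)        # transfer the file descriptor
  receiver.recv_io(TCPServer)     # new descriptor for the same listening socket
ensure
  sender&.close
  receiver&.close
end

# The "old server" listens on an ephemeral port.
old_listener = TCPServer.new('127.0.0.1', 0)
port = old_listener.addr[1]

# The "new server" receives a descriptor for the same socket.
new_listener = hand_over_listener(old_listener)

# The old side stops. The port stays bound because new_listener
# still refers to the same underlying socket.
old_listener.close

# A client connecting after the old side is gone is still served.
client = TCPSocket.new('127.0.0.1', port)
conn = new_listener.accept
client.write('ping')
client.close
received = conn.read
conn.close
new_listener.close
```

Because the received descriptor refers to the same kernel socket, the port is never unbound between the old side closing and the new side accepting, which is the property the description calls "without socket downtime".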
ref: https://github.com/fluent/fluentd/issues/4622
Limitation
`address already in use` error.
TODO