Open jnb opened 12 years ago
I haven't seen this behavior. After the new zygote initializes, the master starts transitioning workers, and if they're still around after a while (10 mins) it sends them a SIGKILL. With very simple applications and a few clients, in my tests it generally takes a few seconds to transition workers. Can you provide a test case that we can use to reproduce this?
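To make the expected behavior concrete, here is a rough sketch of the "ask old workers to exit, then SIGKILL the stragglers after a grace period" pattern described above. This is not Zygote's actual code: the function names, the use of SIGTERM as the graceful-shutdown signal, and the polling loop are all assumptions made for illustration. Only the 10 minute grace period and the final SIGKILL come from the description above.

```python
import os
import signal
import time

GRACE_SECONDS = 10 * 60  # the "10 mins" mentioned above


def is_alive(pid):
    """Return True if a process with this pid still exists.

    Note: an exited child that has not been reaped yet (a zombie)
    still counts as existing here.
    """
    try:
        os.kill(pid, 0)  # signal 0 only checks for existence
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # exists, but is owned by another user
    return True


def _send(pid, sig, kill):
    """Send a signal, ignoring workers that have already gone away."""
    try:
        kill(pid, sig)
    except ProcessLookupError:
        pass


def transition_workers(old_pids, grace_seconds=GRACE_SECONDS, poll_seconds=1.0,
                       kill=os.kill, alive=is_alive,
                       clock=time.monotonic, sleep=time.sleep):
    """Ask old workers to exit; SIGKILL any still running after the grace period.

    Hypothetical helper. Returns the sorted list of pids that were force-killed.
    """
    remaining = set(old_pids)
    for pid in remaining:
        _send(pid, signal.SIGTERM, kill)  # assumed graceful-shutdown signal

    # Poll until every old worker is gone or the grace period runs out.
    deadline = clock() + grace_seconds
    while remaining and clock() < deadline:
        remaining = {pid for pid in remaining if alive(pid)}
        if remaining:
            sleep(poll_seconds)

    # Anything left at this point did not exit in time.
    for pid in remaining:
        _send(pid, signal.SIGKILL, kill)
    return sorted(remaining)
```

With logic along these lines, an old worker should never outlive the grace period, so an old worker that is still serving long after an upgrade would suggest that either the transition never started for it or the kill step was skipped.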
More info: this happened with the d52d561247 Zygote version. Is it possible that you've already addressed this problem?
Supplying a test case could be hard, but in any case I'll keep watching for this behavior.
Okay, please let us know if you encounter this again. I think we can keep the issue open for the moment and re-visit some time later.
John, did you see this happen again? We made a number of fixes to how we handle worker transitions, and I haven't seen this behavior since.
Just saw this again with 183c902. One old worker and one new worker were running when I ssh'ed in and looked.
[It's not clear to me whether this is a new issue; if not, then please feel free to close as a duplicate.]
I saw this last week: I pushed an upgrade to a Zygote-hosted webapp, but some Zygote instances kept on serving the old version of the webapp. I had to manually restart those Zygote instances.