timja / jenkins-gh-issues-poc-06-18


[JENKINS-46215] Restarting a Windows slave causes the master to shutdown #9307

Closed timja closed 6 years ago

timja commented 7 years ago

When we restart or connect a Windows slave, the master shuts down. This doesn't happen consistently, but now that we are restarting the slaves overnight, the master is down pretty much every morning.

The master log just says:

Aug 16, 2017 4:41:48 AM winstone.Logger logInternal
INFO: JVM is terminating. Shutting down Winstone

No exception or other error.

We have had this issue for some time; it used to happen sporadically when connecting a Windows slave.

Our Windows slaves run Windows 7 64-bit and use the Jenkins agent installed as a service. When we restart the machine, the node is not marked temporarily offline; that is, it goes offline because the slave is disconnected, then comes back online by itself. I wonder if that could be an issue?

How can I get more information on why the JVM is shutting down?


Originally reported by mviargues, imported from: Restarting a Windows slave causes the master to shutdown
  • status: Closed
  • priority: Major
  • resolution: Done
  • resolved: 2017-10-30T21:55:36+00:00
  • imported: 2022/01/10
timja commented 7 years ago

oleg_nenashev:

Do you have master and agent on the same machine?

Please provide service configuration files for both instances

timja commented 7 years ago

mviargues:

Thanks for the reply.

No they are different machines, the master being an Ubuntu server.

Which configuration files are you referring to?

timja commented 7 years ago

oleg_nenashev:

timja commented 7 years ago

mviargues:

I have uploaded the config files. We also have Mac slaves, but I suspect it's the Windows ones causing the problem. Note that I had to obfuscate job names for confidentiality reasons.

I noticed we don't have Email notifications on the Mac ones, maybe that could be related?

timja commented 7 years ago

mviargues:

It happened again this morning. This time it involved a Windows 7 VM that I use for testing on my machine: when I woke up my PC this morning (and therefore the VM too), Jenkins went offline at that exact same time.

I have disabled all email notifications so that's not the problem.

timja commented 7 years ago

oleg_nenashev:

No idea. I tried to reproduce it about a week ago, with no success.

timja commented 7 years ago

mviargues:

Shame. It happened again this morning. That's quite annoying: now I am scared that just bringing a node online will crash the system. Is there a way to get more information on why it shuts down? More logging or something?

timja commented 7 years ago

oleg_nenashev:

Let's start from the full System log. Maybe the JVM is crashing in an elegant way. Ideally you could just install the Support Core plugin and upload the entire bundle; maybe we will discover something there.
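
For a JVM that exits cleanly with no exception, another way to find out what requested the shutdown is a shutdown hook that looks for the thread sitting inside `Runtime.exit`. The following is a minimal sketch, not something taken from Jenkins itself: the class name `ExitTracer` and its methods are hypothetical, and it assumes the thread that called `System.exit` is still blocked in `Runtime.exit` while the hooks run, which is how HotSpot behaves but is not guaranteed by the API. Shutdown hooks do not run if the process is killed forcibly.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper: names are illustrative and not part of Jenkins.
class ExitTracer {

    /** Returns the threads whose stack contains a call to java.lang.Runtime.exit. */
    static List<Thread> threadsCallingExit(Map<Thread, StackTraceElement[]> traces) {
        List<Thread> callers = new ArrayList<>();
        for (Map.Entry<Thread, StackTraceElement[]> entry : traces.entrySet()) {
            for (StackTraceElement frame : entry.getValue()) {
                if (frame.getClassName().equals("java.lang.Runtime")
                        && frame.getMethodName().equals("exit")) {
                    callers.add(entry.getKey());
                    break; // one matching frame is enough for this thread
                }
            }
        }
        return callers;
    }

    /** Registers a shutdown hook that prints the stack of every thread found above. */
    static void install() {
        Runtime.getRuntime().addShutdownHook(new Thread(() -> {
            Map<Thread, StackTraceElement[]> traces = Thread.getAllStackTraces();
            List<Thread> callers = threadsCallingExit(traces);
            if (callers.isEmpty()) {
                // Assumption: no caller usually means an external signal stopped the JVM.
                System.err.println("Shutdown requested, but no thread is inside Runtime.exit");
            }
            for (Thread caller : callers) {
                System.err.println("Exit requested by thread: " + caller.getName());
                for (StackTraceElement frame : traces.get(caller)) {
                    System.err.println("    at " + frame);
                }
            }
        }, "exit-tracer"));
    }
}
```

Calling `ExitTracer.install()` once, early in the process, would be enough. The frames printed below `Runtime.exit` point at the code that asked for the exit.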

timja commented 7 years ago

mviargues:

OK, I'll do that and send it the next time it crashes.

timja commented 6 years ago

mviargues:

Well, good news: I've found the problem... quite anticlimactic though. It was due to one of our plugins, which we forked from the diskcheck plugin. The version we forked from had a System.exit(0) in the code, in a rare edge case that only occurred when the slave didn't have disk information, which was the case when the slave was connecting for the first time. Fortunately I saw the build log when it happened.
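
To illustrate the pattern, here is a minimal sketch with hypothetical names. It is not the actual plugin code, and the 50 GB figure is made up. It shows why a `System.exit(0)` running inside the master's JVM takes the whole master down without logging an exception, next to a variant that treats missing disk information as an ordinary case.

```java
import java.util.Optional;

// Hypothetical sketch: class and method names are illustrative only.
class DiskCheckSketch {

    /** Stand-in for the disk information a node reports; empty on its first connection. */
    static Optional<Long> freeDiskBytes(boolean firstConnection) {
        return firstConnection ? Optional.empty() : Optional.of(50_000_000_000L);
    }

    /** Problematic pattern: terminates the entire JVM with exit code 0 and no exception. */
    static void checkDiskUnsafe(boolean firstConnection) {
        Optional<Long> free = freeDiskBytes(firstConnection);
        if (!free.isPresent()) {
            System.exit(0); // stops the master itself, not just this check
        }
    }

    /** Safer variant: reports the missing data and lets the caller carry on. */
    static boolean checkDiskSafe(boolean firstConnection, long minimumBytes) {
        Optional<Long> free = freeDiskBytes(firstConnection);
        if (!free.isPresent()) {
            System.err.println("No disk information yet; skipping disk check");
            return true; // assumption of this sketch: unknown disk space does not block the node
        }
        return free.get() >= minimumBytes;
    }
}
```

With the safe variant, a node connecting for the first time is simply skipped by the check instead of ending the process.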

Such a waste of time... but anyway I am glad it's fixed.

Thanks for your help.