Azure / azure-linux-extensions

Linux Virtual Machine Extensions for Azure
Apache License 2.0
304 stars 253 forks source link

OMI Restarted But Not Staying Up. Will Be Restarted In The Next Iteration. #537

Open sweigand opened 6 years ago

sweigand commented 6 years ago

Had a weird issue start on March 8, 2018 on my CentOS7-based VM: WAAgent started spitting out "diagnostic.py[21347]: OMI restarted but not staying up. Will be restarted in the next iteration."

OMI shows the following:

systemd[1]: Starting OMI CIM Server... systemd[1]: PID file /var/opt/omi/run/omiserver.pid not readable (yet?) after start. systemd[1]: Started OMI CIM Server.

sweigand commented 6 years ago

I pull these lines from /var/log/azure/Microsoft.OSTCExtension.LinuxDiagnostic/2.3.9029/extension.log

2018/06/25 00:00:17 ERROR:[Microsoft.OSTCExtensions.LinuxDiagnostic-2.3.9029] OMI restarted but not staying up. Will be restarted in the next iteration. 2018/06/25 00:00:47 ERROR:[Microsoft.OSTCExtensions.LinuxDiagnostic-2.3.9029] OMI noop query failed. Output: /opt/omi/bin/omicli: result: MI_RESULT_FAILED 2018/06/25 00:00:47 ERROR:. OMI crash suspected. Restarting OMI and sending SIGHUP to mdsd after 5 seconds. 2018/06/25 00:00:47 [Microsoft.OSTCExtensions.LinuxDiagnostic-2.3.9029] RunCmd /opt/omi/bin/service_control restart 2018/06/25 00:00:47 [Microsoft.OSTCExtensions.LinuxDiagnostic-2.3.9029] Return 0: 2018/06/25 00:00:47 [Microsoft.OSTCExtensions.LinuxDiagnostic-2.3.9029] OMI restart result: 2018/06/25 00:00:52 [Microsoft.OSTCExtensions.LinuxDiagnostic-2.3.9029] RunCmd /opt/omi/bin/omicli noop 2018/06/25 00:00:52 ERROR:CalledProcessError. Error Code is 1 2018/06/25 00:00:52 ERROR:CalledProcessError. Command string was /opt/omi/bin/omicli noop 2018/06/25 00:00:52 ERROR:CalledProcessError. Command result was /opt/omi/bin/omicli: result: MI_RESULT_FAILED 2018/06/25 00:00:52 [Microsoft.OSTCExtensions.LinuxDiagnostic-2.3.9029] Return 1:/opt/omi/bin/omicli: result: MI_RESULT_FAILED

abhijeetgaiha commented 6 years ago

@sweigand LAD 2.3 is deprecated. I would suggest upgrading to LAD 3.0. Is that possible for you to do?

sweigand commented 6 years ago

@abhijeetgaiha This was installed on the VM as an extension on Azure. The name of the extension was "Microsoft.Insights.VMDiagnosticsSettings".

adriel commented 5 years ago

@sweigand Yeah you can't install LAD 3 from the Azure Portal

This is a quote from: https://github.com/Azure/azure-linux-extensions/tree/master/Diagnostic

LAD 3.0 is NOT compatible with LAD 2.3. Users of LAD 2.3 must first uninstall that extension before installing LAD 3.0.

LAD 3.0 is installed and configured via Azure CLI, Azure PowerShell cmdlets, or Azure Resource Manager templates. The Azure Portal controls installation and configuration of LAD 2.3 only. The Azure Metrics UI can display performance counters collected by either version of LAD.