Azure / Guest-Configuration-Extension

Azure Guest Configuration Virtual Machine Extension for Linux
Apache License 2.0

Custom Guest configuration of VM shows as pending / inprogress #167

Open mitenmota opened 2 weeks ago

mitenmota commented 2 weeks ago

guest configuration pending

We are seeing issues on a few VMs where we have applied a custom guest configuration: its status in the VM's Configuration management pane is Pending, and the Azure Policy compliance state is Non-compliant.

Initially it was working as expected and the VM was compliant.

In the gc_agent log files under "C:\ProgramData\GuestConfig\gc_agent_logs" we can see the errors below.

[screenshot: gc_agent log showing the errors]

A few minutes earlier it was working as expected.

[screenshot: gc_agent log from a few minutes earlier]

The following workaround was found: if we add the tag mentioned in the article below, it starts working over private networking. https://learn.microsoft.com/en-us/azure/governance/machine-configuration/overview#communicate-over-private-link-in-azure

But the question is: why did the connection suddenly stop working over the public network? (No network changes have been made, and outbound traffic to the internet on port 443 is already open.)

xkaspers commented 1 week ago

We have the exact same issue. It looks like this started with extension version 1.29.75.0. Earlier versions might be affected as well, but version 1.29.71.0 and lower seem to work: we also have VMs with extension version 1.29.71.0, and those are working fine.
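To sort a fleet by the versions reported in this thread, a small helper along these lines might be useful. It is only a sketch: the two thresholds are the versions mentioned above, the labels are hypothetical, and nothing here is confirmed by the extension maintainers.

```python
from typing import Tuple

def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '1.29.75.0' into (1, 29, 75, 0) so versions compare numerically."""
    return tuple(int(part) for part in text.strip().split("."))

# Thresholds taken from the reports in this thread, not from official release notes.
LAST_REPORTED_WORKING = parse_version("1.29.71.0")
FIRST_REPORTED_FAILING = parse_version("1.29.75.0")

def classify(version: str) -> str:
    """Label an extension version according to what was observed in this thread."""
    v = parse_version(version)
    if v <= LAST_REPORTED_WORKING:
        return "reported working"
    if v >= FIRST_REPORTED_FAILING:
        return "possibly affected"
    # Versions between the two thresholds were not reported on.
    return "unknown"
```

Comparing tuples of integers avoids the string-comparison pitfall where "1.29.9.0" would sort after "1.29.71.0".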

We use Azure Firewall in the environment, and its logs show "SNI TLS extension was missing" errors for virtual machines that do not connect correctly to the guest configuration URLs.

The workaround with the tag for private networking works. After adding the tag and restarting the guest configuration service in the OS, the connection starts working immediately.
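For comparison, a client that does send the SNI extension can be sketched with Python's standard library. This says nothing about how the gc agent itself builds its TLS connections; it is only a way to check, from an affected VM, whether a handshake that includes SNI passes the firewall. The function name is hypothetical.

```python
import socket
import ssl
from typing import Optional

def tls_handshake_with_sni(host: str, port: int = 443,
                           timeout: float = 10.0) -> Optional[str]:
    """Complete a TLS handshake that sends `host` as the SNI server name.

    Returns the negotiated TLS version string, or raises OSError
    (ssl.SSLError is a subclass) if the connection or handshake fails.
    """
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as raw_sock:
        # Passing server_hostname is what puts the SNI extension in the ClientHello.
        with context.wrap_socket(raw_sock, server_hostname=host) as tls_sock:
            return tls_sock.version()
```

Calling tls_handshake_with_sni("agentserviceapi.guestconfiguration.azure.com") from an affected VM would show whether a handshake with SNI gets through; if it does while the agent still fails, that would be consistent with the agent omitting SNI, though it does not prove it.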

rbnmk commented 2 days ago

I can confirm that I have the same issue with extension version 1.29.75.0. I noticed that Azure Policy did not show the VM as compliant because the status was "NoComplianceReport". Adding the tag EnablePrivateNetworkGC:True to the VM resource and restarting the "GCService" (Restart-Service -Name GCService via Run Command) resolved the problem.

After restarting the service I reviewed the logs in C:\ProgramData\GuestConfig\gc_agent_logs, and the agent appears to detect the tag on the VM and then successfully connects to the agentserviceapi.guestconfiguration.azure.com endpoint.

I am now wondering why this is happening. We are migrating a large number of VMs and don't want to do this for each virtual machine, since it worked fine in the past.
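Until the root cause is known, the tag-and-restart workaround could be scripted across many VMs. The sketch below only prints the Azure CLI commands as a dry run and does not execute anything. It assumes the Azure CLI is what you use for this, that "az vm update --set" is an acceptable way to add the tag in your environment, and the resource group and VM names are placeholders.

```python
import shlex
from typing import List

TAG_NAME = "EnablePrivateNetworkGC"
TAG_VALUE = "True"  # value as written in this thread
RESTART_SCRIPT = "Restart-Service -Name GCService"

def workaround_commands(resource_group: str, vm_name: str) -> List[List[str]]:
    """Build the two az CLI invocations for one VM: add the tag, then restart GCService."""
    tag_cmd = [
        "az", "vm", "update",
        "--resource-group", resource_group,
        "--name", vm_name,
        "--set", f"tags.{TAG_NAME}={TAG_VALUE}",
    ]
    restart_cmd = [
        "az", "vm", "run-command", "invoke",
        "--resource-group", resource_group,
        "--name", vm_name,
        "--command-id", "RunPowerShellScript",
        "--scripts", RESTART_SCRIPT,
    ]
    return [tag_cmd, restart_cmd]

def print_dry_run(resource_group: str, vm_names: List[str]) -> None:
    """Print shell-quoted commands for review instead of running them."""
    for vm_name in vm_names:
        for cmd in workaround_commands(resource_group, vm_name):
            print(shlex.join(cmd))

# Placeholder names for illustration only.
print_dry_run("rg-example", ["vm-example-01", "vm-example-02"])
```

Printing rather than executing keeps the sketch safe to run anywhere; the output can be reviewed and then pasted into a shell that is logged in with the Azure CLI.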