Closed Tornhoof closed 2 years ago
Thanks for the feedback! We are routing this to the appropriate team for follow-up. cc @Drewm3, @avirishuv.
Author: | Tornhoof |
---|---|
Assignees: | - |
Labels: | `Compute - VM`, `Service Attention`, `customer-reported`, `question` |
Milestone: | - |
Compute team, please help to look into this question.
@avirishuv can you please take a look at this issue?
Investigating on the API side as well as on the PS side; will provide an update. cc @bilaakpan-ms
hi @Tornhoof update on this issue: there is a bug we have identified on the server side which is likely the reason you are seeing the issue. The fix is currently in the backlog and we'll keep you posted about its progress/rollout.
quick update: the fix is still in the backlog and planned to be worked on over the next few weeks. Apologies for the delay here.
Hi @avirishuv. I guess this does not influence only `Restart-AzVM` but also other commands like `Invoke-AzVMRunCommand`. I am seeing the same issue as @Tornhoof described and it is close to impossible to do some automation with Azure VMs. Do you have some ETA for this to be fixed?
@petermicuch thanks for reporting, let me check and get back to you.
@avirishuv this might have been my mistake. I had a bug in my script. After I fixed it, `Invoke-AzVMRunCommand` seems to work fine. Restarting the machine now also works fine in my case... I started to capture the return value from `Restart-AzVM`, but that should hopefully not have any influence on the behavior.
thanks for the update @petermicuch , good to know that the Invoke-AzVMRunCommand is working fine.
@avirishuv is there any workaround to find out that the VM has finished restarting until we get a fix? It has happened to me many times that Restart-AzVM returned but the machine had not restarted at all yet. So when I execute the next command with Invoke-AzVMRunCommand, that one gets interrupted. E.g. in my case I am installing domain services on the VM and after the restart I should see a different FQDN, but I do not, since the restart did not happen even though Restart-AzVM returned.
@petermicuch checking for the power state of the VM before you trigger the Invoke-AzVMRunCommand could be one possible workaround. If the VM has reached "Running" state then restart has completed, if not wait for some more time before re-checking.
@avirishuv actually that does not work, since VM is in state running and only restarts later as also @Tornhoof described. Since in this special case I installed domain controller, I just check if the FQDN is full domain name. By that I know that restart has completed, because before that, FQDN does not contain domain yet. Other possible solution would be to have script running on startup, that sets some environment variable always and you just reset it before restart. But the best would be to have final fix from you :-).
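A minimal sketch of the marker-based workaround described above, written in Python purely for illustration. Everything here is an assumption rather than part of the Az module: `wait_for_reboot_marker`, `probe` and `is_done` are hypothetical names. `probe` stands for whatever reads the marker from the VM (for example, a wrapper that runs Invoke-AzVMRunCommand and returns the FQDN), and `is_done` decides whether that value proves the restart has happened.

```python
import time
from typing import Callable, Optional


def wait_for_reboot_marker(
    probe: Callable[[], Optional[str]],
    is_done: Callable[[str], bool],
    timeout_s: float = 900.0,
    interval_s: float = 15.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll a caller-supplied marker until it shows the reboot has completed.

    Returns True once is_done(probe()) holds, False if the timeout expires.
    """
    deadline = clock() + timeout_s
    while True:
        try:
            value = probe()
        except Exception:
            # The VM is likely unreachable mid-reboot; treat as "not done yet".
            value = None
        if value is not None and is_done(value):
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

For the domain controller case, `is_done` could be something like `lambda fqdn: fqdn.endswith(".corp.example")`, where the domain suffix is a placeholder. Probe failures are swallowed on purpose, since a command that gets interrupted mid-reboot is expected rather than fatal.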
@avirishuv any update on rolling out a fix? this is a serious show stopper, we've experienced too many Restart-AzVM operations where the VM actually performs restart only 1h later!
Joining @petermicuch's question above: is there any workaround to find out that the VM has finished restarting until we get a fix?
quick note: the fix for this is currently in progress, I'll provide an update here once it's rolled out.
@avirishuv Has the fix been rolled out?
hi @Viajaz fix rollout is in progress, I'll confirm once it has completed roll out to all regions.
Following issue...
Following issue
Has the fix been rolled out? We are still seeing some issues with the command.
Still appears to be an issue. @Viajaz, @avirishuv , any update?
The fix is currently being rolled out to production regions, will update once this is completed.
@Tornhoof Apologies for the late reply. We are looking into this issue and we will update this thread once we have more details.
@avirishuv Could you please provide an update on this issue ? Awaiting your reply.
@avirishuv Do you have any updates?
The fix has been rolled out. Closing the issue now, please feel free to reopen if you have any additional questions on this topic.
what version was this resolved in?
Description
Recently one of my automated test scripts for a web application started failing constantly, after working fine for 2-3 years (I once migrated from the old AzureRM namespace). The script does the following steps:
Until a few weeks ago (maybe 2-3 weeks) this worked fine: the restart behaved synchronously, it restarted the VM more or less immediately and busily waited until the restart was done, so step 4 worked fine.
As of now it apparently reboots in the middle of my step 4 and the script aborts with connection loss errors. The Azure Portal Activity log for that machine lists a Restart VM entry at the correct time, but the actual Event log on the system, with the corresponding system events about stopped services, is about 5 minutes later.
I've now removed the Restart step from my script (it's not necessary anymore, maybe it was necessary back in the win2016 image times) and my script works again, so I currently conclude that some behaviour change happened to Restart-AzVM.
I currently have a few ideas what might be the culprit:
I kinda think it's something like 2, where e.g. Windows Update running on the system is now preventing the reboot until it's done, and previously it just worked because nothing was blocking.
Environment data
I also tried with the most recent 7.1, but no difference
Module versions
I tried with different older versions too, as some of the systems I tried it on didn't have the most recent Az packages installed.