Azure / AzureStackHCI-EvalGuide

Welcome to the Azure Stack HCI Evaluation Guide!
Creative Commons Attribution 4.0 International

DNS Server not resolving, indefinitely waiting for AD DS to signal that initial sync has been completed. #21

Closed laurenbo closed 4 years ago

laurenbo commented 4 years ago

Hi - I just ran into this: MGMT01 cannot join the AZSHCI (azshci.local) domain and fails with "An AD DC for azshci.local could not be contacted". I suspected a DNS server issue on DC01 and found many 4013 warnings in the DNS Server event log, which I viewed with WAC running on MGMT01.

Description of warning 4013: "The DNS server is waiting for Active Directory Domain Services (AD DS) to signal that the initial synchronization of the directory has been completed. The DNS server service cannot start until the initial synchronization is complete because critical DNS data might not yet be replicated onto this domain controller. If events in the AD DS event log indicate that there is a problem with DNS name resolution, consider adding the IP address of another DNS server for this domain to the DNS server list in the Internet Protocol properties of this computer. This event will be logged every two minutes until AD DS has signaled that the initial synchronization has successfully completed."

A quick improvement could be to add a "how to trigger this initial synchronization" line. I got WS 2019 version 1809 from the URL and applied all Windows updates as recommended. The OS build is now 10.0.17763.737. Maybe this issue is related to an update released after you wrote the tutorial.

mattmcspirit commented 4 years ago

Hi - thanks for this. I haven't seen this in my environment, but will have to check. After rebooting the DC, and leaving it for a while, does the issue still occur? Can you join the domain?

Is there anything different about the way you set up the environment, from the steps I provided?

Thanks!

laurenbo commented 4 years ago

Hi Matt - no, I still cannot join the domain, and yes, I rebooted the DC. I even rebooted the host, but DNS is still waiting and generating warning 4013. I only used the provided PowerShell scripts, and I have WS 2019 running in "core" mode, with no Desktop. I was thinking of installing RSAT for WS 2019 on MGMT01 to get the AD and DNS consoles and try to "trigger" that first synchronization, as I don't know how to trigger it from PowerShell. And because DNS is not working on DC01, I cannot get updates for Edge on MGMT01, though I can ping 1.1.1.1 or 1.0.0.1 successfully.
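The symptom described here (raw IP addresses answer pings while names do not resolve) can be checked with a short script. The following is only a minimal Python sketch, assuming Python is available on the machine being tested. `resolve_name` is a hypothetical helper, not part of the guide's scripts, and it performs an ordinary address lookup through the system resolver rather than the SRV lookup a domain join relies on.

```python
import socket
from typing import List


def resolve_name(name: str) -> List[str]:
    """Return the addresses the system resolver gives for `name`.

    Hypothetical helper for illustration. An empty list means the
    lookup failed, which is what a non-working DNS server would cause.
    """
    try:
        # port=None: we only care about name-to-address resolution.
        infos = socket.getaddrinfo(name, None)
    except socket.gaierror:
        return []
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address. Deduplicate and sort for stable output.
    return sorted({info[4][0] for info in infos})
```

Calling `resolve_name("azshci.local")` on MGMT01 and getting an empty list, while a ping to 1.1.1.1 succeeds, would point at name resolution rather than general connectivity.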

mattmcspirit commented 4 years ago

Hi - that's the same environment I'm testing with. My event log for the DC shows a single event entry (after reboot) for ID 4013, but after that, it doesn't show again. My build on DC01 is version 10.0.17763.1339, which is the latest according to the list of WS updates.

So, I'm a little unsure, I'm afraid - I'm going to try to spin up a new environment using PowerShell and see if anything has changed, but it may be worthwhile deleting the DC and creating a fresh one, rather than troubleshooting more deeply.

I'll redeploy and let you know.

laurenbo commented 4 years ago

OK - weird that after the WS updates I'm on 17763.737 and you got 17763.1339, which is what I have on the host. So, on my side, I'm exporting DC01 as is, to be able to restore it later, and recreating one from scratch with the same procedure. Thanks, Matt.

mattmcspirit commented 4 years ago

Let me know - I'll let you know how I get on shortly. I ran the update script twice to be sure, and it's fully up to date. If you exit to cmd and run ver.exe, you'll see the build.
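For comparing the build reported on one machine with the build expected on another, a minimal Python sketch follows. It assumes the output contains a four-part version such as `Microsoft Windows [Version 10.0.17763.737]`; `parse_build` and `is_behind` are hypothetical helper names used only for illustration.

```python
import re
from typing import Tuple

# Assumption: the text contains a four-part version like 10.0.17763.737,
# either bare or inside "Microsoft Windows [Version ...]".
VERSION_RE = re.compile(r"(\d+)\.(\d+)\.(\d+)\.(\d+)")


def parse_build(text: str) -> Tuple[int, ...]:
    """Extract the first four-part version in `text` as a tuple of ints."""
    match = VERSION_RE.search(text)
    if match is None:
        raise ValueError(f"no four-part version found in: {text!r}")
    return tuple(int(part) for part in match.groups())


def is_behind(current: str, expected: str) -> bool:
    """Return True if `current` is an older build than `expected`."""
    return parse_build(current) < parse_build(expected)


dc01 = "Microsoft Windows [Version 10.0.17763.737]"
print(parse_build(dc01))                   # (10, 0, 17763, 737)
print(is_behind(dc01, "10.0.17763.1339"))  # True
```

The parts are compared as integers rather than strings, because a plain string comparison would rank "737" above "1339".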

mattmcspirit commented 4 years ago

So my environment is fine - I just ran through the setup without issue. I have updated the new AD user PS script in the guide, but that wouldn't affect things. Win10 took a while to update, then I joined the domain without issue.

Let me know how things go on your fresh DC!

laurenbo commented 4 years ago

Matt - good that you could reach the expected result :-) but I'm having trouble with the WS updates step. After applying them, waiting for the reboot and logging in again, I get "We couldn't complete the updates. Undoing changes. Don't turn off your computer". I waited until the next reboot and checked ver.exe, which says I'm still on 17763.737. I re-ran the WS updates step, with the same results. That explains why I was also still on 17763.737 in my first attempt, though I had not noticed the error at reboot time. So I'm now using sconfig.vbs to force WS updates from inside the VM.

mattmcspirit commented 4 years ago

That is very strange indeed. Going via SCONFIG should be fine, though - however you wish to update is fine by me.

I guess there's a chance the ISO download was slightly corrupt, but it's a stretch to say the least. However, if you can't do something as basic as update the OS correctly, either there's an issue with the OS itself (in which case it may be worth re-downloading the ISO) or there's some kind of network issue - I'm speculating, as it just seems so strange for what is essentially a vanilla Windows Server 2019 VM.

One extra question, is this on your own host, or in Azure? If it's your own, can you tell me about the physical system specs, OS etc?

Thanks!

laurenbo commented 4 years ago

Matt - I can confirm that downloading and installing the WS updates with SCONFIG went fine, and that DC01 is now on 17763.1339. All subsequent steps went fine: I could get the Edge update from MGMT01, join MGMT01 to the azshci.local domain, log on as labadmin, set up WAC and connect to WAC as labadmin without re-entering credentials. All seems fine for me now. I'll process the nodes in the next step.

My machine is a physical server, an HP DL560 G8 with 256 GB RAM and 4 sockets, each equipped with an Intel Xeon E5-4657L v2, totalling 96 cores. Local SAS 6G disks. Network is a 4x 1Gbps LAN-on-Motherboard adapter from HP. OS is WS 2019 DC (en-US) from my MSDN subscription, with Desktop Experience, that is, Version 1809 with OS Build 17763.1339. Culture is fr-FR.

Again, thanks for your time confirming that all is OK on your side.

mattmcspirit commented 4 years ago

OK, strange that it didn't work with the first DC, and I'm confused that an update would magically fix something as fundamental as that, but at least it seems to be OK now - I'll close this out and hopefully all will be fine from here.