TechnitiumSoftware / DnsServer

Technitium DNS Server
https://technitium.com/dns/
GNU General Public License v3.0

Zone lost after several unclean shutdowns #621

Closed micush closed 1 year ago

micush commented 1 year ago

Hi,

I'm running a test instance in an Ubuntu 22.04 KVM VM with the test zone "lan". Most of the time, if I hard power off the VM, the zone survives and is loaded on the next boot. However, sometimes after a hard power off or reset the zone disappears and I have to recreate it or restore it from backup. Can you investigate, please? It's a bit cumbersome to have to recreate the zone after a hard power off.

Thanks and Regards,

micush

ShreyasZare commented 1 year ago

Thanks for the post. This is not an issue with the DNS server, and it cannot be fixed with any kind of software either.

The zones are missing because your VM's file system is getting corrupted. Someday a critical system file will get corrupted and your VM will stop booting.

So, never do hard power resets, and get a UPS for your hardware too, to prevent unclean shutdowns caused by power fluctuations.

micush commented 1 year ago

This is not even remotely the case. I run GNS3 on my laptop (with a good battery) and this VM within GNS3. I have a few primary zones defined in TDNS. Only the specific zone "lan" is disappearing. None of the other zones disappear after a few power resets, just this one specific zone. Either way I'll figure it out. Thanks for the sage computing advice. Much appreciated.

Hemsby commented 1 year ago

Do you have any logs? Before and after the force reset?

ShreyasZare commented 1 year ago

The zones are files on the disk. If the specific "lan" zone is missing, it means that the "lan.zone" file is missing from the file system. So, it is a file system corruption issue caused by the OS being unable to flush the file system cache to disk before the hard reset.

echodreamz commented 1 year ago

So... you are hard powering off a VM and wondering why bad things are happening?

micush commented 1 year ago

It's ext4. It's journalled. There should never be data loss with it, even in the event of a power failure. And indeed, I have zero data loss with it. Except for one specific TDNS zone out of many.

echodreamz commented 1 year ago

You should also not be hard powering off the instances; that's just bad practice. Perhaps shutting the instance down cleanly would be the correct behavior and the fix? I've had data loss on ext4 volumes before. Just because it's journaled doesn't mean it's fully tolerant.

ShreyasZare commented 1 year ago

> It's ext4. It's journalled. There should never be data loss with it, even in the event of a power failure. And indeed, I have zero data loss with it. Except for one specific TDNS zone out of many.

You can test this easily. The next time the issue comes up, check the config/zones folder to see whether the zone file for the lost zone still exists. The DNS server does not delete a zone file unless you delete the zone manually. So, if the zone file is missing, it is a file system issue.

If the file exists but the zone does not show up in the DNS server, then the contents of the file must be corrupt, and you should have an error log entry for it in the DNS logs. As far as I know, journaling only covers metadata, and file contents are prone to corruption on power loss.
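The check described above can be scripted. The following is only a minimal sketch, not part of the DNS server: the function name `check_zone_file` is hypothetical, and the location of the config/zones folder depends on the installation, so it is passed in as a parameter. It assumes the file is named after the zone with a `.zone` suffix, as with "lan.zone" mentioned earlier in this thread. Treating a zero-length file as its own case is also an assumption; it is one possible symptom of a write that never reached the disk.

```python
from pathlib import Path


def check_zone_file(zones_dir, zone_name):
    """Classify the on-disk state of a zone file.

    Returns "missing" if <zones_dir>/<zone_name>.zone does not exist,
    "empty" if it exists but has zero length, and "present" otherwise.
    The "<zone_name>.zone" naming is an assumption taken from this thread.
    """
    zone_file = Path(zones_dir) / f"{zone_name}.zone"
    if not zone_file.is_file():
        # No such file: points at the file system, not the DNS server.
        return "missing"
    if zone_file.stat().st_size == 0:
        # File exists but holds no data: possibly an interrupted write.
        return "empty"
    # File exists and has content; if the zone still fails to load,
    # look for an error entry in the DNS logs.
    return "present"
```

For example, `check_zone_file("/path/to/config/zones", "lan")` (with the path replaced by the real config/zones folder) can be run right after a hard reset, before recreating the zone.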

ShreyasZare commented 1 year ago

Closing this issue since it cannot be fixed in code.