SiaFoundation / siad

The Sia daemon
https://sia.tech
MIT License

Constant "failures" since adding a fifth hard drive #114

Closed cristian-dan-f closed 2 years ago

cristian-dan-f commented 2 years ago

Hi, I barely use GitHub and am not used to reporting issues.

I installed Sia in a Windows Server 2016 virtual machine.

It was working fine for the last 2 months, but I added a fifth hard drive and since then siad has been failing constantly.

The strange part is that the failing drive is not the new one and, most importantly, there are no read/write errors in the Windows event log.

I have 50,000 events saying: Warning: {Delayed Write Error} Windows was unable to save all the data for the file G:\SiaDATA4\siahostdata.dat; data was lost. This could be because the device was removed or the media is write-protected.

Information: Popup Application: Windows - Delayed Write Error: Exception Processing Message 0xc000a082 - Unexpected parameters

And in siad-stdout.log this keeps repeating, thousands of times:

2022-05-09 20:15:29 error: goroutine 4226551 [running]:
runtime/debug.Stack(0xd4, 0xc001b78380, 0xd4)
    runtime/debug/stack.go:24 +0xa5
runtime/debug.PrintStack()
    runtime/debug/stack.go:16 +0x29
gitlab.com/NebulousLabs/log.(*Logger).Severe(0xc0001a0f60, 0xc0073d1f88, 0x2, 0x2)
    gitlab.com/NebulousLabs/log@v0.0.0-20200604091839-0ba4a941cdc2/log.go:157 +0x314
go.sia.tech/siad/modules/host/contractmanager.(*writeAheadLog).syncResources.func3(0xc01d9a21f0, 0xc0000d9be8, 0xc0001f8300)
    go.sia.tech/siad/modules/host/contractmanager/writeaheadlogsync.go:78 +0xec
created by go.sia.tech/siad/modules/host/contractmanager.(*writeAheadLog).syncResources
    go.sia.tech/siad/modules/host/contractmanager/writeaheadlogsync.go:74 +0x1c7
Severe error: (siad v1.5.7, Release: release) ERROR: unable to sync a storage folder: sync G:\SiaDATA4\siahostdata.dat: No se puede completar la operación solicitada por una limitación del sistema de archivos. [English: The requested operation could not be completed due to a file system limitation.]

2022-05-09 20:15:29 error: goroutine 4226703 [running]:
runtime/debug.Stack(0xd4, 0xc0001249a0, 0xd4)
    runtime/debug/stack.go:24 +0xa5
runtime/debug.PrintStack()
    runtime/debug/stack.go:16 +0x29
gitlab.com/NebulousLabs/log.(*Logger).Severe(0xc0001a0f60, 0xc00eb79f88, 0x2, 0x2)
    gitlab.com/NebulousLabs/log@v0.0.0-20200604091839-0ba4a941cdc2/log.go:157 +0x314
go.sia.tech/siad/modules/host/contractmanager.(*writeAheadLog).syncResources.func3(0xc009cf3fd0, 0xc0000d9be8, 0xc0001f8300)
    go.sia.tech/siad/modules/host/contractmanager/writeaheadlogsync.go:78 +0xec
created by go.sia.tech/siad/modules/host/contractmanager.(*writeAheadLog).syncResources
    go.sia.tech/siad/modules/host/contractmanager/writeaheadlogsync.go:74 +0x1c7
Severe error: (siad v1.5.7, Release: release) ERROR: unable to sync a storage folder: sync G:\SiaDATA4\siahostdata.dat: No se puede completar la operación solicitada por una limitación del sistema de archivos. [English: The requested operation could not be completed due to a file system limitation.]

------------ END ------------

There doesn't seem to be any different message; I keep scrolling up and down the logs and can't see anything else. Even while writing this issue, several dozen more of these messages have been added.
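Scrolling through thousands of near-identical entries by hand makes it easy to miss a different message. A minimal sketch of how the log could be summarized instead, assuming every entry follows the "Severe error: (siad ...) ERROR: ..." pattern seen in the excerpt above; the function names are hypothetical and not part of siad:

```python
import re
from collections import Counter

# Assumed pattern, taken from the excerpt above:
# "Severe error: (siad v1.5.7, Release: release) ERROR: <message>"
SEVERE_RE = re.compile(r"Severe error: \(siad [^)]*\) ERROR: (.*)")


def summarize_severe_errors(lines):
    """Count how often each distinct severe-error message appears."""
    counts = Counter()
    for line in lines:
        match = SEVERE_RE.search(line)
        if match:
            counts[match.group(1).strip()] += 1
    return counts


def summarize_log_file(path):
    """Read a log file (e.g. siad-stdout.log) and return (message, count) pairs."""
    with open(path, encoding="utf-8", errors="replace") as log:
        return summarize_severe_errors(log).most_common()
```

Calling `summarize_log_file("siad-stdout.log")` would return one entry per distinct message, most frequent first, so any error other than the sync failure stands out immediately.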

I have tested the hard drive and it has absolutely no issues. All hard drives have at least 200GB of free space. All of them are entirely dedicated to the virtual machine, and only Sia is installed in it. The virtual machine has 4 dedicated CPU cores, usually running at 60-70% CPU, and the entire system is usually at around 30-40%. I don't think it is overloaded.

What I can see is that at some point hard disk C: (the main Windows SSD) and hard disk G: are at 100% usage, and even after the siad service closes, these disks keep working at 100% indefinitely. RAM usage also stays at 99% even after the siad service closes. I have tested with 10GB up to 40GB of RAM, and siad seems to take all the RAM available; at some point it closes, but the RAM is not freed.

Usually, if siad ever failed, it was possible to simply close it and open it again. Now it is necessary to completely reboot the virtual machine.

My biggest problem is that siad always closes at night, which is a strange coincidence: I sit in front of the computer 14 hours every day, and siad always seems to break 20-40 minutes after I leave.

What should I do next?

n8maninger commented 2 years ago

Closing as this is an operating system issue, not a Sia issue. Please use Discord for support.

"No se puede completar la operación solicitada por una limitación del sistema de archivos." (English: "The requested operation could not be completed due to a file system limitation.")

This error is from a limitation with Windows NTFS formatted disks; you can try defragging the drive.

https://social.technet.microsoft.com/Forums/lync/en-US/e0d55018-cf06-43ca-8e20-dd2bb8494c96/8220the-requested-operation-could-not-be-completed-due-to-a-file-system-limitation8221?forum=winserverfiles

cristian-dan-f commented 2 years ago

Thank you for the link. It might look like a system problem, but:

  1. Defrag reports the drive as 100% optimized.
  2. So the cause is more likely the "Compress" feature (which the Sia manual says to activate). When I try to disable it, an error says it can't be done because of a system limitation, so I have to empty the drive.
  3. The affected hard drive is 3.51TB in size, and somehow siad shows 3.86TB. I am not sure how this happened; maybe it was my fault, or maybe the slider is buggy when moved to 100%. Either way, I have to reduce this size.
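One possible explanation for the size mismatch in point 3, not confirmed anywhere in this thread, is a difference in units: Windows reports sizes in binary units (1 "TB" = 2^40 bytes), and if Sia reports decimal terabytes (10^12 bytes, an assumption here), the same capacity shows up as two different numbers. A quick check of the arithmetic:

```python
TB = 10**12   # decimal terabyte (assumed to be what Sia displays)
TIB = 2**40   # binary tebibyte (what Windows labels as "TB")


def tb_to_tib(size_tb):
    """Convert a size in decimal terabytes to binary tebibytes."""
    return size_tb * TB / TIB


print(f"{tb_to_tib(3.86):.2f}")  # prints 3.51
```

If that assumption holds, 3.86TB in Sia and 3.51TB in Windows describe the same number of bytes, which would mean the folder fills the entire drive and leaves no headroom.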

I have been trying for the last 2 days to solve these two problems by reducing the size of the folder. The only way seems to be removing it completely: when I try to resize, it fails saying there is not enough space, so that option looks buggy too. It would be great to have a button that tells siad not to store any more data on a drive. Just like the Remove button, there could be a Stop button right next to it. It would be very useful for me right now.

Disabling the Compress drive feature seems to require the drive to be completely empty, so for now I am trying to move all the data off the drives with the wrong size, which seems to be an impossible task.

Hard drive C: has no errors reported, but after 40-60 minutes of use it goes to 100% usage and the system requests more than 50GB of RAM, which completely blocks the virtual machine and crashes it after a few minutes. This symptom is new; it started after I wrote my first message 2 days ago. The machine has to be disconnected and rebooted every hour. I am going crazy, since I don't want to fail any contract. For some reason siad is requesting huge amounts of RAM, and I feel a bit limited in trying to solve this.

I also find it important that, while Sia is removing the data from drive G: and 1TB of data has been moved, Windows shows no change in hard drive usage. Maybe that is normal, or maybe it is because the machine crashed; I have no idea. As far as I understand, there is no data corruption yet.

I have another machine with 64GB of free RAM, and if the system keeps crashing I will move Sia to that machine, but I have the feeling it will just request more and more RAM. Maybe this is something to look into: RAM demand spiking to more than 50GB while a folder is being removed in Sia.

I will update this when I manage to remove the compress feature and fix the size of the folder.

If there is a better way to remove a folder without using the Sia UI client, please tell me.

cristian-dan-f commented 2 years ago

After 1 week of trying everything to get the host to run properly, it seems to be impossible.

Something went sideways after adding the fifth hard drive, an 18TB one.

siad simply crashes after 40-60 minutes of running.

I stopped accepting new contracts, and at least it seems to be working fine now.

I will wait until all current contracts are fulfilled and then start again from scratch with no compression, which seems to be the cause of all my troubles.

Unfortunately, this issue has been closed without a fix.

cristian-dan-f commented 2 years ago

I was hoping this would be resolved with the latest update, v1.5.8.

siad can run for days or weeks with no problem, but as soon as I hit the "remove" drive button, RAM usage grows without bound and the system crashes after approximately 40 minutes. I tried giving the VM 64GB of RAM; that only buys a few extra minutes, because eventually the demand for RAM far exceeds what is available and the system crashes.

There are still 178 contracts left to finish before I can make a fresh start.

It might be an "operating system issue", but I am very sure that something can be improved. Maybe you can't fix the write-to-disk error, but maybe the RAM usage can be fixed.

I am also very sure that the hard drive option "Compress this drive to save disk space" is what caused my problems. This Windows feature seems to be buggy, and there are many messages on the internet from people having trouble with it. My humble opinion is that the Sia install manual should not advise enabling that option. Everything was fine until I wanted to remove a folder; then something glitched, and the hard drives stopped accepting data past a certain point. I try to remove the glitchy folder, but it is impossible because the system crashes.
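Whether compression is really the cause is not confirmed in this thread, but it is easy to check whether a given storage file actually carries the NTFS compression attribute. A minimal sketch using only the Python standard library; the helper names are hypothetical, and the example path is the one from the logs above:

```python
import os
import stat


def has_compressed_flag(attributes):
    """Return True if a Windows file-attribute bitmask has the compression bit set."""
    return bool(attributes & stat.FILE_ATTRIBUTE_COMPRESSED)


def is_ntfs_compressed(path):
    """Return True or False on Windows; None where st_file_attributes is unavailable."""
    attributes = getattr(os.stat(path), "st_file_attributes", None)
    if attributes is None:
        return None  # not running on Windows, so the attribute cannot be read
    return has_compressed_flag(attributes)
```

On the affected host, `is_ntfs_compressed(r"G:\SiaDATA4\siahostdata.dat")` would show whether that particular file is stored compressed. It does not by itself prove that compression triggers the file system limitation.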

By the way, in the "Host" tab -> Storage Folders, the folders are constantly bouncing up and down every few seconds.

Keep up the good work!