fieldofgreen closed this issue 11 months ago
When you say it freezes at random intervals, is that while it's wiping the disc? How does it freeze: does the whole screen freeze, so there are no updates at all, including the ETA and throughput, or is it just the wipe that freezes while the ETA and throughput keep changing? Or does it freeze before displaying nwipe?
It sounds more like a hardware issue at the moment; nothing has changed in the wiping thread code between 0.34 and 0.35. I've not seen any hangs on my hardware so far, and there have been over 3K downloads, so I would have thought there would be other reports of hanging or freezing if it was the software, unless it's a subtle bug buried somewhere.
Hmm, it could be a hardware thing. I feel like it's one of the drives I've been wiping, but I'm not sure yet. It boots and runs fine, and then at some point it stops responding, with no updates to anything: I've had one stop at 3 minutes, one at half an hour and another at 15 minutes. I suppose I should let it run for the expected time to see if it's just a visual bug. I have one running overnight with a different boot drive, so I will let you know more in the morning.
Oh ok, I found the issue. Apparently one of the drives I was wiping, a 512 GB "SP solid State Drive A55 3D Nano", just fails and locks up the system. I'm unsure why exactly it locks up rather than just erroring out, but now I know.
That's good. Interesting that it hangs the system. All the calls within nwipe that do the reading, writing and syncing have their return status checked. Not only that, each drive being wiped runs in its own thread, so they all operate independently and one drive failing doesn't affect the wiping of the other drives.
It does sound like a kernel-level bug, maybe with a driver not handling errors or unresponsive drives properly.
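To illustrate the design described above (every write and sync checked, one thread per drive), here is a minimal Python sketch. It is not nwipe's actual code, which is written in C. The names `wipe_target` and `run_wipes` are hypothetical, and ordinary files stand in for drives.

```python
import os
import threading

# Use fdatasync where the platform provides it, otherwise fall back to fsync.
_sync = getattr(os, "fdatasync", os.fsync)


def wipe_target(path, size, results, block_size=4096):
    """Overwrite the first `size` bytes of `path` with zeros.

    The outcome is stored in results[path]: "ok" or an error description.
    """
    try:
        fd = os.open(path, os.O_WRONLY)
    except OSError as exc:
        results[path] = f"open failed: {exc.strerror}"
        return
    try:
        zeros = bytes(block_size)
        written = 0
        while written < size:
            chunk = zeros[: min(block_size, size - written)]
            n = os.write(fd, chunk)  # raises OSError on failure
            if n == 0:
                results[path] = "write made no progress"
                return
            written += n
        _sync(fd)  # a failing device surfaces here as OSError (e.g. EIO)
        results[path] = "ok"
    except OSError as exc:
        results[path] = f"I/O error: {exc.strerror}"
    finally:
        os.close(fd)


def run_wipes(targets):
    """Wipe every (path, size) pair in its own thread and collect the results."""
    results = {}
    threads = [
        threading.Thread(target=wipe_target, args=(path, size, results))
        for path, size in targets
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

A target that cannot be opened or that returns an error ends up with an error string, while the other threads carry on. None of this helps if a system call never returns at all, which is presumably what a hang inside the kernel or a driver would look like from user space.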
Ok so, I have a different external hard drive that apparently locked the system up around 20 minutes in. I tried starting it on v2021.08.2_23_x86-64_0.34 and it's gotten through 12 hours overnight, so I'm not sure what is causing the lockup. I'm going to try a couple of other drives of the same make/model as the SSD that failed before on the newest release, to see if those also get stuck. That should tell me whether it's related to the hardware reacting to the new version or to the drivers in the new version.
I also want to see if I can pull up the Alt+F2 terminal, as I don't recall whether I can while it's frozen.
I know the 2 drives with issues so far were involved in a crypto attack. Granted, I have no clue if that would be capable of breaking the drive itself.
I might also see about testing some drives that I have wiped previously using older versions, to see if they behave oddly.
A bit of trial and error in my future. I'd like to find out why my setup suddenly has these problems if no one else seems to.
I will update you with any relevant info that I find.
Ok so I have run a couple of tests. I cannot use the other terminals. I ran a similar SSD (it just had less storage capacity) and it passed without a problem on the newer version. I ran the drive that went through most of a DOD 3 pass run on the older version and it stopped at 50 minutes on the current version. Going to see if wiping it completely on an old version lets me wipe it on current, I guess. Not too sure what else to try, but it's weird that it has only happened recently and on 2 mostly unrelated drives.
I thought I would double check how ShredOS/nwipe behaves if a drive stops responding in the middle of a wipe. So I started a wipe, then pulled the SATA lead off the drive. From the moment I pulled the lead off, the throughput and time remaining continued to count, as I would expect, because they are averaged values, not instantaneous values. Then, about 5 seconds after the SATA connection was pulled, nwipe displayed an I/O Error message (red text on white background), as it would have detected the failure via the return value of the fdatasync. The interface remained responsive.
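As a rough illustration of why averaged figures keep moving after a drive stops responding, here is a small Python sketch. It assumes a simple cumulative average over the whole run; how nwipe actually computes its throughput and ETA is not stated in this thread, and `averaged_stats` and the numbers are hypothetical.

```python
def averaged_stats(bytes_done, total_bytes, elapsed_s):
    """Return (throughput in bytes/s, ETA in seconds) averaged over the whole run."""
    if elapsed_s <= 0 or bytes_done <= 0:
        return 0.0, float("inf")
    throughput = bytes_done / elapsed_s
    eta_s = (total_bytes - bytes_done) / throughput
    return throughput, eta_s


# Hypothetical run: 100 MB/s for 60 s, then the drive stops accepting data.
MB = 1_000_000
total = 10_000 * MB
done_at_failure = 6_000 * MB

for t in (60, 61, 65):
    throughput, eta = averaged_stats(done_at_failure, total, t)
    print(f"t={t}s  throughput={throughput / MB:.1f} MB/s  eta={eta:.1f}s")
```

With the byte count stuck, the elapsed time still grows, so the averaged throughput drifts down and the ETA drifts up instead of dropping to zero straight away.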
Can you try a few things?
Append nomodeset to the kernel command line in both grub files and see if it still hangs. If it still hangs, then it may not have anything to do with the DRM graphics software. nomodeset uses a simple framebuffer rather than DRM graphics.
Are you able to try wiping these drives on a different computer? This would rule out the computer hardware.
Can you run an overnight memtest to see if it detects any intermittent faults in the CPU memory?
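For the nomodeset step above, editing the files by hand is simplest, but the change can also be sketched as a small Python helper. This is only an illustration: `add_kernel_param` is a hypothetical name, and the sample config is made up, so the real grub files on your boot media may look different.

```python
def add_kernel_param(cfg_text, param="nomodeset"):
    """Append `param` to every `linux` line in a GRUB config, if not already present."""
    out = []
    for line in cfg_text.splitlines():
        words = line.split()
        if words and words[0] == "linux" and param not in words[1:]:
            line = line.rstrip() + " " + param
        out.append(line)
    trailing = "\n" if cfg_text.endswith("\n") else ""
    return "\n".join(out) + trailing


# Hypothetical grub.cfg contents, for illustration only.
sample = """set timeout=0
menuentry "example" {
    linux /boot/kernel console=tty3 loglevel=3
}
"""
print(add_kernel_param(sample))
```

The helper only touches lines whose first word is `linux` and leaves a line alone if the parameter is already there, so running it twice is harmless. It would need to be applied to both grub files.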
Sorry I was out of the office yesterday but I will start those today. nomodeset should be testable in the next few hours and I will set up another computer to start on the other drive to test that. I will start that memtest after the nomodeset runs.
One other thing: can you take a photo of the output of the lspci command so I can see what hardware you are running? Thanks.
I haven't been able to run a memory test as the device doesn't have the normal functions I'm expecting. I did run it on another device and it went through. I'm going to test the other drive on the new hardware and test the 1st drive on the original hardware to see if it will stop again. It might be a hardware issue that somehow didn't come up before.
Oh also, the nomodeset did nothing.
Well, I came back in this morning to check on my drives, and the drive that had just fully wiped fine on the other hardware failed on the previous hardware. So something about the Datto Alto 3 v2 has problems with wiping drives on the newest version of ShredOS. It did not appear to have any issues on the previous version, but now it does on certain drives. Back to the drive station drawing board for me.
It's still possible that particular Datto Alto 3 v2 has an intermittent issue and another Datto Alto 3 v2 works fine. They are pretty cheap on eBay so you could be throwing good money after bad but then it's only £40 so maybe worth taking the gamble.
Well, I do have at least another one lying around. I'll give it a go just to see.
Well, I have tested another identical device and it also failed. So evidently I cannot use the Datto Alto 3 v2 as my hardware, unless in the very rare case both of my boxes are broken in that way. I will start looking into setting up a new station with this in mind.
Thought I would throw this up here just in case it was something, but I have been trying to wipe a couple of drives and it has been freezing at random intervals and requires a full reboot. This is the only log file that is updating in the files: dmesg.txt. It usually boots with this the first time, although I think the archive_log part at the end was just required to make the folder. I was more looking at the first line. I have been trying to change the drive that is wiping to see if that is the issue, and so far there hasn't been a difference; it still locks up at some point. Going to try some different configurations and maybe a different boot drive to test it a few times.