martijnvanbrummelen / nwipe

nwipe secure disk eraser
GNU General Public License v2.0

Status display showing wrong percentages when scrolling #544

Open mrcmnk opened 5 months ago

mrcmnk commented 5 months ago

Hi,

I just noticed that there seems to be a bug when wiping a lot of drives (24 in my case) at the same time and when scrolling in the GUI is enabled. I am using ShredOS v2023.08.2_25.0_x86-64_0.35.

Some of the drives show exactly a 10% higher percentage than others. When scrolling to the top or bottom, this offset wanders/moves to another drive, which now shows a 10% higher percentage of being finished. Please see attached video grabs (device sde for instance) - which were taken seconds apart.

Could someone confirm if they also found this to be a problem?

Thank you for providing us with a great piece of software for drive erase! I am new to nwipe, but it works very well for us.

Marco

[Screenshots: gui3, gui2, gui1]

PartialVolume commented 5 months ago

@mrcmnk That is an odd problem. How many drives are you wiping simultaneously? I'll load up a system and see if I can reproduce it.

@ggruber do you see that issue when wiping many drives?

mrcmnk commented 5 months ago

@PartialVolume 24 drives. All drives are now at ~40% and the problem has gone away. When scrolling up and down the values do not change anymore. So my guess is that it only occurs in a specific range.

I will start another batch of 24 in a few hours and I will see if I can reproduce it and under which circumstances.

PartialVolume commented 5 months ago

@mrcmnk If you are still wiping those same drives, can you try something for me? Do the same scrolling up and down as before, but only when most of the drives are reading 20-29%, 40-49%, 60-69% or 80-89%. So if, say, you scroll when most of the drives are at 60-69%, do you then get some drives reading 70-79%?

What I'm thinking is that a single least significant bit stuck high in a memory location would produce this effect. Of course, if that were the case, it could be intermittent.

If it was a stuck bit, you wouldn't see the problem when the drives read 10-19%, 30-39%, 50-59%, 70-79% or 90-99%, as the bit would be expected to be high for those values.
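
As a rough illustration of the stuck-bit idea, the Python sketch below forces bit 0 of the tens character in a percentage string to 1. It assumes, purely for illustration, that the tens digit is held as an ASCII character; `force_lsb_high` is a hypothetical helper and not part of nwipe. Even tens digits (2, 4, 6, 8) move up by one, which reads as +10%, while odd digits are unchanged.

```python
def force_lsb_high(text, index):
    """Return text with bit 0 of the character at `index` forced to 1.

    Hypothetical helper: it models a single stuck-high bit in one byte.
    """
    chars = list(text)
    chars[index] = chr(ord(chars[index]) | 0x01)
    return "".join(chars)


# Index 0 is the tens digit of each (made-up) percentage string.
for shown in ("24.57%", "34.57%", "60.00%", "79.99%"):
    print(shown, "->", force_lsb_high(shown, 0))
```

In this model only readings with an even tens digit change, which is why the test above targets the 20-29%, 40-49%, 60-69% and 80-89% ranges.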

ggruber commented 5 months ago

Which exact version of nwipe shows the problem? The screenshot shows 0.35, but I have 0.35.6 on test now. I could try to verify with 24 disks, or even on a 51-disk system later. And are we talking about display in a console window or in an ssh window? I guess the problem is seen on a console? A physical one, or one from iDRAC/iLO/...?

mrcmnk commented 5 months ago

@PartialVolume I did that, but the problem did not occur when all drives were in the 40%, 80% or 90% range.

@ggruber This is the version included with ShredOS, which I believe is 0.35, yes. This is with a physical console on a TFT display, with no telnet/ssh or remote management solutions in between. A bit of other work got in the way, but I will get back with feedback on the 2nd run tomorrow.

The pictures in my initial post above were videograbs from a video I made. If it helps, I could upload that to YouTube as well.

ggruber commented 5 months ago

@mrcmnk: what is your display's physical resolution? 1280x1024? Or is it a newer panel with a higher / 16:9 resolution?

mrcmnk commented 5 months ago

@ggruber The display has a resolution of 1920x1200px.

I was not able to reproduce the behaviour in the 10% range as I expected with any of the following runs.

However, I noticed that the output no longer seems to be updated once one of the drives has finished the wipe.

Also, the scrolling problem which I described above occurred all of the time once the wipe threads were coming to an end. One of the lines even showed 77.97% here (while also showing "complete"). However, the log output after nwipe closes shows that all drives were correctly wiped, and the PDF files are also complete. Still, this may leave the user with a little doubt.

Please see: https://www.youtube.com/watch?v=NuWOtrhvoyQ

PartialVolume commented 5 months ago

@mrcmnk Can you run an overnight memtest86 on that system? Thanks.

PartialVolume commented 5 months ago

@mrcmnk I can see the problem. It looks like only the percentage doesn't get updated on screen once a wipe has completed; the rest of the line is updated. I'll take a look at that section of code.

I would still run the memtest86 as this looks like possibly two different issues.

PartialVolume commented 5 months ago

@mrcmnk

image

I'm assuming that you are not doing 51 passes and that the 5 here is incorrect. In the code there is always a space after "of", so a random 5 appearing there doesn't make any sense when you look at nwipe's code. It should be impossible for that to occur unless there is also a bug in ncurses or the terminal.

PartialVolume commented 5 months ago

@mrcmnk That random 5 or 6 or 4 is actually the last digit of the throughput in MB/s that is displayed in the same column for the drive above or below, depending on which way you are scrolling. nwipe writes an entire line in one go, not individual characters; it is down to ncurses to handle the screen display. This doesn't feel like a bug in nwipe but a bug elsewhere. I need to dig a bit deeper into this.
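
The symptom can be modelled with a small Python sketch. It makes no claim about where the fault lies; it only shows how a single cell that is not refreshed during a scroll would leave a digit from the neighbouring row in an otherwise correct line. The row contents and the `redraw_row` helper are invented for illustration and do not match nwipe's real status line layout.

```python
def redraw_row(old_row, new_row, missed_cols):
    """Overwrite old_row with new_row, leaving the cells in missed_cols stale."""
    return "".join(
        old if col in missed_cols else new
        for col, (old, new) in enumerate(zip(old_row, new_row))
    )


# Invented row for a finished drive; `col` is the space right after "of".
done_row = "sde pass 1 of 1 complete"
col = done_row.index("of ") + 2

# Invented row for a drive still wiping, built so that the last digit of
# its throughput sits in that same column, then padded to equal width.
speed_row = ("sdd" + "135".rjust(col + 1 - len("sdd")) + " MB/s").ljust(len(done_row))

# After scrolling by one line, the screen row that showed speed_row must
# now show done_row. If the cell at `col` is not refreshed, the old digit
# stays behind.
print(redraw_row(speed_row, done_row, {col}))  # sde pass 1 of51 complete
```

With an empty `missed_cols` set the row comes out exactly as `done_row`, which is the behaviour expected when a whole line is written in one go.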

I would definitely run an overnight memtest86 on this system. If for no other reason than to rule out some flaky memory addressing issue in hardware.

mrcmnk commented 5 months ago

Thank you for checking - will test the memory and will update you once done.

PartialVolume commented 5 months ago

If memtest86 doesn't detect any errors, could you post the output of lspci so I can see what graphics hardware you have?

Are you booting ShredOS vanilla, i.e. with no options like nomodeset on the kernel command line?

I've done a preliminary check with 20 drives and I'm not seeing any screen corruption, incorrect percentages or lines that don't update as I rapidly scroll up and down as the drives complete. My preliminary check was with the latest version with nomodeset on the kernel command line. But I want to try to duplicate exactly your setup as far as possible for the next tests I do.

ggruber commented 5 months ago

@PartialVolume I uploaded a short video to my Nextcloud (the link you have had for a while). You can see one line flickering while others update more smoothly. This is being displayed in a PuTTY/ssh window over a LAN connection and is only meant to demonstrate that there might be issues in the update mechanism. It seems that not the complete screen is redrawn, but individual lines. And it might be that these updates are not distributed evenly among all the lines that change their content within a timespan of, say, 1/10th of a second. I mean that one line gets updated much more often than the others (my impression from the flickering seen in the video). If this is possible in one direction, why not in the other, too? The result would then be lines that are not updated, as described before.

PartialVolume commented 5 months ago

@ggruber I've got the link to the nwipe server .....edv2g.de via ssh. I don't remember a Nextcloud link?

ggruber commented 5 months ago

Well, there are a couple of videos on the Nextcloud where I could show you some screen update problems on the pre-0.35 release, which led to the threaded screen updates or temperature reading, or what was it? I'll send you the link by PM again.

PartialVolume commented 5 months ago

> This is being displayed in a PuTTY/ssh window over a LAN connection and is only meant to demonstrate that there might be issues in the update mechanism. It seems that not the complete screen is redrawn, but individual lines.

That sounds to me more like buggy terminal software. Do other Linux-based terminals like tmux, terminology, terminator, kitty, guake, stterm, st, rxvt-unicode, or the Konsole-based terminals built on qtermwidget, such as Cool Retro Term, show the same problem? See #305 for a bug that was found in the Konsole-based terminal Cool Retro Term back in 2021.

> I'll send you the link by PM again.

Did you send the link? @partialvolume:matrix.org

ggruber commented 5 months ago

just tried to use the link and start a conversation

ggruber commented 5 months ago

Besides PuTTY I could try MobaXterm. Or a "kind of physical display" for a native console on a Raritan IP KVM. Or with thelia a native console. The latter two only at 1280x1024 resolution. Or I could try Linux apps from a WSL Ubuntu.

mrcmnk commented 4 months ago

> If memtest86 doesn't detect any errors, could you post the output of lspci so I can see what graphics hardware you have?
>
> Are you booting ShredOS vanilla, i.e. with no options like nomodeset on the kernel command line?
>
> I've done a preliminary check with 20 drives and I'm not seeing any screen corruption, incorrect percentages or lines that don't update as I rapidly scroll up and down as the drives complete. My preliminary check was with the latest version with nomodeset on the kernel command line. But I want to try to duplicate exactly your setup as far as possible for the next tests I do.

Hi,

Sorry for the delay. I have now run memtest86 and the system passed all tests!

Yes, I can confirm that I use an unmodified version of ShredOS, just as downloaded from the website. Zero changes.

Please see the output of lspci. I am using the IGP of the Intel E-2174G CPU.

lspci

PartialVolume commented 4 months ago

@mrcmnk I suspect it may be the Intel DRM driver. Intel have been doing a load of work on the driver and have only recently finished. Unfortunately, this kernel contains a buggy driver.

There is a way of proving this. Depending on how you boot ShredOS, either use the nomodeset .iso file from the latest release or, if you use dd or Rufus to make a USB stick, append nomodeset to the kernel command line in both /EFI/BOOT/grub.cfg and /boot/grub/grub.cfg.
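
For the USB stick route, the edit amounts to adding one word to each `linux` line of the two grub.cfg files. A minimal Python sketch of that text change follows; the sample menu entry, kernel path and `add_nomodeset` helper are made up for illustration, and the real ShredOS grub.cfg will look different. Editing the files by hand in a text editor works just as well.

```python
def add_nomodeset(grub_cfg_text):
    """Append 'nomodeset' to every 'linux' line that does not already have it."""
    out = []
    for line in grub_cfg_text.splitlines():
        words = line.split()
        if words and words[0] == "linux" and "nomodeset" not in words:
            line = line.rstrip() + " nomodeset"
        out.append(line)
    return "\n".join(out) + "\n"


# Hypothetical grub.cfg content, not copied from a ShredOS image.
sample = """menuentry "example" {
    linux /boot/kernel-image loglevel=3
}
"""

print(add_nomodeset(sample))
```

Running the function on text that already contains nomodeset leaves it unchanged, so it is safe to apply to both files even if one has been edited before.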

The next release of ShredOS will contain the updated Intel DRM drivers, so I'm hoping we'll have far fewer graphics issues.

mrcmnk commented 4 months ago

Okay, thank you. I will try the new version when it comes out and update then.