martijnvanbrummelen / nwipe

nwipe secure disk eraser
GNU General Public License v2.0

Documentation: Update Readme.md with information regarding ssd media #590

Closed fthobe closed 1 week ago

fthobe commented 3 weeks ago

As discussed in #587, the documentation lacks clarity regarding the secure erasure of data from SSD media.

fthobe commented 3 weeks ago

@PartialVolume Please take a look at:

  1. Updated readme.md
  2. Added guide for SSD
PartialVolume commented 2 weeks ago

I've read through the documents you linked to and took a look at the commit.

Looks good, though I have a couple of points. Somebody reading it might think just using the manufacturer's tool is adequate, and maybe it is, depending on what risk is acceptable. However, we know that's not the case from past experience with buggy drive firmware on SATA drives, which is why we suggest a multifaceted approach of secure erase, PRNG wipe, then secure erase.

Also, is there any verification process with these methods, or do we just trust that the sanitise command did what it was supposed to do? At least with a PRNG stream and verification we know that, at the very minimum, x GBytes of data has well and truly been wiped, as the drive has to reproduce the stream precisely for verification.

This is just my opinion and maybe I'm going to start sounding paranoid but....

Regarding drives with internal AES 256 symmetric encryption: when the firmware is used to wipe the drive, rather than actually erasing the encrypted data, the internal key is deleted and the drive firmware generates a new key, making the old data unreadable (unless you know the key).

While AES 256 is believed to be unbreakable, leaving the encrypted data on the drive seems to me to be asking for other attack vectors to be attempted, focused on obtaining or knowing the key before it's deleted. For instance, altering drive firmware in advance of deployment so the drive always uses a known key: once the drives are decommissioned and discarded with their fully intact encrypted data, the person/organisation/country with malicious intent simply needs to obtain those drives, as they already know the keys to use. It just seems odd to me why you wouldn't completely wipe the drive, leaving nothing, not even encrypted data. Who would gain from the idea that it's ok to leave data on the drive even when it's encrypted 😉

That said, at least with the Samsung drives they do give you the choice to electrically erase the blocks, but what does that mean? Are they just erasing the block pointers while the block bits and bytes are still intact, much like deleting a file in a directory just deletes the pointer to where the file starts? How do you even verify that? There are too many unknowns, hence why I'd always follow up a secure erase with a PRNG stream. Hopefully the over-provisioned memory gets wiped by the secure erase, and the PRNG stream definitely wipes the blocks we can access.

@fthobe @Knogle @Firminator @mdcato just wondered whether you had any thoughts on this. Am I being paranoid or is too much faith being put in AES 256 because it's apparently unbreakable while the attacks on the key and firmware could mean the data can be recovered.

Knogle commented 2 weeks ago

To be honest, I don't trust these tools either. I once had an OCZ drive and used the secure erase option, but it had buggy firmware and the data remained unchanged after the process, even after several attempts. My approach would be similar to yours: running secure erase and a PRNG stream, or vice versa.

mdcato commented 2 weeks ago

There’s the phrase “never say never”, which I apply to “While AES 256 is believed to be unbreakable…”. The following is a bit of a diversion, but see “NIST finalizes trio of post-quantum encryption standards” https://www.theregister.com/2024/08/14/nist_postquantum_standards/ And in my view, leaving the encrypted data does ask for another attack vector.

Also, the idea of implementing & maintaining a 2nd & 3rd entropy source is going a little far afield of nwipe’s focus. Yes, the entropy is important, so how about a sanity check of /dev/urandom that throws a fatal error on failure? Bringing too much code in from other sources requires frequent checking of those sources for their changes, as well as the ongoing changes/improvements needed for nwipe. It sounds too much like the problem of using unmaintained code/libraries whose bugs persist for decades.


fthobe commented 2 weeks ago

I feel like paranoid is the way to go on this kind of activity.

Regarding the reliability of AES 256: it's considered a standard with limited survivability as of today (NIST estimates ~2050). We will see plenty of disks reaching twenty years of age, as spinning rust for large data sets seems here to stay and SATA / SAS has reached peak development (I assume the physical interfaces will remain unchanged for twenty years to come), so we can expect plenty of disks to have a second and third life (either in enterprise or consumer space) given the increasingly circular economy. I would not put faith in that standard.

Somebody reading it might think just using the manufacturer's tool is adequate, and maybe it is, depending on what risk is acceptable. However, we know that's not the case from past experience with buggy drive firmware on SATA drives, which is why we suggest a multifaceted approach of secure erase, PRNG wipe, then secure erase.

More warning and disclaimer regarding issues with reliability, ok.

Also, is there any verification process with these methods, or do we just trust that the sanitise command did what it was supposed to do? At least with a PRNG stream and verification we know that, at the very minimum, x GBytes of data has well and truly been wiped, as the drive has to reproduce the stream precisely for verification.

While nvme-cli is open source, all the other clients for SAS / SATA are not; there is literally no way to check how the destruction is verified. I can outline that as well.

@Knogle

To be honest, I don't trust these tools either. I once had an OCZ drive and used the secure erase option, but it had buggy firmware and the data remained unchanged after the process, even after several attempts.

The issue is that all secure erase and sanitize commands are optional: manufacturers do not have to implement them to produce a SAS / SATA / NVMe compliant drive. I have the impression that it's frequently a best-effort feature and that maintenance of drive firmware is spotty.

fthobe commented 2 weeks ago

@Knogle @PartialVolume updated my fork, please take a look.

Firminator commented 1 week ago

@fthobe @Knogle @Firminator @mdcato just wondered whether you had any thoughts on this. Am I being paranoid or is too much faith being put in AES 256 because it's apparently unbreakable while the attacks on the key and firmware could mean the data can be recovered.

I agree with Mdcato's statement wholeheartedly:

Also, the idea of implementing & maintaining a 2nd & 3rd entropy source is going a little far afield of nwipe’s focus. ...Bringing too much code in from other sources requires frequent checking of those sources for their changes, as well as the ongoing changes/improvements needed for nwipe. It sounds too much like the problem of using unmaintained code/libraries whose bugs persist for decades.

Might be a bit offtopic since I can't find the threads anymore, but for random wiping we should probably use the recommended /dev/urandom generated by the kernel instead of userspace openssl ( reference: https://sockpuppet.org/blog/2014/02/25/safely-generate-random-numbers/ ).

I haven't followed the recent crypto development in nwipe much, although I'm interested; I've lost track of what has been done so far and how. If OpenSSL is used by nwipe (again, I'm not sure if it is or not), then the LTS version 3.0.x would be the candidate I'd choose over OpenSSL 3.1 or 3.2, to decrease regression risk and the burden of maintaining and updating the OpenSSL libraries all the time. The LTS version usually only gets updated to fix CVEs; no new features are introduced. Specifically, the OpenSSL LTS version is supported until the 3rd quarter of 2026.

Now, I don't know what version is included in ShredOS, but that could be controlled by when ShredOS is built with buildroot. If nwipe is used on the user's choice of OS, though, we have to rely on whatever version of OpenSSL it has installed. I've seen numerous EOL OpenSSL v1.x.x versions living on a never-updated host OS. It would be detrimental to rely on user-installed 3rd-party components like OpenSSL.

PartialVolume commented 1 week ago

@Firminator good to hear from you again.

I have much the same concerns as you regarding introducing another library (OpenSSL) into the code, and trying to think of and mitigate against the failures that could occur, hence why those changes are still at the PR stage. I'm not totally against it, just wary about creating a rod for my own back. The Xoroshiro-256 and Fibonacci PRNGs have no external libraries; all the code is contained within nwipe, so the current master does not include OpenSSL. My preference was to have AES 256 code built into nwipe, which I discussed with @Knogle, but the work required was apparently too difficult or too time consuming.

That was a very interesting article about using /dev/urandom and avoiding user-space libraries for generating seeds. I'd like to hear how @Knogle feels about that article and about preferring /dev/urandom over OpenSSL.

https://sockpuppet.org/blog/2014/02/25/safely-generate-random-numbers/

And this, about the Debian/Openssl fiasco -> https://research.swtch.com/openssl

Knogle commented 1 week ago

I hope you're doing well @PartialVolume! :)

Regarding an AES CTR implementation, it's incredibly challenging to get right. The AES implementation from OpenSSL, however, is highly reliable, extensively tested, and optimized for performance. It's widely used, including for LUKS encryption, OpenVPN etc., and supports a range of platforms such as ARM and x86, so I don't foresee any issues there. We have also created a fail-safe approach by testing the first generated AES data for its quality.

OpenSSL also offers additional functionality, like improving how seeds are handled by integrating more secure methods, such as using an RSA key exchange algorithm. This ensures that the seeds are securely stored in memory, preventing the "decryption" of random data. Any potential issues with OpenSSL won't impact the AES-based PRNG, as the randomness comes from the truly "random" source /dev/urandom; OpenSSL simply uses AES to generate a stream based on that seed. Regarding the OpenSSL version, I'd also go for the LTS release.

As for the seeds, I agree that /dev/urandom is generally considered a crypto-secure PRNG, so we can stick with it. Changing how seeds are generated would require significant changes to the structure and code of nwipe, which could introduce new bugs or issues.

@PartialVolume Even if /dev/urandom is our only entropy source, what do you think about a verification process for sufficient randomness, like the one in https://github.com/martijnvanbrummelen/nwipe/pull/594 ? I just wonder if there is a proper way to handle an error, instead of just bailing out.

PartialVolume commented 1 week ago

@Knogle I'm good, thanks, and thanks for the comments.

Yes, I think the entropy verification is a good idea. How are you currently exiting on entropy failure? To have nwipe exit gracefully, closing all the threads, writing the logs etc., put this line in the file, outside the function:

```c
extern int terminate_signal;
```

and then

```c
terminate_signal = 1;
```

will trigger a graceful shutdown.

PartialVolume commented 1 week ago

It was interesting reading about that race condition with /dev/urandom mentioned in https://sockpuppet.org/blog/2014/02/25/safely-generate-random-numbers/. It reminded me of the problem you had with Fedora? Just wondering if there is a way of determining whether /dev/urandom has enough entropy before it is used. From the link above ..

Quote: "This is indeed a problem with urandom (and not /dev/random) on Linux. It’s also a bug in the Linux kernel. But it’s also easily fixed in userland: at boot, seed urandom explicitly. Most Linux distributions have done this for a long time. But don’t switch to a different CSPRNG."

How do you go about seeding urandom, if you're unsure whether the distro does it?

PartialVolume commented 1 week ago

Bearing in mind, the document https://factorable.net/weakkeys12.extended.pdf is from 2012. So maybe not an issue anymore.

Knogle commented 1 week ago

Thanks a lot! I have made those changes in PR 594. First the entropy of urandom is checked; later on, the same function is also used to check the randomness of the AES number generation. The bug I encountered was kernel related. There is also an issue that's quite common: after suspending and resuming, the entropy of urandom may be rubbish. So it's crucial to check the entropy regardless of the PRNG in use. If we go further, it would be possible to implement this approach for all PRNGs globally in method.c instead.

PartialVolume commented 1 week ago

The bug I encountered was kernel related. There is also an issue that's quite common: after suspending and resuming, the entropy of urandom may be rubbish. So it's crucial to check the entropy regardless of the PRNG in use.

That's useful to know, as suspending and resuming is something nwipe will be doing to unfreeze drives (if frozen) so a secure erase can be initiated. So in such a case, I read that you need to write to /dev/urandom to get it to work properly?

If we go further it would be possible to implement this approach for all PRNGs globally in method.c instead.

Yes, good idea.

fthobe commented 1 week ago

Pfff... gone 5 days and you people turn my ticket into a PRNG discussion :D

Has anybody had the time to check these two edits so that I can make a pull request?

  1. Updated readme.md
  2. Added guide for SSD

PartialVolume commented 1 week ago

@fthobe changes look good, go ahead and issue a PR. Thanks.

fthobe commented 1 week ago

@fthobe changes look good, go ahead and issue a PR. Thanks.

Done