Implement High-Quality Random Number Generation Using AES-CTR Mode with OpenSSL and AES-NI Support

Knogle commented 5 months ago

In this pull request, I present my implementation of a pseudo-random number generator (PRNG) based on AES-CTR (Advanced Encryption Standard in Counter mode) with a 128-bit key. This implementation is designed to produce high-quality random numbers, which are essential for a wide range of cryptographic applications. By integrating with the OpenSSL library and using AES-NI (Advanced Encryption Standard New Instructions) hardware acceleration when available, I aim for both security and efficiency in the random number generation process. It is the highest-quality PRNG available in nwipe so far, and it is a CSPRNG (cryptographically secure PRNG).

Key Features:

AES-CTR Mode: I chose AES in Counter mode due to its renowned capability to generate secure and unpredictable pseudo-random sequences. This mode operates by encrypting incrementing counter values, with the encryption output serving as the stream of random bytes.

128-bit AES: Utilizing a 128-bit key size for AES encryption provides a strong security measure while maintaining efficient performance, adhering to current cryptographic standards for pseudo-random number generation.

Integration with OpenSSL: OpenSSL, being a well-established and rigorously tested cryptographic library, is used to manage AES operations. This integration ensures a high level of security and performance for the AES-CTR operations within our PRNG.

Leveraging AES-NI Support: My implementation automatically detects and utilizes AES-NI, a set of instructions that enhance AES operations on most modern processors. This feature significantly improves the speed of random number generation, reducing CPU usage and enhancing scalability.

Implementation Details:

Initialization: At the outset, the PRNG's state is initialized with a distinct 128-bit key and an initial counter value, using OpenSSL's AES_set_encrypt_key to prepare the AES key structure for subsequent operations.

Generating Random Numbers: For generating random numbers, the current counter value is encrypted under the configured AES key in CTR mode. The output of this encryption serves as the source of pseudo-random bytes, with the counter incremented after each operation to maintain the uniqueness of subsequent inputs.
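
As a small illustration of the counter handling described above, the sketch below increments a 128-bit counter block in big-endian order, which is how CTR mode conventionally advances from one block to the next. The function name `ctr128_increment` is hypothetical; OpenSSL's CTR routines advance the counter themselves, so this is only meant to show the idea, not code from the PR.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: treat the 16-byte block as a big-endian 128-bit
 * integer and add one, carrying from the last byte towards the first. */
void ctr128_increment(uint8_t counter[16])
{
    assert(counter != NULL);
    for (size_t i = 16; i-- > 0;) {
        counter[i]++;
        if (counter[i] != 0) {
            break; /* no carry out of this byte, so we are done */
        }
    }
}
```

Encrypting each successive counter value under the same key yields the successive 16-byte blocks of the pseudo-random stream.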

State Management: The PRNG's internal state, including the AES key, counter (IV), and encryption buffer (ecount), is securely managed within an aes_ctr_state_t structure. This careful management is crucial for preserving the integrity and unpredictability of the random number stream.

Optimizing for Hardware: By optimizing for AES-NI, my implementation ensures enhanced performance through hardware acceleration, providing an efficient solution for generating random numbers across various applications.

This PRNG implementation stands as a robust and efficient tool for generating high-quality pseudo-random numbers, crucial for cryptographic operations, secure communications, and randomized algorithms. The combination of AES-CTR mode, OpenSSL's reliability, and the performance benefits of AES-NI hardware acceleration results in a superior random number generator.

I have ensured that the implementation is well-documented with clear comments, making it accessible for review, understanding, and maintenance, following best practices in both software development and cryptographic standards.

I look forward to receiving feedback on this pull request to further improve and ensure the effectiveness of the PRNG implementation.

Test of randomness:

Mean frequency per byte value: Approximately 100,289
Variance: Approximately 92,612
Standard deviation: Approximately 304
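
The three figures above are the kind of summary one gets from a byte histogram of the generator's output. A minimal sketch of that calculation follows; the function name and the choice of the population variance are assumptions, since the thread does not say how the numbers were produced.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: count how often each byte value occurs in buf, then
 * report the mean count per value and the population variance of the counts.
 * The standard deviation is the square root of the variance. */
void byte_frequency_stats(const uint8_t *buf, size_t len,
                          double *mean, double *variance)
{
    assert(mean != NULL && variance != NULL);
    assert(buf != NULL || len == 0);

    uint64_t counts[256] = {0};
    for (size_t i = 0; i < len; i++) {
        counts[buf[i]]++;
    }

    *mean = (double)len / 256.0;

    double sum_sq = 0.0;
    for (int v = 0; v < 256; v++) {
        double d = (double)counts[v] - *mean;
        sum_sq += d * d;
    }
    *variance = sum_sq / 256.0;
}
```

For uniformly random bytes each count is approximately binomial, so the variance is expected to be close to the mean, which is roughly what the figures show. A variance of about 92,612 also corresponds to a standard deviation of about 304, so the reported values are consistent with each other.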

NIST Test Suite:

A total of 188 tests (some of the 15 tests actually consist of multiple sub-tests)
were conducted to evaluate the randomness of 32 bitstreams of 1048576 bits from:

    /dev/loop0

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

The numerous empirical results of these tests were then interpreted with
an examination of the proportion of sequences that pass a statistical test
(proportion analysis) and the distribution of p-values to check for uniformity
(uniformity analysis). The results were the following:

    188/188 tests passed successfully both the analyses.
    0/188 tests did not pass successfully both the analyses.

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Here are the results of the single tests:

 - The "Frequency" test passed both the analyses.

 - The "Block Frequency" test passed both the analyses.

 - The "Cumulative Sums" (forward) test passed both the analyses.
   The "Cumulative Sums" (backward) test passed both the analyses.

 - The "Runs" test passed both the analyses.

 - The "Longest Run of Ones" test passed both the analyses.

 - The "Binary Matrix Rank" test passed both the analyses.

 - The "Discrete Fourier Transform" test passed both the analyses.

 - 148/148 of the "Non-overlapping Template Matching" tests passed both the analyses.

 - The "Overlapping Template Matching" test passed both the analyses.

 - The "Maurer's Universal Statistical" test passed both the analyses.

 - The "Approximate Entropy" test passed both the analyses.

 - 8/8 of the "Random Excursions" tests passed both the analyses.

 - 18/18 of the "Random Excursions Variant" tests passed both the analyses.

 - The "Serial" (first) test passed both the analyses.
   The "Serial" (second) test passed both the analyses.

 - The "Linear Complexity" test passed both the analyses.

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

The missing tests (if any) were whether disabled manually by the user or disabled
at run time due to input size requirements not satisfied by this run.

SmallCrush Test:

========= Summary results of SmallCrush =========

 Version:          TestU01 1.2.3
 Generator:        ufile_CreateReadBin
 Number of statistics:  15
 Total CPU time:   00:00:06.46

 All tests were passed

Screenshot from 2024-03-24 03-35-36

PartialVolume commented 4 months ago

Update: I've updated your two feature branches (PRs) but the following commands might be useful if I make more commits to the upstream master.

Hi, can you update your master, and also branches aes-ctr and aes-ctr-cpp-smartptr, thanks. It will help me with some of the valgrind debugging if your pull requests are up to date with the base branch.

Probably something like the following should work. We update your local master branch first, then merge the updated master branch into the two feature branches aes-ctr and aes-ctr-cpp-smartptr, and push them both to the branches on GitHub. My apologies if you already know these commands, or if I've forgotten anything, but hopefully it should work.

Switch to your master branch

git checkout master

Update your master branch

git pull upstream master

Push your now updated local master branch to your GitHub master

git push origin master

Switch to your aes-ctr branch

git checkout aes-ctr

Merge your now updated master branch into your aes-ctr branch

git merge master

Push your local aes-ctr branch to your aes-ctr branch on Github.

git push origin aes-ctr

Now we switch to the aes-ctr-cpp-smartptr branch

git checkout aes-ctr-cpp-smartptr

Merge your now updated master branch into your aes-ctr-cpp-smartptr branch

git merge master

Push your local aes-ctr-cpp-smartptr branch to your aes-ctr-cpp-smartptr branch on Github.

git push origin aes-ctr-cpp-smartptr

Hopefully everything should now be up to date with the master.

PartialVolume commented 4 months ago

Actually I updated the feature branches, so you may just want to update your master branch. But the commands may be useful just in case I add some more commits to the master branch.

Knogle commented 4 months ago

Alright :) So this branch is up to date now?

PartialVolume commented 4 months ago

Alright :) So this branch is up to date now?

Yes, all good.

PartialVolume commented 4 months ago

I'll give you an update on progress probably tomorrow night.

PartialVolume commented 4 months ago

The problem seems to be that the stack gets corrupted by aes_ctr_prng.cpp:126, according to valgrind. This seems to happen on thread 4 or 5 of a 16-drive wipe (16 threads). This then seems to cause the calloc error, and triggers the write of the uninitialised buffer and the conditional jump on uninitialised values. I'm assuming that this causes the PRNG numbers to be incorrect in some way, although I can't prove that. So it looks like something in aes_ctr_prng_genrand_uint128_to_buf() is messing up the stack intermittently. I think this same problem exists in the C code, so the C++ and smartptr changes aren't fixing this particular issue. Just for reference, these results were from the smartptr branch.

Regarding the C++ code, I need to discuss with the other guys whether we want to introduce c++ into nwipe, so far being only C. I know the smartptr solves an issue with cleanup but it may be that with the C code we do the cleanup at the end of the random pass function and also at the end of the random verification function. I'll get back to you on that.

Could EVP_EncryptUpdate() be writing beyond the end of temp_buffer?

==1413416== Conditional jump or move depends on uninitialised value(s)
==1413416==    at 0x11BC60: nwipe_random_pass (pass.c:330)
==1413416==    by 0x1206AF: nwipe_runmethod (method.c:934)
==1413416==    by 0x1218AC: nwipe_random (method.c:742)
==1413416==    by 0x52AF133: start_thread (pthread_create.c:442)
==1413416==    by 0x532EA3F: clone (clone.S:100)
==1413416==  Uninitialised value was created by a stack allocation
==1413416==    at 0x11ED60: aes_ctr_prng_genrand_uint128_to_buf (aes_ctr_prng.cpp:126)
==1413416== 
==1413416== Syscall param write(buf) points to uninitialised byte(s)
==1413416==    at 0x531E27F: __libc_write (write.c:26)
==1413416==    by 0x531E27F: write (write.c:24)
==1413416==    by 0x11BB44: nwipe_random_pass (pass.c:349)
==1413416==    by 0x1206AF: nwipe_runmethod (method.c:934)
==1413416==    by 0x1218AC: nwipe_random (method.c:742)
==1413416==    by 0x52AF133: start_thread (pthread_create.c:442)
==1413416==    by 0x532EA3F: clone (clone.S:100)
==1413416==  Address 0x6c52700 is 0 bytes inside a block of size 4,096 alloc'd
==1413416==    at 0x48455EF: calloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==1413416==    by 0x11BA9E: nwipe_random_pass (pass.c:268)
==1413416==    by 0x1206AF: nwipe_runmethod (method.c:934)
==1413416==    by 0x1218AC: nwipe_random (method.c:742)
==1413416==    by 0x52AF133: start_thread (pthread_create.c:442)
==1413416==    by 0x532EA3F: clone (clone.S:100)
==1413416==  Uninitialised value was created by a stack allocation
==1413416==    at 0x11ED60: aes_ctr_prng_genrand_uint128_to_buf (aes_ctr_prng.cpp:126)

Knogle commented 4 months ago

Ahoy! So as I understood, the issues with verify still exist? Could you do a run on loop devices, just for comparison?

If the issue is the same, I think we can try and proceed with the C-only branch again and continue troubleshooting there.

EDIT: Thank you for sharing these Valgrind results; they seem genuinely useful. I'm starting to think that the issue might not directly stem from the PRNG itself, but rather from how nwipe manages memory allocations, function calls, etc., specifically for this PRNG. Could it also involve nwipe's implementation of pthreads? Although the C code appears correct when focusing solely on the implementation, this perspective might not encompass the entire picture. Additionally, the fact that I cannot replicate the issue might also aid in pinpointing the problem, suggesting the possibility of architecture-specific factors. Hence, it could indeed be related to pthreads. Do you think providing a binary could help, especially if it's related to the version of the libraries involved?

PartialVolume commented 4 months ago

So as I understood, the issues with verify still exist?

Yes, the verify still fails. Out of 16 drives, 2 or 3 fail verification, same result from both branches.

I wouldn't have thought it was pthreads. When this occurs it only occurs in one thread, and each thread has its own stack, so most of the wipes work OK as they are using their own separate stack.

It sort of feels like somewhere the wrong size of a variable is being passed between functions because a function declaration is wrong, but then you would expect the compiler to pick that up.

It may be worth using the following to switch on more compiler warnings, to see if that throws any light on it.

./configure --prefix=/usr CFLAGS='-O0 -g -Wall -Wextra -fstack-protector-strong -Wformat -Werror=format-security'

PartialVolume commented 4 months ago

I have now run that particular configure, and although it does show plenty of warnings, it doesn't show anything related to aes-ctr, and the code runs as before, i.e. no stack smashing errors detected at run time. So probably no help.

Knogle commented 4 months ago

Thanks a lot for your testing! Could you test on the same machine, maybe a run with 16 loop devices? This one regarding the buffer of the aes-ctr function should be fixed now, had a buffer overflow in my code.

==1413416==  Uninitialised value was created by a stack allocation
==1413416==    at 0x11ED60: aes_ctr_prng_genrand_uint128_to_buf (aes_ctr_prng.cpp:126)

PartialVolume commented 4 months ago

Thanks a lot for your testing! Could you test on the same machine, maybe a run with 16 loop devices? This one regarding the buffer of the aes-ctr function should be fixed now, had a buffer overflow in my code.
==1413416==  Uninitialised value was created by a stack allocation
==1413416==    at 0x11ED60: aes_ctr_prng_genrand_uint128_to_buf (aes_ctr_prng.cpp:126)

Yes, will do. I'll do the loop drives this evening along with your latest aes-ctr branch.

Knogle commented 4 months ago

aes-ctr branch is still unchanged, please test with aes-ctr-cpp-smartptr. Thanks a lot, and happy testing heh!

EDIT:

Worth noting: one difference I have encountered between running AES-CTR and the other PRNGs is that the 'Conditional jump or move depends on uninitialised value(s)' errors don't appear when using Xoroshiro256.

==8774== Memcheck, a memory error detector
==8774== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
==8774== Using Valgrind-3.22.0 and LibVEX; rerun with -h for copyright info
==8774== Command: src/nwipe /dev/loop0
==8774== Parent PID: 8773
==8774== 
==8774== Thread 4:
==8774== Conditional jump or move depends on uninitialised value(s)
==8774==    at 0x4118F1: nwipe_random_pass (pass.c:330)
==8774==    by 0x41604C: nwipe_runmethod (method.c:934)
==8774==    by 0x417179: nwipe_random (method.c:742)
==8774==    by 0x52A1896: start_thread (in /usr/lib64/libc.so.6)
==8774==    by 0x53288C3: clone (in /usr/lib64/libc.so.6)
==8774== 
==8774== Syscall param write(buf) points to uninitialised byte(s)
==8774==    at 0x531BF1D: write (in /usr/lib64/libc.so.6)
==8774==    by 0x4117D0: nwipe_random_pass (pass.c:349)
==8774==    by 0x41604C: nwipe_runmethod (method.c:934)
==8774==    by 0x417179: nwipe_random (method.c:742)
==8774==    by 0x52A1896: start_thread (in /usr/lib64/libc.so.6)
==8774==    by 0x53288C3: clone (in /usr/lib64/libc.so.6)
==8774==  Address 0x56c73e0 is 0 bytes inside a block of size 4,096 alloc'd
==8774==    at 0x4849E60: calloc (vg_replace_malloc.c:1595)
==8774==    by 0x41172E: nwipe_random_pass (pass.c:268)
==8774==    by 0x41604C: nwipe_runmethod (method.c:934)
==8774==    by 0x417179: nwipe_random (method.c:742)
==8774==    by 0x52A1896: start_thread (in /usr/lib64/libc.so.6)
==8774==    by 0x53288C3: clone (in /usr/lib64/libc.so.6)
==8774== 
==8774== Conditional jump or move depends on uninitialised value(s)
==8774==    at 0x484E98E: bcmp (vg_replace_strmem.c:1229)
==8774==    by 0x4114BB: nwipe_random_verify (pass.c:198)
==8774==    by 0x416206: nwipe_runmethod (method.c:961)
==8774==    by 0x417179: nwipe_random (method.c:742)
==8774==    by 0x52A1896: start_thread (in /usr/lib64/libc.so.6)
==8774==    by 0x53288C3: clone (in /usr/lib64/libc.so.6)
==8774== 
==8774== 
==8774== HEAP SUMMARY:
==8774==     in use at exit: 285,696 bytes in 584 blocks
==8774==   total heap usage: 8,768 allocs, 8,184 frees, 2,597,446 bytes allocated
==8774== 
==8774== LEAK SUMMARY:
==8774==    definitely lost: 31 bytes in 3 blocks
==8774==    indirectly lost: 8,475 bytes in 20 blocks
==8774==      possibly lost: 913 bytes in 12 blocks
==8774==    still reachable: 276,277 bytes in 549 blocks
==8774==         suppressed: 0 bytes in 0 blocks
==8774== Rerun with --leak-check=full to see details of leaked memory
==8774== 
==8774== Use --track-origins=yes to see where uninitialised values come from
==8774== For lists of detected and suppressed errors, rerun with: -s
==8774== ERROR SUMMARY: 65664005 errors from 3 contexts (suppressed: 0 from 0)

Knogle commented 4 months ago

I think i have promising updates now in the aes-ctr-cpp-smartptr branch :)

Knogle commented 4 months ago

I've added the same fix for the C-only aes-ctr branch. Also leads to a clean valgrind now.

Same uninitialized array here.

If you do some testing, please test the aes-ctr-cpp-smartptr branch first if possible.

unsigned char temp_buffer[32];  // Intermediate buffer for 256-bit pseudorandom output.
memset(temp_buffer, 0, sizeof(temp_buffer));

==25143== Memcheck, a memory error detector
==25143== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
==25143== Using Valgrind-3.22.0 and LibVEX; rerun with -h for copyright info
==25143== Command: src/nwipe /dev/loop0 /dev/loop1 /dev/loop2 /dev/loop3 /dev/loop4 /dev/loop5 /dev/loop6 /dev/loop7 /dev/loop8 /dev/loop9 /dev/loop10 /dev/loop11 /dev/loop12 /dev/loop13 /dev/loop14 /dev/loop15
==25143== Parent PID: 25142
==25143== 
==25143== 
==25143== HEAP SUMMARY:
==25143==     in use at exit: 529,689 bytes in 1,082 blocks
==25143==   total heap usage: 28,073 allocs, 26,991 frees, 26,337,598 bytes allocated
==25143== 
==25143== LEAK SUMMARY:
==25143==    definitely lost: 26,992 bytes in 162 blocks
==25143==    indirectly lost: 186,627 bytes in 224 blocks
==25143==      possibly lost: 913 bytes in 12 blocks
==25143==    still reachable: 315,157 bytes in 684 blocks
==25143==         suppressed: 0 bytes in 0 blocks
==25143== Rerun with --leak-check=full to see details of leaked memory
==25143== 
==25143== For lists of detected and suppressed errors, rerun with: -s
==25143== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

PartialVolume commented 4 months ago

I'm currently testing the C-only branch. I may be jumping the gun, but it's looking like initialising the 'data in' used by EVP_EncryptUpdate, i.e. the temp_buffer, with memset may have fixed not only the valgrind error but also the issue with drives intermittently failing verification. I'll know more when it completes tomorrow.
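
One plausible reading of why the memset matters: in CTR mode the cipher output is the input XORed with the keystream, so whatever happens to be in `temp_buffer` ends up mixed into the "random" bytes. If that input differs between the write pass and the verify pass, the two streams differ even with the same key and counter. The sketch below shows this with a supplied keystream instead of real AES; `xor_with_keystream` and `passes_match` are hypothetical names, and the first is only a stand-in for what `EVP_EncryptUpdate` does in CTR mode.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for CTR-mode encryption: out = in XOR keystream.
 * Real code would obtain the keystream from AES; here it is supplied. */
void xor_with_keystream(const unsigned char *in, const unsigned char *keystream,
                        unsigned char *out, size_t len)
{
    assert(in != NULL && keystream != NULL && out != NULL);
    for (size_t i = 0; i < len; i++) {
        out[i] = in[i] ^ keystream[i];
    }
}

/* Returns 1 if two passes over the same keystream produce identical output.
 * The output depends on the keystream alone only when the input is a fixed,
 * known value such as all zeros. */
int passes_match(const unsigned char *in_write, const unsigned char *in_verify,
                 const unsigned char *keystream, size_t len)
{
    unsigned char out_write[32];
    unsigned char out_verify[32];
    if (len > sizeof(out_write)) {
        return 0; /* this sketch only handles up to 32 bytes */
    }
    xor_with_keystream(in_write, keystream, out_write, len);
    xor_with_keystream(in_verify, keystream, out_verify, len);
    return memcmp(out_write, out_verify, len) == 0;
}
```

With a zeroed input the output equals the keystream on every pass; with stack garbage as input, a single differing byte is enough to make verification fail.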

PartialVolume commented 4 months ago

The aes-ctr branch passed with no verification errors on the 16 drive wipe. 🚀 Nice work.

We now need to deal with the cleanup function in C, I can help with that if you want some help with where that should go.

And finally, before I merge I want to conduct further tests related to error handling. Because we are using an external library, I want to make sure that ShredOS aborts the wipe with an appropriate error message if an external library function returns a failure, i.e. not just a log entry that a function returned an error while the wipe carries on regardless. Because the PRNG is critical to the wipe, I want to make sure nwipe correctly detects and deals with errors that might be produced by OpenSSL.
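
The behaviour asked for here could look roughly like the sketch below: the PRNG call reports failure through its return value, and the pass stops immediately instead of logging and carrying on. All names (`prng_fill_fn`, `random_pass_sketch`, `countdown_fill`) are hypothetical and nothing here is taken from nwipe's source; in the real code the failing call would be an OpenSSL function such as `EVP_EncryptUpdate`, which returns 1 on success.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical generator signature: fills buf, returns 0 on success and
 * non-zero if the underlying library call failed. */
typedef int (*prng_fill_fn)(void *state, unsigned char *buf, size_t len);

/* Sketch of a wipe pass over `blocks` blocks. Returns 0 when every block was
 * generated, or -1 as soon as the generator fails; *done reports how many
 * blocks completed before the pass stopped. */
int random_pass_sketch(prng_fill_fn fill, void *state, size_t blocks, size_t *done)
{
    assert(fill != NULL && done != NULL);
    unsigned char buf[4096];
    *done = 0;
    for (size_t i = 0; i < blocks; i++) {
        if (fill(state, buf, sizeof(buf)) != 0) {
            fprintf(stderr, "PRNG failure at block %zu, aborting pass\n", i);
            return -1; /* abort: never write a block the PRNG did not fill */
        }
        /* A real pass would write buf to the device here. */
        (*done)++;
    }
    return 0;
}

/* Example generator for demonstration: succeeds until the countdown in
 * *state reaches zero, then reports failure. */
int countdown_fill(void *state, unsigned char *buf, size_t len)
{
    size_t *remaining = state;
    if (*remaining == 0) {
        return -1;
    }
    (*remaining)--;
    for (size_t i = 0; i < len; i++) {
        buf[i] = (unsigned char)i; /* placeholder data, not random */
    }
    return 0;
}
```

The caller would then treat a -1 from the pass as fatal for that drive and surface the error to the user, rather than continuing with a buffer of unknown content.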

Knogle commented 4 months ago

Fantastic! This development is quite promising. I believe we no longer require the CPP branch. :) Interestingly, it turns out that the missing initialization was the root of the issue. This problem proved to be quite challenging to diagnose.

Moving forward, I will make some additional minor modifications to the code and commit them. It would be immensely helpful if you could identify an appropriate section in the code for implementing the cleanup procedure.

Furthermore, I plan to adjust the default PRNG (Pseudo Random Number Generator) settings. The system will now check for AES-Ni support in the CPU; lacking that, it will default to XORoshiro256.

PartialVolume commented 4 months ago

Furthermore, I plan to adjust the default PRNG (Pseudo Random Number Generator) settings. The system will now check for AES-Ni support in the CPU; lacking that, it will default to XORoshiro256.

Can you hold back that change for the time being? I would like to hear back from interested parties as to how they feel about making that the default PRNG. Despite its effectiveness as a random number generator, we haven't performed relative speed tests on a range of different hardware yet.

It may be that speed is more important in this application. It will be interesting to see how others feel about this. @martijnvanbrummelen @Firminator @ggruber @mdcato et al.

Knogle commented 4 months ago

Sure i will hold it back :) I think AES-CTR should perform far better on systems with AES-Ni available than XORoshiro256, together with the best PRNG quality. I get around 4000MB/s on my Ryzen system. I'm curious about your opinions :)

Do you like the current approach more, or do you prefer the error handling in aes-ctr-debug branch? https://github.com/Knogle/nwipe/blob/aes-ctr-debug/src/aes/aes_ctr_prng.c

EDIT:

Regarding lagged fibonacci, i've set up a Pentium 2 and Pentium 4 test system in order to perform tests on 32-bit systems. Is ShredOS intended to be used on those platforms as well?

EDIT2:

For speed testing i could offer the following platforms, also with SSH access for you:

Intel Xeon X5650 6-Core (64-Bit + AES-Ni), MSI X58 Pro-E motherboard, 3x 4GB 1600MHz memory
Intel Pentium D 2-Core (64-Bit) on ASUS P5Q with 2x 2GB 800MHz memory
Intel Pentium 4 (32-Bit) on Tyan Tomcat XYZ motherboard with 4GB of memory

The Pentium II for sure won't be able to handle a lot of stuff, so i'd like to boot into ShredOS directly with it, without offering SSH.

mdcato commented 4 months ago

My priorities are:

1. Data wiping effectiveness
2. Speed

The factors I see are that processors supporting AES-NI are “recedingly” recent – mid-2010s, and the computers relegated to nwiping may be ones that were in the corner unused because they were older. That said, the non-AES-NI processors should be fewer and fewer now, so finding a “less-old” unused system should be easier. I’ve used a lowly Celeron J3160 released Q1’16 in a firewall appliance that supports AES-NI for VPN encryption; not all CPU models got the AES-NI instructions so users may have to check, or nwipe could warn that XORoshiro256 is going to be used.

I tend to load up the nwipe system I use with several drives, knowing that it will take days for the larger drives to complete – thus speed would be nice, but it simply takes time to wipe 4TB, 10TB & larger drives. This also leads to another question of whether to abort a drive’s thread on first error. My desire is that nwipe keep trying until the end of the drive, even if it means switching from blocks to sectors, and then add options so that the user can have it exit on the first error (which would be good for wiping with shredOS on a laptop’s single drive that would then force the decision to physically destroy the drive). This lets us see how bad a particular drive really is; was it just a single bad place or multiple ones from years of moving the system with the drive spinning & heads engaged. (BTW, drive prices are reportedly going up due to outfitting AI data centers, so there may be slightly increased pressure to re-use documented good used drives.)

Knogle commented 4 months ago

Great point. I think extensive testing would be important for this case.

In case of my old Intel Xeon X5650, 1st gen with AES-Ni equipped from 2009, the performance is: AES-CTR-256 > XORoshiro-256. I'm curious about other architectures.

Implementing a check for AES-Ni support is rather simple, and the PRNG in use can then be set accordingly.
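
A minimal sketch of such a check, assuming an x86 target and a GCC- or Clang-compatible compiler that ships `<cpuid.h>`: CPUID leaf 1 reports AES-NI in bit 25 of ECX. The enum and function names are hypothetical and are not taken from the PR.

```c
#include <assert.h>
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

/* Hypothetical identifiers for the two generators discussed in the thread. */
enum prng_choice { PRNG_XOROSHIRO256 = 0, PRNG_AES_CTR = 1 };

/* Returns 1 if the CPU advertises AES-NI, 0 otherwise (including on
 * non-x86 builds, where this sketch simply reports no support). */
int cpu_has_aes_ni(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax, ebx, ecx, edx;
    if (__get_cpuid(1, &eax, &ebx, &ecx, &edx) == 0) {
        return 0; /* CPUID leaf 1 not available */
    }
    return (int)((ecx >> 25) & 1u); /* CPUID.01H:ECX bit 25 = AESNI */
#else
    return 0;
#endif
}

/* Pick the default generator from the detection result. */
enum prng_choice select_default_prng(int has_aes_ni)
{
    assert(has_aes_ni == 0 || has_aes_ni == 1);
    return has_aes_ni ? PRNG_AES_CTR : PRNG_XOROSHIRO256;
}
```

Calling `select_default_prng(cpu_has_aes_ni())` once at startup would give the fallback behaviour described earlier in the thread, while still leaving room for the user to override the choice.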

PartialVolume commented 4 months ago

@knogle Can you go ahead and add the code for checking for AES-Ni support to the aes-ctr branch and also switching the default prng based on the result. It will then give me the chance to test the aes-ctr branch on some of my newer and older hardware. Thanks.

Knogle commented 4 months ago

@Knogle Can you go ahead and add the code for checking for AES-Ni support to the aes-ctr branch and also switching the default prng based on the result. It will then give me the chance to test the aes-ctr branch on some of my newer and older hardware. Thanks.

I've pushed the changes :)

Knogle commented 4 months ago

@PartialVolume

Can you maybe sum up, what's still necessary from your point of view, or what do you expect to be included into this PRNG until merge? Maybe i can implement further adjustments if appropriate. Regarding error handling, is there a way to terminate the program if a fatal error occurs? Cleanup doesn't seem to be necessary in this case because in this implementation, all vars and parameters for this PRNG are initialized with 0.

Knogle commented 4 months ago

I've included error handling now for OpenSSL library, as well as cleanup after nwipe_random_pass and nwipe_random_verify.

Knogle commented 4 months ago

Just a test on 1st AES-NI gen Westmere X5650 CPU, a nice result i think for a 15 years old CPU :) Screenshot from 2024-04-23 20-48-10

Knogle commented 3 months ago

I've created a fork of shredos and a testing release. I think it might be useful for testing the implementation.

https://github.com/Knogle/shredos.x86_64/releases/tag/aes-ctr

Knogle commented 3 months ago

Ahoy no issue :) Please check it out, did i get it done, what you have asked for?

PartialVolume commented 3 months ago

Ahoy no issue :) Please check it out, did i get it done, what you have asked for?

Yes, looks good. Did you merge the latest master code into your aes-ctr branch? As I want to download and test your branch with the updates in the master before I merge it.

Knogle commented 3 months ago

Hey, not yet, is there an easy way to do so?

PartialVolume commented 3 months ago

Hey, not yet, is there an easy way to do so?

Switch from your aes-ctr branch to your master, git checkout master. Then do a git pull upstream master then switch back to aes-ctr branch git checkout aes-ctr, then do git merge master.

Hopefully that should work, I don't do it often myself, but let me know if it works ok or not.

Knogle commented 3 months ago

Done! Thanks a lot :) I've always manually merged all changes lol, thanks for your advice.

PartialVolume commented 3 months ago

Done! Thanks a lot :) I've always manually merged all changes lol, thanks for your advice.

No problem, all looks good. I'll run some tests on that branch over the next week on various systems, and if there are no issues I'll merge it into the master in a week's time. Avoid changing anything in the branch in the meantime, otherwise I might have to start over with the tests. Thanks again for your work on this.

Knogle commented 2 months ago

I hope you are doing fine :) Any news on this? I've recently used this algorithm in order to wipe and sell some drives.

After this, I will work on a new wiping method. Say you have limited time to wipe a drive but want to destroy as much data as possible. I'd like to implement some sort of scattered PRNG method that writes random amounts of data, between a minimum and a maximum size, to random locations, with every block written only once. I have encountered this situation a few times: I have a hard drive I'd like to put into a friend's PC, but can't wait 4h to wipe it completely. This way I could wipe it for a few minutes and be sure that most of the data is already unusable.

Maybe you can give your opinion on this :)

PartialVolume commented 2 months ago

I hope you are doing fine :) Any news on this?

Hi, I'm crazy busy at the moment and probably will be so for the next month. I will get round to merging it though as soon as I can.

Re the random PRNG method, I'm not sure about this. Think of a scenario where a user chooses this method assuming (always a big mistake) that all methods completely wipe the drive, whereas the one you propose only wipes out the directory structure and other random blocks of the disc.

It's pretty easy to scan that disc using various recovery tools to find where the start of the files are and then reconstruct those files, they may not be complete files but they might contain important information. If it only takes a few minutes to wipe, there is still going to be 99.9% of that disc recoverable using tools like testdisk.

I would think most people that use nwipe/ShredOS want a 100% guarantee or at least beyond any reasonable doubt that nothing exists on a wiped disc.

Using any of the existing methods for 5 minutes would totally destroy the partition table, and then aborting would achieve much the same as you require. Of course if your friend is sneaky and knows that's what you did, he would just run testdisk to find all those files you never wiped. 😊

I would be interested in hearing what others think?

Knogle commented 2 months ago

Alright, I will reconsider this. :)

The "scattered PRNG" method should also ensure complete coverage of the entire disk drive upon 100% completion, similar to the default PRNG method. However, for SSDs, a different approach might be more effective because of their wear-leveling algorithms. For example, write 1GB at the beginning of the disk, then write 2GB at the end, then 500MB in the middle, and continue filling the drive in this scattered manner, rather than writing sequentially.

I am currently conducting statistical analysis to determine how to destroy as much data as possible in the least amount of time. Ideally, by around 50% completion, most of the data should be irrecoverable.

1. Write a random amount of data at the beginning of the disk (e.g., 10%):
|xxxxxx-----------------------------------------| (10%)

2. Write a random amount of data at the end of the disk (e.g., 15%):
|xxxxxx-----------------------------------xxxxxx| (25%)

3. Write a random amount of data in the middle of the disk (e.g., 20%):
|xxxxxx------------xxxxxx-----------------xxxxxx| (45%)

4. Write random amounts of data in random areas until the disk is about 75% full:
|xxxxxx---xxxx---xxxxxx---xxxxxx--xxxx----xxxxxx| (75%)

5. Continue the process until the disk is completely filled (100%):
|xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx| (100%)
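
One way to read the scheme above, as a minimal C sketch: split the disk into extents of random length, then visit them in shuffled order, so every block is written exactly once and the drive is fully covered at 100%. The names `extent_t` and `plan_scattered_extents` are hypothetical and not part of nwipe, and `rand()` only stands in for whichever PRNG would actually drive the method.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical type: one contiguous run of blocks to overwrite. */
typedef struct {
    uint64_t offset; /* first block of the extent */
    uint64_t length; /* number of blocks in the extent */
} extent_t;

/* Value in [lo, hi]; rand() is a placeholder, not a suitable PRNG. */
static uint64_t rand_between(uint64_t lo, uint64_t hi)
{
    return lo + (uint64_t)rand() % (hi - lo + 1);
}

/*
 * Cut [0, total_blocks) into back-to-back extents of random length in
 * [min_len, max_len] (the last one may be shorter), then shuffle their
 * order. Returns the number of extents; the caller frees *out.
 */
size_t plan_scattered_extents(uint64_t total_blocks, uint64_t min_len,
                              uint64_t max_len, extent_t **out)
{
    assert(min_len > 0 && min_len <= max_len);

    /* Every extent except the last is at least min_len blocks long. */
    size_t cap = (size_t)(total_blocks / min_len) + 1;
    extent_t *ext = malloc(cap * sizeof *ext);
    if (ext == NULL) {
        *out = NULL;
        return 0;
    }

    size_t n = 0;
    uint64_t pos = 0;
    while (pos < total_blocks) {
        uint64_t len = rand_between(min_len, max_len);
        if (len > total_blocks - pos)
            len = total_blocks - pos; /* clamp the final extent */
        ext[n].offset = pos;
        ext[n].length = len;
        n++;
        pos += len;
    }

    /* Fisher-Yates shuffle: the write order becomes scattered, while
     * the extents themselves still tile the disk without overlap. */
    for (size_t i = n; i > 1; i--) {
        size_t j = (size_t)rand() % i;
        extent_t tmp = ext[i - 1];
        ext[i - 1] = ext[j];
        ext[j] = tmp;
    }

    *out = ext;
    return n;
}
```

A caller would walk the returned array in order, write PRNG data to each extent, and free the array afterwards. The shuffled order gives no priority to the start or end of the disk, so the begin/end/middle ordering in the steps above would need an extra rule.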

mdcato commented 2 months ago

The idea of PRNG writing to random locations at first sounds “incomplete”, however, it might not be any different than stopping one of the other methods before getting to 100%. If the user wants speed, and it gets harder as the drive capacities get larger, this might be reasonable. I would not like an incomplete wipe, but writing to scattered locations rather than beginning-to-end or end-to-beginning gives a little more safety.

And I’ll reiterate, as has been discussed before, SSDs need PRNG data for the entire drive else they just have all block pointers point to a single data block with the constant data (all zeros, all ones), allowing someone to later change a sector within a block and read the un-altered data in the block’s other sectors.

On calling external tools like hdparm: it’s a judgement call. Nwipe is utilizing the output of tools that have a different domain than nwipe (i.e., how to get drive-specific information), and those tools keep up with changes in their domain. To bring the code “inside” means nwipe has to constantly chase the changes these tool authors make. (Of course, they might change their output, but that’s probably less frequent than fixing bugs and adding support for new features.) I would prefer to let the other tools do their research. I know there have been some boundary cases with customized firmware and low-volume production, but the external tools handle the wide view.

ggruber commented 1 month ago

Alright, I will reconsider this. :)

The "scattered PRNG" method should also ensure complete coverage of the entire disk drive upon 100% completion, similar to the default PRNG method. However, for SSDs, it might be more effective due to their wear-leveling algorithms to adopt a different approach. For example, write 1GB at the beginning of the disk, then write 2GB at the end, then 500MB in the middle, and continue filling the drive in this scattered manner, rather than writing sequentially.

I am currently conducting statistical analysis to determine how to destroy as much data as possible in the least amount of time. Ideally, by around 50% completion, most of the data should be irrecoverable.

[...]

I see the appeal of not having to wait for a complete wipe, but in the end the question for me is: why do I start a wipe? I want to prevent sensitive information from leaving with a used drive. And as mentioned before, if you do not wipe the disk completely, there are good tools to find the remains of files, so sensitive information could reappear. You could possibly make life harder for such tools by traversing the filesystem and wiping the beginning of every single file, but the coding effort to get this done seems fairly high to me. Any approach that wipes random blocks or areas does not give me a good feeling when giving the disk away.

Besides that, a full wipe of a magnetic disk gives an impression of the physical health of the drive.

For rapid wiping (especially of SSDs) I prefer implementing wiping by changing the encryption keys, as many firmwares are designed to support. Verifying success is important, but I'd think that in quite a few cases we could have faster wiping.

And I'd like to see this discussion in another issue, as it is off-topic here.

Just my opinion.

Knogle commented 1 month ago

I will open up a new PR/issue regarding this topic :)

PartialVolume commented 1 month ago

I will open up a new PR/issue regarding this topic :)

I would open a discussion about this feature rather than an issue. I'm not convinced it would work so well in practice. I would have thought the amount of time spent moving the read/write heads randomly all over the disc would drastically reduce throughput.

And the goal of this method is to speed up wiping by making it look like the disc is wiped, by wiping out the partition table amongst other things, while the whole disc isn't fully wiped. I just don't see how that fits in with nwipe's primary purpose: to erase the disc so that, beyond any reasonable doubt, nothing can be recovered.

I understand your use case, I'm just not sure it's appropriate. Even if I was giving a disc to a friend, I would still wipe it fully, as your friend might sell the computer/disc on sometime in the future, complete with the partially deleted contents of the disc.

Do any other wipe programs have a method like this?

mdcato commented 1 month ago

Agree. Full wipe needed.

-- Mike

Knogle commented 1 month ago

Ahoy, the minimum version is now set to OpenSSL 3.1, due to major improvements. AVX is now also supported if the system is capable. The performance increase can be up to 50%.

Knogle commented 2 weeks ago

Ahoy, just a short question. Is it possible to upgrade the OpenSSL version of the build environment or something?

PartialVolume commented 2 weeks ago

Ahoy, just a short question. Is it possible to upgrade the OpenSSL version of the build environment or something?

The build environment as opposed to the version of OpenSSL?

What would be the purpose of upgrading the build environment? Isn't that something the OpenSSL guys would do?

Knogle commented 2 weeks ago

Oh, I think there is a misunderstanding. I mean the OpenSSL version in our ci_ubuntu build process here, because the build has failed due to an older OpenSSL version.

PartialVolume commented 2 weeks ago

Sorry, yes I understand now. I don't think it's a case of upgrading the CI so it uses a newer version of openssl, as that would mean it would fail compilation for every user not using that version.

The fix should really be in the code or maybe at compilation time.

It really needs a #ifdef so that if the openssl version is < 3.1 then 'x' code is compiled else 'y' code is compiled. This then allows nwipe to compile on distributions that use all versions of openssl.
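
A compile-time guard of that kind could look roughly like the following. `OPENSSL_VERSION_NUMBER` comes from `<openssl/opensslv.h>`, and 3.1.0 corresponds to `0x30100000L`. The macro `NWIPE_AES_CTR_FAST` and the function name are made up for illustration, and the `__has_include` fallback is an assumption so the sketch still builds where the OpenSSL headers are missing.

```c
/* Pull in the version macros only if the OpenSSL headers are present. */
#if defined(__has_include)
#  if __has_include(<openssl/opensslv.h>)
#    include <openssl/opensslv.h>
#  endif
#endif

/* Assumption: with no headers available, treat OpenSSL as "too old". */
#ifndef OPENSSL_VERSION_NUMBER
#  define OPENSSL_VERSION_NUMBER 0L
#endif

/* 0x30100000L encodes OpenSSL 3.1.0 (major 3, minor 1, patch 0). */
#if OPENSSL_VERSION_NUMBER >= 0x30100000L
#  define NWIPE_AES_CTR_FAST 1 /* 'y' code: built against 3.1 or newer */
#else
#  define NWIPE_AES_CTR_FAST 0 /* 'x' code: older or missing headers */
#endif

/* Hypothetical helper reporting which branch was compiled in. */
int aes_ctr_fast_path_compiled(void)
{
    return NWIPE_AES_CTR_FAST;
}
```

Note that this only reflects the headers used at build time, not the library that is loaded when the program runs.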

However, my preference would be to not use a #ifdef but do a check at run time then run the appropriate code based on the version of openssl being used. I prefer this because I'm concerned that an upgrade to openssl would break the repository version of nwipe within a distribution. So doing the check at run time in the code would be my preference. We can then remove the minimum version requirement for openssl.

Either that or we check the openssl version at run time and if < 3.1 we disable the openssl PRNG. So it can't be used but nwipe still compiles and runs ok.
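
The run-time variant might be sketched as below. `OpenSSL_version_num()` (declared in `<openssl/crypto.h>` since OpenSSL 1.1.0) reports the version of the library actually loaded. The helper name `openssl_prng_usable` and the macro `MIN_OPENSSL_FOR_PRNG` are hypothetical, and the helper takes the version number as a parameter so the sketch has no link-time dependency on OpenSSL.

```c
/* Version number of OpenSSL 3.1.0: major in bits 28..31, minor in
 * bits 20..27, so a plain numeric comparison orders releases. */
#define MIN_OPENSSL_FOR_PRNG 0x30100000UL

/* Hypothetical helper: returns 1 if the OpenSSL-backed PRNG should be
 * offered to the user, 0 if it should be disabled. */
int openssl_prng_usable(unsigned long version_num)
{
    return version_num >= MIN_OPENSSL_FOR_PRNG;
}

/*
 * Intended call site (needs <openssl/crypto.h> and linking with -lcrypto):
 *
 *     if (!openssl_prng_usable(OpenSSL_version_num())) {
 *         // hide or reject the AES-CTR PRNG, keep the other PRNGs
 *     }
 */
```

With this shape nwipe would still compile and run against older OpenSSL releases; only the PRNG option would be unavailable there.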

Knogle commented 2 weeks ago

Yeah, I can fix that. I've added a #error if the OpenSSL version is below 3.1, but I will revert this change then. There is no change in the code itself: starting with OpenSSL v3.1 there are major optimizations in the OpenSSL library itself, but no change is necessary in our code. Ahh, now it compiled successfully.

Knogle commented 1 week ago

What's an easy way to build for i686? I'd like to check if this code is also affected by https://github.com/martijnvanbrummelen/nwipe/pull/579 , or at least a similar issue for 32-bit.

EDIT: Ok worked fine, only fibonacci and xoroshiro were affected.

PartialVolume commented 6 days ago

What's an easy way to build for i686? I'd like to check if this code is also affected by #579 , or at least a similar issue for 32-bit.

EDIT: Ok worked fine, only fibonacci and xoroshiro were affected.

Did you build for i686?

Knogle commented 6 days ago

I think so, I've created a 32-bit VM and was able to build the code there. But isn't it i386 then? Not sure tbh.
