martijnvanbrummelen / nwipe

nwipe secure disk eraser
GNU General Public License v2.0

Implement High-Quality Random Number Generation Using AES-CTR Mode with OpenSSL and AES-NI Support #554

Closed Knogle closed 5 months ago

Knogle commented 5 months ago

Screenshot from 2024-03-10 16-28-43

Screenshot from 2024-03-10 16-13-32

In this pull request, I present my implementation of a pseudo-random number generator (PRNG) utilizing the AES-CTR (Advanced Encryption Standard - Counter mode) in 128-bit mode. This implementation is designed to produce high-quality random numbers, which are essential for a wide range of cryptographic applications. By integrating with the OpenSSL library and exploiting AES-NI (Advanced Encryption Standard New Instructions) hardware acceleration when available, I ensure both the security and efficiency of the random number generation process.

Key Features:

AES-CTR Mode: I chose AES in Counter mode due to its renowned capability to generate secure and unpredictable pseudo-random sequences. This mode operates by encrypting incrementing counter values, with the encryption output serving as the stream of random bytes.

128-bit AES: Utilizing a 128-bit key size for AES encryption provides a strong security measure while maintaining efficient performance, adhering to current cryptographic standards for pseudo-random number generation.

Integration with OpenSSL: OpenSSL, being a well-established and rigorously tested cryptographic library, is used to manage AES operations. This integration ensures a high level of security and performance for the AES-CTR operations within our PRNG.

Leveraging AES-NI Support: My implementation automatically detects and utilizes AES-NI, a set of instructions that enhance AES operations on most modern processors. This feature significantly improves the speed of random number generation, reducing CPU usage and enhancing scalability.

Implementation Details:

Initialization: At the outset, the PRNG's state is initialized with a distinct 128-bit key and an initial counter value, using OpenSSL's AES_set_encrypt_key to prepare the AES key structure for subsequent operations.

Generating Random Numbers: For generating random numbers, the current counter value is encrypted under the configured AES key in CTR mode. The output of this encryption serves as the source of pseudo-random bytes, with the counter incremented after each operation to maintain the uniqueness of subsequent inputs.

State Management: The PRNG's internal state, including the AES key, counter (IV), and encryption buffer (ecount), is securely managed within an aes_ctr_state_t structure. This careful management is crucial for preserving the integrity and unpredictability of the random number stream.

Optimizing for Hardware: By optimizing for AES-NI, my implementation ensures enhanced performance through hardware acceleration, providing an efficient solution for generating random numbers across various applications.

This PRNG implementation stands as a robust and efficient tool for generating high-quality pseudo-random numbers, crucial for cryptographic operations, secure communications, and randomized algorithms. The combination of AES-CTR mode, OpenSSL's reliability, and the performance benefits of AES-NI hardware acceleration results in a superior random number generator.

I have ensured that the implementation is well-documented with clear comments, making it accessible for review, understanding, and maintenance, following best practices in both software development and cryptographic standards.

I look forward to receiving feedback on this pull request to further improve and ensure the effectiveness of the PRNG implementation.

Test of randomness: 54e9585c-0218-4a40-be46-7911db900e0b

PartialVolume commented 5 months ago

@Knogle Hi, can you confirm I am running the latest version of AES, thanks.

Extract from prng.c

/* EXPERIMENTAL implementation of AES-128 in counter mode to provide high-quality random numbers */

int nwipe_aes_ctr_prng_init( NWIPE_PRNG_INIT_SIGNATURE )
{
    nwipe_log( NWIPE_LOG_NOTICE, "Initialising AES CTR PRNG" );

    if( *state == NULL )
    {
        /* This is the first time that we have been called. */
        *state = malloc( sizeof( aes_ctr_state_t ) );
    }
    aes_ctr_prng_init(
        (aes_ctr_state_t*) *state, (unsigned long*) ( seed->s ), seed->length / sizeof( unsigned long ) );

    return 0;
}

int nwipe_aes_ctr_prng_read( NWIPE_PRNG_READ_SIGNATURE )
{
    u8* restrict bufpos = buffer;
    size_t words = count / SIZE_OF_AES_CTR_PRNG;

    /* Loop to fill the buffer with 128-bit blocks directly */
    for( size_t ii = 0; ii < words; ++ii )
    {
        aes_ctr_prng_genrand_uint128_to_buf( (aes_ctr_state_t*) *state, bufpos );
        bufpos += 16;  // Move to the next block
    }

    /* Handle remaining bytes if count is not a multiple of SIZE_OF_AES_CTR_PRNG */
    const size_t remain = count % SIZE_OF_AES_CTR_PRNG;
    if( remain > 0 )
    {
        unsigned char temp_output[16];  // Temporary buffer for the last block
        aes_ctr_prng_genrand_uint128_to_buf( (aes_ctr_state_t*) *state, temp_output );
        // Copy the remaining bytes
        memcpy( bufpos, temp_output, remain );
    }

    return 0;  // Success
}
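
The read function above fills the buffer in whole 16-byte blocks and then copies part of one extra block for the tail. A minimal, self-contained sketch of that pattern follows. `toy_state_t`, `toy_block` and `fill_buffer` are hypothetical names, and the block generator is a stand-in counter pattern, not AES; it exists only so the control flow can be exercised without OpenSSL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE 16

/* Hypothetical stand-in state: a plain block counter. */
typedef struct
{
    uint64_t counter;
} toy_state_t;

/* Stand-in for one PRNG block. NOT AES and NOT random. */
static void toy_block( toy_state_t* st, unsigned char* out )
{
    for( size_t i = 0; i < BLOCK_SIZE; ++i )
    {
        out[i] = (unsigned char) ( st->counter * 31u + i );
    }
    st->counter++;
}

/* Fill `count` bytes: whole blocks first, then a partial copy for the tail. */
static void fill_buffer( toy_state_t* st, unsigned char* buffer, size_t count )
{
    assert( st != NULL && buffer != NULL );

    unsigned char* pos = buffer;
    const size_t blocks = count / BLOCK_SIZE;
    const size_t remain = count % BLOCK_SIZE;

    for( size_t i = 0; i < blocks; ++i )
    {
        toy_block( st, pos );
        pos += BLOCK_SIZE;
    }

    if( remain > 0 )
    {
        /* Zero-initialised so no uninitialised byte is ever read or copied. */
        unsigned char tmp[BLOCK_SIZE] = { 0 };
        toy_block( st, tmp );
        memcpy( pos, tmp, remain );
    }
}
```

Two states seeded identically produce identical buffers, and bytes past `count` are left untouched, which is the property the verify pass relies on.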

aes/aes_ctr_prng.c

/*
 * AES CTR PRNG Implementation
 * Author: Fabian Druschke
 * Date: 2024-03-13
 *
 * This is an AES (Advanced Encryption Standard) implementation in CTR (Counter) mode
 * for pseudorandom number generation, utilizing OpenSSL for cryptographic functions.
 *
 * As the author of this implementation, I, Fabian Druschke, hereby release this work into
 * the public domain. I dedicate any and all copyright interest in this work to the public
 * domain, making it free to use for anyone for any purpose without any conditions, unless
 * such conditions are required by law.
 *
 * This software is provided "as is", without warranty of any kind, express or implied,
 * including but not limited to the warranties of merchantability, fitness for a particular
 * purpose and noninfringement. In no event shall the authors be liable for any claim,
 * damages or other liability, whether in an action of contract, tort or otherwise, arising
 * from, out of or in connection with the software or the use or other dealings in the software.
 *
 * USAGE OF OPENSSL IN THIS SOFTWARE:
 * This software uses OpenSSL for cryptographic operations. Users are responsible for
 * ensuring compliance with OpenSSL's licensing terms.
 */

#include "aes_ctr_prng.h"
#include <openssl/rand.h>
#include <string.h>
#include <openssl/aes.h>
#include <openssl/modes.h>
#include <openssl/sha.h>  // Include for SHA256

void aes_ctr_prng_init( aes_ctr_state_t* state, unsigned long init_key[], unsigned long key_length )
{
    unsigned char key[32];  // Size for 256 bits

    SHA256_CTX sha256;
    SHA256_Init( &sha256 );
    SHA256_Update( &sha256, (unsigned char*) init_key, key_length * sizeof( unsigned long ) );  // Add the init_key

    // Optional: Add a salt to increase uniqueness
    // const unsigned char salt[] = "optional salt value";
    // SHA256_Update(&sha256, salt, sizeof(salt));

    SHA256_Final( key, &sha256 );  // Generate the final key

    AES_set_encrypt_key( key, 256, &state->aes_key );  // Use the 256-bit key
    memset( state->ivec, 0, AES_BLOCK_SIZE );  // Initialize the IV with zeros
    state->num = 0;
    memset( state->ecount, 0, AES_BLOCK_SIZE );
}

static void next_state( aes_ctr_state_t* state )
{
    for( int i = 0; i < AES_BLOCK_SIZE; ++i )
    {
        if( ++state->ivec[i] )
            break;
    }
}

void aes_ctr_prng_genrand_uint128_to_buf( aes_ctr_state_t* state, unsigned char* bufpos )
{
    // Initialize bufpos directly with random numbers, avoiding the use of a separate output buffer
    CRYPTO_ctr128_encrypt(
        bufpos, bufpos, 16, &state->aes_key, state->ivec, state->ecount, &state->num, (block128_f) AES_encrypt );

    // Ensure that next_state is called correctly without causing errors
    next_state( state );
}
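
A note on the counter handling in this listing: `next_state` carries from byte 0 upward, so it treats `ivec` as a little-endian counter. As far as the OpenSSL source shows, `CRYPTO_ctr128_encrypt` already advances `ivec` itself after each full block, and does so big-endian (carry from byte 15 downward), so the extra call changes the opposite end of the same array. For comparison, a minimal sketch of a big-endian 128-bit increment; `ctr128_inc_be` is a hypothetical name, not an OpenSSL symbol.

```c
#include <assert.h>
#include <stddef.h>

#define CTR_SIZE 16

/* Increment a 16-byte counter stored big-endian
 * (the last byte is the least significant one). */
static void ctr128_inc_be( unsigned char* counter )
{
    assert( counter != NULL );

    for( size_t i = CTR_SIZE; i-- > 0; )
    {
        if( ++counter[i] != 0 )
        {
            break; /* no carry out of this byte, so we are done */
        }
    }
}
```

Starting from a counter whose last byte is 0xFF, one increment clears that byte and carries into byte 14; an all-0xFF counter wraps to all zeros.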
PartialVolume commented 5 months ago

@Knogle Interesting, so I set up the following loop test, which passes with one round.

If you set up loop devices with the same or similar sizes as shown in the picture below, then select verify all passes and, importantly, ten rounds, this will trigger the problem. You should be able to reproduce this on your system.

I believe it's possibly the initialisation of the seed that corrupts other calls being made to the openssl library. The more discs you have of differing sizes, the more likely it is to trigger the thread issue. It's quite possible that if the drives are all the same size you won't see the problem. I'm pretty sure using openssl's thread locking functions will fix this and, according to the documentation, may even make it faster!

Screenshot_20240315_181625

PartialVolume commented 5 months ago

Have you considered not using openssl for the AES-CTR implementation, but using your own AES code so you don't have to call any external libraries, basically the same as mersenne, ISAAC and xoro? This would get round any thread issues and should actually be faster.

Knogle commented 5 months ago

> Have you considered not using openssl for the AES-CTR implementation, but using your own AES code so you don't have to call any external libraries, basically the same as mersenne, ISAAC and xoro? This would get round any thread issues and should actually be faster.

Hey! I think this would be a solution to consider. I will try out some stuff and update the PR here!

Knogle commented 5 months ago

I've now created a version that includes debugging. So far, I've analyzed the key generation for initialization and verification. Every time, the keys match. I'll delve deeper, possibly examining buffer overflow or memory issues using Valgrind. I'd like to avoid creating my own AES implementation, because I can't ensure it will be crypto-safe.

Attachment: generated_keys.txt

Knogle commented 5 months ago

With 2 rounds it already causes errors.

Screenshot from 2024-03-19 20-58-55

Knogle commented 5 months ago

I was able to locate the issue.

The key on verification is always the same, so that's okay, but neither the init vector nor ecount is determined by the seed. So it's not possible to reproduce those values for verification purposes. I have to derive ecount and the init vector, as well as the key, from the PRNG seed provided by nwipe.c.


void aes_ctr_prng_genrand_uint128_to_buf( aes_ctr_state_t* state, unsigned char* bufpos )
{
    // Initialize bufpos directly with random numbers, avoiding the use of a separate output buffer
    CRYPTO_ctr128_encrypt(
        bufpos, bufpos, 16, &state->aes_key, state->ivec, state->ecount, &state->num, (block128_f) AES_encrypt );

    // Ensure that next_state is called correctly without causing errors
    next_state( state );
}
Knogle commented 5 months ago

I was able to get a successful run by manually setting ecount and ivec to 0. So I need to make sure the seed provided by nwipe.c always sets them to the same value, even during verify.

For debug purposes:

void aes_ctr_prng_genrand_uint128_to_buf( aes_ctr_state_t* state, unsigned char* bufpos )
{
    // Reset ivec and ecount to 0 at the beginning of each call to ensure
    // they are always initialized to 0 for each encryption operation
    memset( state->ivec, 0, AES_BLOCK_SIZE );
    memset( state->ecount, 0, AES_BLOCK_SIZE );
    state->num = 0;  // Reset as well, to stay consistent with ecount and ivec

    // Initialize bufpos directly with random numbers, avoiding the use of a separate output buffer
    CRYPTO_ctr128_encrypt(
        bufpos, bufpos, 16, &state->aes_key, state->ivec, state->ecount, &state->num, (block128_f) AES_encrypt );

    // Ensure that next_state is called correctly without causing errors
    next_state( state );
}

It always creates the same data though.
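
That observation follows from the reset: if `ivec`, `ecount` and `num` are cleared inside the generate function, every call encrypts the same counter value, so every 16-byte block comes out identical. A toy sketch of the difference between resetting once at init and resetting on every call is below. `toy_ctr_t` and the `toy_*` functions are hypothetical, and `toy_encrypt_counter` is a splitmix64-style mixer standing in for "encrypt the counter under the key"; it is not AES.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct
{
    uint64_t key;
    uint64_t counter;
} toy_ctr_t;

/* Stand-in for encrypting a counter under a key (NOT AES). */
static uint64_t toy_encrypt_counter( uint64_t key, uint64_t counter )
{
    uint64_t z = key + counter * 0x9E3779B97F4A7C15ULL;
    z = ( z ^ ( z >> 30 ) ) * 0xBF58476D1CE4E5B9ULL;
    z = ( z ^ ( z >> 27 ) ) * 0x94D049BB133111EBULL;
    return z ^ ( z >> 31 );
}

/* The counter is set exactly once here and only advances afterwards. */
static void toy_init( toy_ctr_t* st, uint64_t seed )
{
    assert( st != NULL );
    st->key = seed;
    st->counter = 0;
}

/* Normal generation: use the current counter, then advance it. */
static uint64_t toy_next( toy_ctr_t* st )
{
    return toy_encrypt_counter( st->key, st->counter++ );
}

/* Shape of the debug variant: the counter is reset on every call,
 * so the same value is produced again and again. */
static uint64_t toy_next_with_reset( toy_ctr_t* st )
{
    st->counter = 0;
    return toy_encrypt_counter( st->key, st->counter++ );
}
```

With the reset confined to `toy_init`, two generators given the same seed still agree block for block, which is what verification needs, while successive blocks differ.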

Screenshot from 2024-03-20 14-20-52

Screenshot from 2024-03-20 14-41-07

Knogle commented 5 months ago

2 rounds and 2 verifies; it shouldn't look like that. As you can see, same keys, but different num etc.

AES CTR State: IVec: 00000000703c4a528a7f000074ec6b53 Num: 32650 ECount: 903c4a528a7f00007efb6b538a7f0000 AES Key: [Not directly accessible]

AES CTR State: IVec: 000000003000004c8a7f0000c0ffffff Num: 4294967295 ECount: 603d4a528a7f00000000000000000000 AES Key: [Not directly accessible]

AES CTR State: IVec: 000000002022004c8a7f0000903e7501 Num: 0 ECount: 903c4a528a7f00007efb6b538a7f0000 AES Key: [Not directly accessible]

AES CTR State: IVec: 000000003000004c8a7f0000c0ffffff Num: 4294967295 ECount: 603d4a528a7f00000000000000000000 AES Key: [Not directly accessible]

PartialVolume commented 5 months ago

Also, the speed is not good: < 1 MB/s on a 16-disc real-world wipe, as can be seen below. Switching to ISAAC gives me typical values in the hundreds of MB/s.

https://github.com/martijnvanbrummelen/nwipe/assets/22084881/033ec997-d315-43d4-a480-65af6e84e706

Knogle commented 5 months ago

Ahhhh yes! I will try to provide something different, maybe today. It's that slow because it currently prints all the generated numbers into a text file. That is only for debugging purposes.

Knogle commented 5 months ago

Please give it a try now. It's actually not secure in its current state, but should perform without errors now! I think we are getting closer.

EDIT: It's very nasty. I was able to find out that ivec is the only issue. If I don't set ivec to something static every time, it fails and leads to the IOERROR issue.

So memset(state->ivec, 3232321, AES_BLOCK_SIZE); works, but is trashy from a security and PRNG perspective, because the bit pattern repeats itself again and again.

Unfortunately I don't know yet how to fix that. I think it will require a re-implementation of the structure around CRYPTO_ctr128_encrypt and the PRNG calls to AES in prng.c. In its current state I'd rather drop it and give it a try another time. Or at least I'll need some help from some folks here, maybe. I'm out of ideas currently.

void aes_ctr_prng_genrand_uint128_to_buf(aes_ctr_state_t* state, unsigned char* bufpos) {
    // Reset ivec, ecount, and num at the start of every operation
    memset(state->ivec, 3232321, AES_BLOCK_SIZE);  // Set the IV to a static value
    // memset(state->ecount, 0, AES_BLOCK_SIZE);  // Set ecount to zeros
    // state->num = 0;  // Set num to 0

    // Generate random numbers directly into bufpos, without using a separate output buffer
    CRYPTO_ctr128_encrypt(
        bufpos, bufpos, 16, &state->aes_key, state->ivec, state->ecount, &state->num, (block128_f)AES_encrypt);

    // Make sure next_state is called correctly, without causing errors
    // next_state(state);
}
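
A side note on the `memset` call in this snippet: `memset` converts its fill value to `unsigned char` before writing, so `3232321` (0x315241) is stored as its low byte, 0x41, in every position of `ivec`. A short check of that behaviour, using a local array as a stand-in for `state->ivec`; `fill_and_get_byte` is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Same value OpenSSL uses; defined here only to stay self-contained. */
#define AES_BLOCK_SIZE 16

/* Fill a block the way the debug code does and return the byte actually stored. */
static unsigned char fill_and_get_byte( unsigned char* block, int value )
{
    assert( block != NULL );
    memset( block, value, AES_BLOCK_SIZE ); /* value is converted to unsigned char */
    return block[0];
}
```

So the IV in that debug build is sixteen copies of 0x41, a fixed pattern rather than anything derived from the seed.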
PartialVolume commented 5 months ago

I still think you might solve these problems using openssl's thread locking calls, as I understand it each AES prng running in each separate wipe thread is provided with an ID that is used in each callback to the openssl library, so openssl knows to restore the correct internal values to continue the correct stream for each separate thread.

Here's the latest code as of last night (where you commented out the call to the print function for debug)

Three problems:

  1. The obvious one, it is still failing verification when AES calls are made to openssl functions that are originating from multiple wipe threads simultaneously.
  2. The disk I/O speed is much too slow; AES is causing a 50-75% slowdown in I/O compared to ISAAC or Xoro.
  3. As you pointed out earlier, it's not random. (snapshot below of what I'm seeing).

https://github.com/martijnvanbrummelen/nwipe/assets/22084881/4a4ec656-0d18-4c0a-bf66-634698dc0339

Screenshot_20240321_084846

https://www.openssl.org/docs/man1.0.2/man3/threads.html

OpenSSL can generally be used safely in multi-threaded applications provided that at least two callback functions are set, the locking_function and threadid_func. Note that OpenSSL is not completely thread-safe, and unfortunately not all global resources have the necessary locks. Further, the thread-safety does not extend to things like multiple threads using the same SSL object at the same time.

locking_function(int mode, int n, const char *file, int line) is needed to perform locking on shared data structures. (Note that OpenSSL uses a number of global data structures that will be implicitly shared whenever multiple threads use OpenSSL.) Multi-threaded applications will crash at random if it is not set.

locking_function() must be able to handle up to CRYPTO_num_locks() different mutex locks. It sets the n-th lock if mode & CRYPTO_LOCK, and releases it otherwise.

file and line are the file number of the function setting the lock. They can be useful for debugging.

threadid_func(CRYPTO_THREADID *id) is needed to record the currently-executing thread's identifier into id. The implementation of this callback should not fill in id directly, but should use CRYPTO_THREADID_set_numeric() if thread IDs are numeric, or CRYPTO_THREADID_set_pointer() if they are pointer-based. If the application does not register such a callback using CRYPTO_THREADID_set_callback(), then a default implementation is used - on Windows and BeOS this uses the system's default thread identifying APIs, and on all other platforms it uses the address of errno. The latter is satisfactory for thread-safety if and only if the platform has a thread-local error number facility.

Once threadid_func() is registered, or if the built-in default implementation is to be used;

CRYPTO_THREADID_current() records the currently-executing thread ID into the given id object.

CRYPTO_THREADID_cmp() compares two thread IDs (returning zero for equality, ie. the same semantics as memcmp()).

CRYPTO_THREADID_cpy() duplicates a thread ID value,

CRYPTO_THREADID_hash() returns a numeric value usable as a hash-table key. This is usually the exact numeric or pointer-based thread ID used internally, however this also handles the unusual case where pointers are larger than 'long' variables and the platform's thread IDs are pointer-based - in this case, mixing is done to attempt to produce a unique numeric value even though it is not as wide as the platform's true thread IDs.

Additionally, OpenSSL supports dynamic locks, and sometimes, some parts of OpenSSL need it for better performance. To enable this, the following is required:

Three additional callback function, dyn_create_function, dyn_lock_function and dyn_destroy_function.

A structure defined with the data that each lock needs to handle.

struct CRYPTO_dynlock_value has to be defined to contain whatever structure is needed to handle locks.

dyn_create_function(const char *file, int line) is needed to create a lock. Multi-threaded applications might crash at random if it is not set.

dyn_lock_function(int mode, CRYPTO_dynlock *l, const char *file, int line) is needed to perform locking off dynamic lock numbered n. Multi-threaded applications might crash at random if it is not set.

dyn_destroy_function(CRYPTO_dynlock *l, const char *file, int line) is needed to destroy the lock l. Multi-threaded applications might crash at random if it is not set.

CRYPTO_get_new_dynlockid() is used to create locks. It will call dyn_create_function for the actual creation.

CRYPTO_destroy_dynlockid() is used to destroy locks. It will call dyn_destroy_function for the actual destruction.

CRYPTO_lock() is used to lock and unlock the locks. mode is a bitfield describing what should be done with the lock. n is the number of the lock as returned from CRYPTO_get_new_dynlockid(). mode can be combined from the following values. These values are pairwise exclusive, with undefined behaviour if misused (for example, CRYPTO_READ and CRYPTO_WRITE should not be used together):

    CRYPTO_LOCK     0x01
    CRYPTO_UNLOCK   0x02
    CRYPTO_READ     0x04
    CRYPTO_WRITE    0x08

RETURN VALUES

CRYPTO_num_locks() returns the required number of locks.

CRYPTO_get_new_dynlockid() returns the index to the newly created lock.

The other functions return no values.

NOTES

You can find out if OpenSSL was configured with thread support:

    #define OPENSSL_THREAD_DEFINES
    #include <openssl/opensslconf.h>
    #if defined(OPENSSL_THREADS)
      // thread support enabled
    #else
      // no thread support
    #endif

Also, dynamic locks are currently not used internally by OpenSSL, but may do so in the future.

EXAMPLES

crypto/threads/mttest.c shows examples of the callback functions on Solaris, Irix and Win32.

Knogle commented 5 months ago

Hey, thanks for the reply! I have considered this and tried thread locking as well, unfortunately without any change. The issue already occurs when wiping a single disk with 2 rounds and verify all passes. According to the documentation, later OpenSSL versions no longer need the locking callbacks (https://github.com/openssl/openssl/issues/2165), so OpenSSL 3.x handles locking internally by default, without the callbacks required in earlier versions.

Screenshot from 2024-03-13 08-45-10

Knogle commented 5 months ago

The structure for AES-CTR was derived from MT19937; I don't know if that's appropriate. I am currently looking for a way to somehow 'store' the PRNG's state. The current implementation still doesn't ensure that the same seed gives the same data. Maybe some memory allocation or corruption issue? The ivec is always set to some obscure, weird value. Currently I am using one uint seed to initialize ecount, ivec and the key itself. I don't know if it's possible to pass more random data from context.c to the PRNG in order to initialize all those values with a unique seed.

I'll post this issue on Stack Overflow as well. I don't know how much I can get out of seed. From a technical point of view I need 3 unique seeds for key, ivec and ecount, and they have to match during verify.

aes_ctr_prng_init( (aes_ctr_state_t*) *state, (unsigned long*) ( seed->s ), seed->length / sizeof( unsigned long ) );
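
On the question of needing three seeds: the comments in OpenSSL's `ctr128.c` state that `ecount_buf` and `num` must simply be zero before the first call, so only the key and the initial counter need to come from the seed. Below is a sketch of expanding one seed into both deterministically, so that the write and verify passes would derive the same values. The expansion is a splitmix64-style mixer chosen only to keep the example free of dependencies; it is not a cryptographic KDF, and `toy_ctr_params_t`, `expand` and `derive_key_iv` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct
{
    unsigned char key[32];
    unsigned char ivec[16];
    unsigned char ecount[16];
    unsigned int num;
} toy_ctr_params_t;

/* splitmix64 step: advances *x and returns a mixed 64-bit value (NOT cryptographic). */
static uint64_t splitmix64( uint64_t* x )
{
    uint64_t z = ( *x += 0x9E3779B97F4A7C15ULL );
    z = ( z ^ ( z >> 30 ) ) * 0xBF58476D1CE4E5B9ULL;
    z = ( z ^ ( z >> 27 ) ) * 0x94D049BB133111EBULL;
    return z ^ ( z >> 31 );
}

/* Write n bytes of expanded seed material to out; n must be a multiple of 8. */
static void expand( uint64_t* x, unsigned char* out, size_t n )
{
    assert( n % 8 == 0 );
    for( size_t i = 0; i < n; i += 8 )
    {
        uint64_t w = splitmix64( x );
        memcpy( out + i, &w, sizeof w );
    }
}

/* Same seed in, same key and IV out; scratch fields always start at zero. */
static void derive_key_iv( toy_ctr_params_t* p, uint64_t seed )
{
    assert( p != NULL );
    uint64_t x = seed;
    expand( &x, p->key, sizeof p->key );
    expand( &x, p->ivec, sizeof p->ivec );
    memset( p->ecount, 0, sizeof p->ecount );
    p->num = 0;
}
```

In a real build the mixer would be replaced by a proper hash or KDF (the listing earlier in this thread already uses SHA-256 for the key); the point here is only that one seed is enough as long as every field is set explicitly at init.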

EDIT: I was able to implement a cryptographically secure PRNG using OpenSSL's SHA-256. Maybe it's an alternative? I'll create a PR and we can do some testing. It only requires a single seed, like the other PRNGs. Hardware acceleration is also possible.

Knogle commented 5 months ago

Damn, I've screwed up and pushed to the master branch, lol. Let's see if I can still save the code.

Knogle commented 5 months ago

I've now updated the code to use OpenSSL's EVP API. The performance is really high now, and there is no need for any thread locking here. Still the same issue with verify, though.

Knogle commented 5 months ago

Now it's finally using AES-NI, doing around 700 MB/s per thread on a Ryzen, and it is thread safe. A workaround might be to disable the verify all passes option for AES and only verify the last pass, heh.

What is really driving me nuts:

1 ROUND + VERIFY LAST PASS is successful.
2 ROUNDS + VERIFY LAST PASS is successful.
2 ROUNDS + VERIFY ALL PASSES fails.

Screenshot from 2024-03-21 12-29-42

Screenshot from 2024-03-21 12-31-39

I think I have to understand seed better, but I don't know where to look. Whatever changes I make to seed lead to a failure on the first pass already, so I guess the issue is somewhere there.

aes_ctr_prng_init( (aes_ctr_state_t*) *state, (unsigned long*) ( seed->s ), seed->length / sizeof( unsigned long ) );

EDIT:

I have made changes to the nwipe_init function to set a static seed every time, so there is no other choice than using this seed. But I get verify errors here too. Does the verification run maybe initialize the PRNG in a different way? Do they call the same function for this purpose?

It would be nice to know, if verify and the writing itself both call nwipe_aes_ctr_prng_init in prng.c

int nwipe_aes_ctr_prng_init( NWIPE_PRNG_INIT_SIGNATURE )
{
    nwipe_log( NWIPE_LOG_NOTICE, "Initialising AES CTR PRNG" );

    if( *state == NULL )
    {
        /* This is the first time that we have been called. */
        *state = malloc( sizeof( aes_ctr_state_t ) );
        if (*state == NULL) {
            // Error handling in case malloc fails
            nwipe_log(NWIPE_LOG_ERROR, "Failed to allocate memory for PRNG state.");
            return -1; // Or another error code of your choice
        }
    }

    // Static seed value for debugging
    unsigned long static_seed[] = {0x12345678, 0x9ABCDEF0, 0x12345678, 0x9ABCDEF0}; // Example seed
    size_t static_seed_length = sizeof(static_seed) / sizeof(unsigned long);

    aes_ctr_prng_init(
        (aes_ctr_state_t*) *state, static_seed, static_seed_length );

    return 0;
}

EDIT3:

OK, the static seed always seems to set the same ivec, num, count and key. I have to debug the genrand function now. I will print all passes into a file and compare.

EDIT4:

All random numbers are printed into a random_data.bin file. 2 rounds, each one with verify, so 4 parts.

Using the static seed, part_aa to part_ad should all be the same. The first three are identical, but the last one differs!! No idea why.

chairman@fedora:/tmp/nwipe$ diff part_aa part_ab
chairman@fedora:/tmp/nwipe$ diff part_aa part_ac
chairman@fedora:/tmp/nwipe$ diff part_aa part_ad
Binary files part_aa and part_ad differ
chairman@fedora:/tmp/nwipe$

Knogle commented 5 months ago

I think the entire issue is somehow memory related. I've printed all values that go into the genrand function, and they are all the same.

Also trying to call aes_ctr_prng_cleanup(aes_ctr_state_t* state) causes a segfault.


==40318== Conditional jump or move depends on uninitialised value(s)
==40318==    at 0x484E90E: bcmp (vg_replace_strmem.c:1229)
==40318==    by 0x410E0B: nwipe_random_verify (pass.c:198)
==40318==    by 0x415CE6: nwipe_runmethod (method.c:961)
==40318==    by 0x416C59: nwipe_random (method.c:742)
==40318==    by 0x4E98946: start_thread (pthread_create.c:444)
==40318==    by 0x4F1E873: clone (clone.S:100)
==40318== 
==40318== Conditional jump or move depends on uninitialised value(s)
==40318==    at 0x484E8E5: bcmp (vg_replace_strmem.c:1229)
==40318==    by 0x410E0B: nwipe_random_verify (pass.c:198)
==40318==    by 0x415CE6: nwipe_runmethod (method.c:961)
==40318==    by 0x416C59: nwipe_random (method.c:742)
==40318==    by 0x4E98946: start_thread (pthread_create.c:444)
==40318==    by 0x4F1E873: clone (clone.S:100)
==40318== 
==40318== Conditional jump or move depends on uninitialised value(s)
==40318==    at 0x410E0E: nwipe_random_verify (pass.c:198)
==40318==    by 0x415CE6: nwipe_runmethod (method.c:961)
==40318==    by 0x416C59: nwipe_random (method.c:742)
==40318==    by 0x4E98946: start_thread (pthread_create.c:444)
==40318==    by 0x4F1E873: clone (clone.S:100)
==40318== 
==40318== 
==40318== HEAP SUMMARY:
==40318==     in use at exit: 169,882 bytes in 523 blocks
==40318==   total heap usage: 798,528 allocs, 798,005 frees, 1,206,064,325 bytes allocated
==40318== 
==40318== LEAK SUMMARY:
==40318==    definitely lost: 583 bytes in 6 blocks
==40318==    indirectly lost: 9,819 bytes in 23 blocks
==40318==      possibly lost: 913 bytes in 12 blocks
==40318==    still reachable: 158,567 bytes in 482 blocks
==40318==         suppressed: 0 bytes in 0 blocks
==40318== Rerun with --leak-check=full to see details of leaked memory

EDIT4:

Now I see: the crypto function is trying to access memory that is not initialized yet.

==40641== Conditional jump or move depends on uninitialised value(s)
==40641==    at 0x4B9F565: CRYPTO_ctr128_encrypt_ctr32 (ctr128.c:183)
==40641==    by 0x4C655A1: ossl_cipher_hw_generic_ctr (ciphercommon_hw.c:117)
==40641==    by 0x4C60FCC: ossl_cipher_generic_stream_update (ciphercommon.c:469)
==40641==    by 0x4B61076: EVP_EncryptUpdate (evp_enc.c:643)
==40641==    by 0x414314: aes_ctr_prng_genrand_uint128_to_buf (aes_ctr_prng.c:133)
==40641==    by 0x417F5F: nwipe_aes_ctr_prng_read (prng.c:293)
==40641==    by 0x411104: nwipe_random_pass (pass.c:322)
==40641==    by 0x415B2C: nwipe_runmethod (method.c:934)
==40641==    by 0x416C59: nwipe_random (method.c:742)
==40641==    by 0x4E98946: start_thread (pthread_create.c:444)
==40641==    by 0x4F1E873: clone (clone.S:100)
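
One possible reading of this trace, not confirmed anywhere in the thread: CTR mode XORs the keystream into whatever is already in the input buffer, and the generate function passes `bufpos` as both input and output. If the buffer holds different (or uninitialised) bytes during the write pass and the verify pass, the two outputs differ even with an identical key and counter. A toy sketch of that effect follows; `toy_keystream_byte`, `toy_ctr_xor` and `toy_generate` are hypothetical stand-ins, not OpenSSL code, and the keystream is not AES.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in keystream byte for position i under a given key (NOT AES). */
static unsigned char toy_keystream_byte( uint32_t key, size_t i )
{
    uint32_t x = key ^ (uint32_t) ( i * 2654435761u );
    x ^= x >> 15;
    x *= 2246822519u;
    x ^= x >> 13;
    return (unsigned char) x;
}

/* CTR-style operation: out[i] = in[i] XOR keystream[i]. in and out may alias. */
static void toy_ctr_xor( uint32_t key, const unsigned char* in, unsigned char* out, size_t len )
{
    assert( in != NULL && out != NULL );
    for( size_t i = 0; i < len; ++i )
    {
        out[i] = (unsigned char) ( in[i] ^ toy_keystream_byte( key, i ) );
    }
}

/* Deterministic use as a generator: zero the buffer first, then XOR in place,
 * so the result is the keystream alone. */
static void toy_generate( uint32_t key, unsigned char* buf, size_t len )
{
    assert( buf != NULL );
    memset( buf, 0, len );
    toy_ctr_xor( key, buf, buf, len );
}
```

If this is what happens in the PR, zeroing the destination before the in-place call (or encrypting from a separate all-zero input block) would make the output depend only on key and counter. That is an assumption to test, not a diagnosis.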
Knogle commented 5 months ago

A little news. I am now quite confident it's something with the memory. But what is weird: when I run the code under valgrind, it runs slower and I don't encounter any issues. If I run the same thing without it, at native speed, it causes IOERROR!

Unfortunately I have no idea how to troubleshoot memory issues at that depth.

I've now used calloc instead of malloc and memset. Under valgrind the issue disappears, but in normal mode it persists.
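
For reference, `calloc` returns zero-filled memory, so every field of the state starts from a known value instead of whatever the heap held. A minimal sketch of that allocation pattern, plus a free helper that clears the caller's pointer so a repeated cleanup call is harmless. `toy_state_t` mirrors the fields mentioned in this thread (ivec, ecount, num) but is a hypothetical stand-in for `aes_ctr_state_t`, whose real definition is not shown here.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct
{
    unsigned char ivec[16];
    unsigned char ecount[16];
    unsigned int num;
} toy_state_t;

/* Allocate a zeroed state; returns NULL on allocation failure. */
static toy_state_t* toy_state_new( void )
{
    return calloc( 1, sizeof( toy_state_t ) );
}

/* Free the state and reset the caller's pointer to NULL. */
static void toy_state_free( toy_state_t** st )
{
    assert( st != NULL );
    free( *st ); /* free(NULL) is a no-op, so a second call is safe */
    *st = NULL;
}
```

Zeroing the state does not cover buffers that are passed in from elsewhere, which may be why the problem still shows up outside valgrind.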

Screenshot from 2024-03-21 19-37-59

Screenshot from 2024-03-21 19-48-05