signalwire / libstirshaken

C library implementing STIR-shaken STI-SP AS/VS, STI-CA
MIT License
31 stars 23 forks source link

Is libstirshaken thread-safe? #137

Open a-rose opened 1 year ago

a-rose commented 1 year ago

Hello,

This lib is using CJSON through libks, and it seems like CJSON is not always thread-safe: https://github.com/DaveGamble/cJSON/issues/68

I can see some crashes happening when I stress-test libstirshaken, see below for a backtrace of an example when I start 100 threads to make test calls and generate a signature for each of them.

Maybe I'm doing things wrong? But if I use processes instead of threads there is no crash.

Thread 43 "media" received signal SIGSEGV, Segmentation fault.
__GI_abort () at abort.c:107
107 in abort.c
(gdb) bt
#0  __GI_abort () at abort.c:107
#1  0x00007f3764664648 in __libc_message (action=action@entry=do_abort, fmt=fmt@entry=0x7f376476e2a0 "%s\n") at ../sysdeps/posix/libc_fatal.c:181
#2  0x00007f376466ad6a in malloc_printerr (str=str@entry=0x7f376476fc08 "munmap_chunk(): invalid pointer") at malloc.c:5359
#3  0x00007f376466b2e4 in munmap_chunk (p=<optimized out>) at malloc.c:2833
#4  0x00007f3745192812 in add_item_to_object (object=0x7f36d4014670, string=0x7f3760f238a2 "origid", item=0x7f36d401d6c0, hooks=0x7f37451d5c70 <global_hooks>, constant_key=0) at /usr/local/src/libks/src/include/cJSON/cJSON.c:1900
#5  0x00007f3745192b52 in cJSON_AddStringToObject (object=0x7f36d4014670, name=0x7f3760f238a2 "origid", string=0x7f371d432964 "a76cf207-14b4-4728-903c-e70acab4d763") at /usr/local/src/libks/src/include/cJSON/cJSON.c:2003
#6  0x00007f37451a17c6 in __ks_json_add_string_to_object (pool=0x0, file=0x7f3760f2383c "src/stir_shaken_passport.c", line=118, tag=0x7f3760f24410 <__PRETTY_FUNCTION__.31192> "stir_shaken_passport_jwt_init", object=0x7f36d4014670,
    name=0x7f3760f238a2 "origid", string=0x7f371d432964 "a76cf207-14b4-4728-903c-e70acab4d763") at /usr/local/src/libks/src/ks_json.c:348
#7  0x00007f3760f1504c in stir_shaken_passport_jwt_init (keylen=302,
    key=0x7f36d4025858 "<redacted>"...,
    params=<optimized out>, jwt=0x7f36d401d440, ss=0x7f36d40207a0) at src/stir_shaken_passport.c:118
#8  stir_shaken_passport_jwt_init (ss=0x7f36d40207a0, jwt=0x7f36d401d440, params=<optimized out>,
    key=0x7f36d4025858 "<redacted>"..., keylen=302)
    at src/stir_shaken_passport.c:35
#9  0x00007f3760f158cc in stir_shaken_passport_init (ss=0x7f36d40207a0, where=0x7f36d40086c0, params=<optimized out>, key=<optimized out>, keylen=<optimized out>) at src/stir_shaken_passport.c:321
#10 0x00007f3760f15a4f in stir_shaken_passport_create (ss=ss@entry=0x7f36d40207a0, params=0x7f36d4006390,
    key=0x7f36d4025858 "<redacted>"..., keylen=302)
    at src/stir_shaken_passport.c:354
#11 0x00007f3760f146da in stir_shaken_authenticate_to_sih_with_key (ss=0x7f36d40207a0, params=<optimized out>, passport_out=0x7f371d251718, key=<optimized out>, keylen=<optimized out>) at src/stir_shaken_service.c:768
crienzo commented 1 year ago

Try with latest version of libks. The "v1" version of it had some serious problems and we recently promoted the v2 branch to master.

a-rose commented 1 year ago

Hello,

Thanks for the suggestion @crienzo, upgrading to libks2 fixed some things, but now I'm struggling with more crashes memory leaks. It is easily reproducible with threads simply calling stir_shaken_init and stir_shaken_deinit, and I can observe the issue when I start around 5 threads.

Code:

#include <stdlib.h>
#include <stdio.h>
#include <pthread.h>
#include <stir_shaken.h>

void* thread(void *arg) {
    stir_shaken_context_t ctx;
    stir_shaken_status_t  status;
    stir_shaken_error_t   error_code = 0;
    pthread_t            *thread = (pthread_t *) arg;

    sleep(1); // wait for main to reach join()

    status = stir_shaken_init(&ctx, 6);
    if (status != STIR_SHAKEN_STATUS_OK) {
    printf("*** --- %p done (error)\n", (void*) thread);
        return NULL;
    }
    stir_shaken_deinit();
    printf("*** --- %p done\n", (void*) thread);
    return NULL;
}

int main(int argc, char **argv) {
    pthread_t *threads;
    int nb_threads = 10;

    // Get nb_threads from the command line
    if(argc > 1) {
        nb_threads = atoi(argv[1]);
    }
    threads = (pthread_t *) malloc(nb_threads * sizeof(pthread_t));
    printf("Starting %d threads\n", nb_threads);

    for(int i = 0; i < nb_threads; i++) {
        printf("*** start thread %p\n", (void*) &threads[i]);
        pthread_create(&threads[i], NULL, thread, &threads[i]);
    }

    for(int i = 0; i < nb_threads; i++) {
        printf("*** join thread %p\n", (void*) &threads[i]);
        pthread_join(threads[i], NULL);
    }

    free(threads);
    return 0;
}

Compiling: gcc -o test main.c -lstirshaken -pthread -I/usr/local/include/libks2

Output:

==8342== Memcheck, a memory error detector
==8342== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==8342== Using Valgrind-3.16.1 and LibVEX; rerun with -h for copyright info
==8342== Command: ./libringover-sti-example/ringover-sti-example 5
==8342==
Starting 5 threads
*** start thread 0x5ad4ad0
*** start thread 0x5ad4ad8
*** start thread 0x5ad4ae0
*** start thread 0x5ad4ae8
*** start thread 0x5ad4af0
*** join thread 0x5ad4ad0
Using (X9.62/SECG curve over a 256 bit prime field [415]) eliptic curve
Using TNAuthList extension with nid 1195
*** --- 0x5ad4ad0 done
Using (X9.62/SECG curve over a 256 bit prime field [415]) eliptic curve
*** join thread 0x5ad4ad8
Using TNAuthList extension with nid 1195
*** --- 0x5ad4ae0 done
Using (X9.62/SECG curve over a 256 bit prime field [415]) eliptic curve
Using TNAuthList extension with nid 1195
*** --- 0x5ad4ad8 done
Using (X9.62/SECG curve over a 256 bit prime field [415]) eliptic curve
Using TNAuthList extension with nid 1195
*** --- 0x5ad4ae8 done
*** join thread 0x5ad4ae0
*** join thread 0x5ad4ae8
*** join thread 0x5ad4af0
Using (X9.62/SECG curve over a 256 bit prime field [415]) eliptic curve
Using TNAuthList extension with nid 1195
*** --- 0x5ad4af0 done
==8342==
==8342== FILE DESCRIPTORS: 3 open at exit.
==8342== Open file descriptor 2: /dev/pts/5
==8342==    <inherited from parent>
==8342==
==8342== Open file descriptor 1: /dev/pts/5
==8342==    <inherited from parent>
==8342==
==8342== Open file descriptor 0: /dev/pts/5
==8342==    <inherited from parent>
==8342==
==8342==
==8342== HEAP SUMMARY:
==8342==     in use at exit: 35,760 bytes in 93 blocks
==8342==   total heap usage: 6,335 allocs, 6,242 frees, 449,990 bytes allocated
==8342==
==8342== 24 bytes in 3 blocks are indirectly lost in loss record 1 of 31
==8342==    at 0x483877F: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==8342==    by 0x4C99349: CRYPTO_zalloc (in /usr/lib/x86_64-linux-gnu/libcrypto.so.1.1)
==8342==    by 0x4BFCA37: CTLOG_STORE_new (in /usr/lib/x86_64-linux-gnu/libcrypto.so.1.1)
==8342==    by 0x4AAB09C: SSL_CTX_new (in /usr/lib/x86_64-linux-gnu/libssl.so.1.1)
==8342==    by 0x486A980: stir_shaken_init_ssl (stir_shaken_ssl.c:3048)
==8342==    by 0x485E1B5: stir_shaken_init_basic_and_ssl (stir_shaken.c:55)
==8342==    by 0x485E1B5: stir_shaken_init_basic_and_ssl (stir_shaken.c:45)
==8342==    by 0x1091F6: thread (in /usr/local/src/ringover-sti/libringover-sti-example/ringover-sti-example)
==8342==    by 0x4882EA6: start_thread (pthread_create.c:477)
==8342==    by 0x4998A2E: clone (clone.S:95)

+ many more leaks reported on the same malloc address

Is libstirshaken meant to be used differently?

a-rose commented 1 year ago

I managed to make it work by extracting all code out of stir_shaken_init_ssl and stir_shaken_deinit_ssl into my main function.