pqc-thunderbird / rnp

Manual clone of the repository https://github.com/rnpgp/rnp
Other
0 stars 0 forks source link

Valgrind Error when using AEAD Encryption via CLI #40

Closed TJ-91 closed 5 months ago

TJ-91 commented 1 year ago

Reproduce (assumes no keys in key store, otherwise have to specify the key when encrypting):

  1. Generate v6 Key (X25519 or PQC): $ ./src/rnpkeys/rnpkeys -g --expert
  2. encrypt with this key via ./src/rnp/rnp -e cleartextfile

Results in a valgrind error, probably related to AEAD as used in SEIPDv2:

==26109== Invalid read of size 4
==26109==    at 0x4AC2CCC: std::_Function_handler<int (), Botan_FFI::ffi_delete_object<Botan::Cipher_Mode, 3030564764u>(Botan_FFI::botan_struct<Botan::Cipher_Mode, 3030564764u>*, char const*)::{lambda()#1}>::_M_invoke(std::_Any_data const&) (in /opt/botan3/lib/libbotan-3.so.0.1.0)
==26109==    by 0x4AB5D33: Botan_FFI::ffi_guard_thunk(char const*, std::function<int ()> const&) (in /opt/botan3/lib/libbotan-3.so.0.1.0)
==26109==    by 0x4AC157A: botan_cipher_destroy (in /opt/botan3/lib/libbotan-3.so.0.1.0)
==26109==    by 0x201B5F: pgp_cipher_aead_destroy(pgp_crypt_t*) (symmetric.cpp:616)
==26109==    by 0x1D0207: encrypted_dst_close(pgp_dest_t*, bool) (stream-write.cpp:543)
==26109==    by 0x18232B: dst_close(pgp_dest_t*, bool) (stream-common.cpp:728)
==26109==    by 0x1D6693: rnp_encrypt_sign_src(pgp_write_handler_t*, pgp_source_t*, pgp_dest_t*) (stream-write.cpp:2091)
==26109==    by 0x2520AC: rnp_op_encrypt_execute (rnp.cpp:2872)
==26109==    by 0x13AC84: cli_rnp_encrypt_and_sign(rnp_cfg const&, cli_rnp_t*, rnp_input_st*, rnp_output_st*) (fficli.cpp:2869)
==26109==    by 0x13BA92: cli_rnp_protect_file(cli_rnp_t*) (fficli.cpp:2943)
==26109==    by 0x126F91: rnp_cmd(cli_rnp_t*) (rnp.cpp:261)
==26109==    by 0x12AE6D: main (rnp.cpp:707)
==26109==  Address 0x55fab58 is 8 bytes inside a block of size 56 free'd
==26109==    at 0x484BB6F: operator delete(void*, unsigned long) (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==26109==    by 0x4AC2CDE: std::_Function_handler<int (), Botan_FFI::ffi_delete_object<Botan::Cipher_Mode, 3030564764u>(Botan_FFI::botan_struct<Botan::Cipher_Mode, 3030564764u>*, char const*)::{lambda()#1}>::_M_invoke(std::_Any_data const&) (in /opt/botan3/lib/libbotan-3.so.0.1.0)
==26109==    by 0x4AB5D33: Botan_FFI::ffi_guard_thunk(char const*, std::function<int ()> const&) (in /opt/botan3/lib/libbotan-3.so.0.1.0)
==26109==    by 0x4AC157A: botan_cipher_destroy (in /opt/botan3/lib/libbotan-3.so.0.1.0)
==26109==    by 0x201B5F: pgp_cipher_aead_destroy(pgp_crypt_t*) (symmetric.cpp:616)
==26109==    by 0x1CFFE6: encrypted_dst_finish(pgp_dest_t*) (stream-write.cpp:507)
==26109==    by 0x1822B6: dst_finish(pgp_dest_t*) (stream-common.cpp:712)
==26109==    by 0x1D5CC0: process_stream_sequence(pgp_source_t*, pgp_dest_t*, unsigned int, pgp_dest_t*, pgp_dest_t*) (stream-write.cpp:1951)
==26109==    by 0x1D6634: rnp_encrypt_sign_src(pgp_write_handler_t*, pgp_source_t*, pgp_dest_t*) (stream-write.cpp:2088)
==26109==    by 0x2520AC: rnp_op_encrypt_execute (rnp.cpp:2872)
==26109==    by 0x13AC84: cli_rnp_encrypt_and_sign(rnp_cfg const&, cli_rnp_t*, rnp_input_st*, rnp_output_st*) (fficli.cpp:2869)
==26109==    by 0x13BA92: cli_rnp_protect_file(cli_rnp_t*) (fficli.cpp:2943)
==26109==  Block was alloc'd at
==26109==    at 0x4849013: operator new(unsigned long) (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==26109==    by 0x4AC1201: std::_Function_handler<int (), botan_cipher_init::{lambda()#1}>::_M_invoke(std::_Any_data const&) (in /opt/botan3/lib/libbotan-3.so.0.1.0)
==26109==    by 0x4AB5D33: Botan_FFI::ffi_guard_thunk(char const*, std::function<int ()> const&) (in /opt/botan3/lib/libbotan-3.so.0.1.0)
==26109==    by 0x4AC14C4: botan_cipher_init (in /opt/botan3/lib/libbotan-3.so.0.1.0)
==26109==    by 0x201265: pgp_cipher_aead_init(pgp_crypt_t*, pgp_symm_alg_t, pgp_aead_alg_t, unsigned char const*, bool) (symmetric.cpp:477)
==26109==    by 0x1D1D17: encrypted_start_aead(pgp_dest_encrypted_param_t*, unsigned char*) (stream-write.cpp:998)
==26109==    by 0x1D26D2: init_encrypted_dst(pgp_write_handler_t*, pgp_dest_t*, pgp_dest_t*) (stream-write.cpp:1143)
==26109==    by 0x1D63FF: rnp_encrypt_sign_src(pgp_write_handler_t*, pgp_source_t*, pgp_dest_t*) (stream-write.cpp:2057)
==26109==    by 0x2520AC: rnp_op_encrypt_execute (rnp.cpp:2872)
==26109==    by 0x13AC84: cli_rnp_encrypt_and_sign(rnp_cfg const&, cli_rnp_t*, rnp_input_st*, rnp_output_st*) (fficli.cpp:2869)
==26109==    by 0x13BA92: cli_rnp_protect_file(cli_rnp_t*) (fficli.cpp:2943)
==26109==    by 0x126F91: rnp_cmd(cli_rnp_t*) (rnp.cpp:261)

When not using AEAD but forcing SEIPDv1 (via --v3-pkesk-only flag), there is no error.

ni4 commented 1 year ago

@TJ-91 Thanks for spotting this! This could be the reason of this discussion: https://github.com/pqc-thunderbird/rnp/pull/35#discussion_r1261076313

Probably valgrind do stricter checks than other CI tools. I'll try to reproduce it and introduce the required fixes.

TJ-91 commented 1 year ago

The same (or a closely related) error can be seen when running the test case rnp_tests.test_ffi_encrypt_pk_with_v6_key with valgrind.

I also noticed that running the test case rnp_tests.ecdh_decryptionNegativeCases with Valgrind produces errors in combination with Botan3 but not Botan2 (Note: this is not related to our crypto refresh / PQC changes at all):

==11508== Invalid write of size 1
==11508==    at 0x4AB5D24: Botan_FFI::ffi_guard_thunk(char const*, std::function<int ()> const&) (in /opt/botan3/lib/libbotan-3.so.0.1.0)
==11508==    by 0x4AE1C4A: botan_rng_destroy (in /opt/botan3/lib/libbotan-3.so.0.1.0)
==11508==    by 0x5E1F3C: rnp::RNG::~RNG() (rng.cpp:49)
==11508==    by 0x5F415F: rnp::SecurityContext::~SecurityContext() (sec_profile.cpp:204)
==11508==    by 0x531F494: __run_exit_handlers (exit.c:113)
==11508==    by 0x531F60F: exit (exit.c:143)
==11508==    by 0x5303D96: (below main) (libc_start_call_main.h:74)
==11508==  Address 0x56dff90 is 0 bytes inside a block of size 35 free'd
==11508==    at 0x484BB6F: operator delete(void*, unsigned long) (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==11508==    by 0x14D83C: __gnu_cxx::new_allocator<char>::deallocate(char*, unsigned long) (new_allocator.h:145)
==11508==    by 0x14A446: deallocate (allocator.h:199)
==11508==    by 0x14A446: std::allocator_traits<std::allocator<char> >::deallocate(std::allocator<char>&, char*, unsigned long) (alloc_traits.h:496)
==11508==    by 0x148DB5: std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_destroy(unsigned long) (basic_string.h:245)
==11508==    by 0x14771D: std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_dispose() (basic_string.h:240)
==11508==    by 0x146755: std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::~basic_string() (basic_string.h:672)
==11508==    by 0x531FD9E: __call_tls_dtors (cxa_thread_atexit_impl.c:159)
==11508==    by 0x531F5C8: __run_exit_handlers (exit.c:46)
==11508==    by 0x531F60F: exit (exit.c:143)
==11508==    by 0x5303D96: (below main) (libc_start_call_main.h:74)
==11508==  Block was alloc'd at
==11508==    at 0x4849013: operator new(unsigned long) (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==11508==    by 0x4AB59EC: Botan_FFI::ffi_error_exception_thrown(char const*, char const*, int) (in /opt/botan3/lib/libbotan-3.so.0.1.0)
==11508==    by 0x49999C4: Botan_FFI::ffi_guard_thunk(char const*, std::function<int ()> const&) [clone .cold] (in /opt/botan3/lib/libbotan-3.so.0.1.0)
==11508==    by 0x4AC6C92: botan_nist_kw_dec (in /opt/botan3/lib/libbotan-3.so.0.1.0)
==11508==    by 0x5DAF16: ecdh_decrypt_pkcs5(unsigned char*, unsigned long*, pgp_ecdh_encrypted_t const*, pgp_ec_key_t const*, pgp_fingerprint_t const&) (ecdh.cpp:367)
==11508==    by 0x1A88D8: rnp_tests_ecdh_decryptionNegativeCases_Test::TestBody() (cipher.cpp:364)
==11508==    by 0x68D130: void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) (gtest.cc:2612)
==11508==    by 0x6860C0: void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) (gtest.cc:2648)
==11508==    by 0x660D8D: testing::Test::Run() (gtest.cc:2687)
==11508==    by 0x6618A8: testing::TestInfo::Run() (gtest.cc:2836)
==11508==    by 0x6622AE: testing::TestSuite::Run() (gtest.cc:3015)
==11508==    by 0x6727D1: testing::internal::UnitTestImpl::RunAllTests() (gtest.cc:5920)
==11508== 
==11508== 
==11508== HEAP SUMMARY:
==11508==     in use at exit: 0 bytes in 0 blocks
==11508==   total heap usage: 10,664 allocs, 10,664 frees, 3,400,282 bytes allocated
==11508== 
==11508== All heap blocks were freed -- no leaks are possible
==11508== 
==11508== For lists of detected and suppressed errors, rerun with: -s
==11508== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)

To me it looks like Botan3 in some way changes the memory management for the FFI calls and this could be the underlying issue for all the problems.

TJ-91 commented 1 year ago

ecdh_decryptionNegativeCases is probably not related.

I now also tried rnp_tests.test_ffi_aead_params with valgrind and it also has the AEAD-related memory error for both Botan2 and Botan3. Adding crypt->aead.obj = nullptr; to pgp_cipher_aead_destroy() fixes it. I'm not confident to make that change, though, as I'm not very familiar with the AEAD code. Perhaps this is something you can look into and fix in the RNP main repo @ni4 ?

ni4 commented 1 year ago

@TJ-91 Thanks for the information. File PR here: https://github.com/rnpgp/rnp/pull/2119 Actually, that old-style pgp_crypt_t should be refactored at some point.

falko-strenzke commented 5 months ago

to be verified still by MTG

ni4 commented 5 months ago

Could this have the same source as this one: https://github.com/randombit/botan/issues/3926 ? Which is fixed via https://github.com/rnpgp/rnp/pull/2195/commits/007ccdbc2adf54ca18708c55b5e10d63edb2496f

TJ-91 commented 5 months ago

It's possible. I can't reproduce the erorr with the current code so I assume it's fixed and I'm closing the issue.