Closed TJ-91 closed 5 months ago
@TJ-91 Thanks for spotting this! This could be the reason of this discussion: https://github.com/pqc-thunderbird/rnp/pull/35#discussion_r1261076313
Probably valgrind do stricter checks than other CI tools. I'll try to reproduce it and introduce the required fixes.
The same (or a closely related) error can be seen when running the test case rnp_tests.test_ffi_encrypt_pk_with_v6_key
with valgrind.
I also noticed that running the test case rnp_tests.ecdh_decryptionNegativeCases
with Valgrind produces errors in combination with Botan3 but not Botan2 (Note: this is not related to our crypto refresh / PQC changes at all):
==11508== Invalid write of size 1
==11508== at 0x4AB5D24: Botan_FFI::ffi_guard_thunk(char const*, std::function<int ()> const&) (in /opt/botan3/lib/libbotan-3.so.0.1.0)
==11508== by 0x4AE1C4A: botan_rng_destroy (in /opt/botan3/lib/libbotan-3.so.0.1.0)
==11508== by 0x5E1F3C: rnp::RNG::~RNG() (rng.cpp:49)
==11508== by 0x5F415F: rnp::SecurityContext::~SecurityContext() (sec_profile.cpp:204)
==11508== by 0x531F494: __run_exit_handlers (exit.c:113)
==11508== by 0x531F60F: exit (exit.c:143)
==11508== by 0x5303D96: (below main) (libc_start_call_main.h:74)
==11508== Address 0x56dff90 is 0 bytes inside a block of size 35 free'd
==11508== at 0x484BB6F: operator delete(void*, unsigned long) (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==11508== by 0x14D83C: __gnu_cxx::new_allocator<char>::deallocate(char*, unsigned long) (new_allocator.h:145)
==11508== by 0x14A446: deallocate (allocator.h:199)
==11508== by 0x14A446: std::allocator_traits<std::allocator<char> >::deallocate(std::allocator<char>&, char*, unsigned long) (alloc_traits.h:496)
==11508== by 0x148DB5: std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_destroy(unsigned long) (basic_string.h:245)
==11508== by 0x14771D: std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_dispose() (basic_string.h:240)
==11508== by 0x146755: std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::~basic_string() (basic_string.h:672)
==11508== by 0x531FD9E: __call_tls_dtors (cxa_thread_atexit_impl.c:159)
==11508== by 0x531F5C8: __run_exit_handlers (exit.c:46)
==11508== by 0x531F60F: exit (exit.c:143)
==11508== by 0x5303D96: (below main) (libc_start_call_main.h:74)
==11508== Block was alloc'd at
==11508== at 0x4849013: operator new(unsigned long) (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==11508== by 0x4AB59EC: Botan_FFI::ffi_error_exception_thrown(char const*, char const*, int) (in /opt/botan3/lib/libbotan-3.so.0.1.0)
==11508== by 0x49999C4: Botan_FFI::ffi_guard_thunk(char const*, std::function<int ()> const&) [clone .cold] (in /opt/botan3/lib/libbotan-3.so.0.1.0)
==11508== by 0x4AC6C92: botan_nist_kw_dec (in /opt/botan3/lib/libbotan-3.so.0.1.0)
==11508== by 0x5DAF16: ecdh_decrypt_pkcs5(unsigned char*, unsigned long*, pgp_ecdh_encrypted_t const*, pgp_ec_key_t const*, pgp_fingerprint_t const&) (ecdh.cpp:367)
==11508== by 0x1A88D8: rnp_tests_ecdh_decryptionNegativeCases_Test::TestBody() (cipher.cpp:364)
==11508== by 0x68D130: void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) (gtest.cc:2612)
==11508== by 0x6860C0: void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) (gtest.cc:2648)
==11508== by 0x660D8D: testing::Test::Run() (gtest.cc:2687)
==11508== by 0x6618A8: testing::TestInfo::Run() (gtest.cc:2836)
==11508== by 0x6622AE: testing::TestSuite::Run() (gtest.cc:3015)
==11508== by 0x6727D1: testing::internal::UnitTestImpl::RunAllTests() (gtest.cc:5920)
==11508==
==11508==
==11508== HEAP SUMMARY:
==11508== in use at exit: 0 bytes in 0 blocks
==11508== total heap usage: 10,664 allocs, 10,664 frees, 3,400,282 bytes allocated
==11508==
==11508== All heap blocks were freed -- no leaks are possible
==11508==
==11508== For lists of detected and suppressed errors, rerun with: -s
==11508== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
To me it looks like Botan3 in some way changes the memory management for the FFI calls and this could be the underlying issue for all the problems.
ecdh_decryptionNegativeCases
is probably not related.
I now also tried rnp_tests.test_ffi_aead_params
with valgrind and it also has the AEAD-related memory error for both Botan2 and Botan3. Adding crypt->aead.obj = nullptr;
to pgp_cipher_aead_destroy()
fixes it. I'm not confident to make that change, though, as I'm not very familiar with the AEAD code. Perhaps this is something you can look into and fix in the RNP main repo @ni4 ?
@TJ-91 Thanks for the information. File PR here: https://github.com/rnpgp/rnp/pull/2119 Actually, that old-style pgp_crypt_t should be refactored at some point.
to be verified still by MTG
Could this have the same source as this one: https://github.com/randombit/botan/issues/3926 ? Which is fixed via https://github.com/rnpgp/rnp/pull/2195/commits/007ccdbc2adf54ca18708c55b5e10d63edb2496f
It's possible. I can't reproduce the erorr with the current code so I assume it's fixed and I'm closing the issue.
Reproduce (assumes no keys in key store, otherwise have to specify the key when encrypting):
$ ./src/rnpkeys/rnpkeys -g --expert
./src/rnp/rnp -e cleartextfile
Results in a valgrind error, probably related to AEAD as used in SEIPDv2:
When not using AEAD but forcing SEIPDv1 (via
--v3-pkesk-only
flag), there is no error.