randombit / botan

Cryptography Toolkit
https://botan.randombit.net
BSD 2-Clause "Simplified" License

pure virtual method called terminate called without an active exception #809

Closed computereasy closed 6 years ago

computereasy commented 7 years ago

Hello,

I am trying to run the test case "rsa_dec" provided by Botan (version 1.10.13), and got the following error:

./rsa_dec rsapriv.pem message.enc a

 pure virtual method called
terminate called without an active exception
Aborted

Note that I compiled the whole project on Ubuntu 12.04 64-bit with kernel version 3.8.0-44, and it works fine there. However, when I run the same compiled binary on Ubuntu 12.04 64-bit with kernel version 3.2.1, I get the error above.

Since rsa_dec is compiled as a statically linked binary, I used strace to print the sequence of system calls it makes:

execve("./rsa_dec_vul_1_1", ["./rsa_dec_vul_1_1", "rsapriv.pem", "messagefile.enc", "a"], [/* 13 vars */]) = 0
uname({sys="Linux", node="localhost.localdomain", ...}) = 0
brk(0)                                  = 0x97b000
brk(0x97c1c0)                           = 0x97c1c0
arch_prctl(ARCH_SET_FS, 0x97b880)       = 0
set_tid_address(0x97bb50)               = 1627
set_robust_list(0x97bb60, 0x18)         = 0
futex(0x7fff2e539650, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1, NULL, 97b880) = -1 EAGAIN (Resource temporarily unavailable)
rt_sigaction(SIGRTMIN, {0x54ff90, [], SA_RESTORER|SA_SIGINFO, 0x5535e0}, NULL, 8) = 0
rt_sigaction(SIGRT_1, {0x550020, [], SA_RESTORER|SA_RESTART|SA_SIGINFO, 0x5535e0}, NULL, 8) = 0
rt_sigprocmask(SIG_UNBLOCK, [RTMIN RT_1], NULL, 8) = 0
getrlimit(RLIMIT_STACK, {rlim_cur=8192*1024, rlim_max=RLIM_INFINITY}) = 0
readlink("/proc/self/exe", "/gem5_test_set/rsa_botan_64_bit/rsa_dec_vul_1_1", 4096) = 47
brk(0x99d1c0)                           = 0x99d1c0
brk(0x99e000)                           = 0x99e000
access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
futex(0x975cec, FUTEX_WAKE_PRIVATE, 2147483647) = 0
futex(0x975cf8, FUTEX_WAKE_PRIVATE, 2147483647) = 0
mlock(0x7fff2e537f00, 4096)             = 0
munlock(0x7fff2e537f00, 4096)           = 0
brk(0x9c8000)                           = 0x9c8000
mlock(0x997d80, 65536)                  = 0
open("/dev/urandom", O_RDONLY|O_NOCTTY|O_NONBLOCK) = 3
open("/dev/srandom", O_RDONLY|O_NOCTTY|O_NONBLOCK) = -1 ENOENT (No such file or directory)
open("/dev/random", O_RDONLY|O_NOCTTY|O_NONBLOCK) = 4
clock_gettime(CLOCK_REALTIME, {1325376005, 310382356}) = 0
clock_gettime(CLOCK_MONOTONIC, {5, 310453976}) = 0
clock_gettime(0x4 /* CLOCK_??? */, {5, 310520016}) = 0
clock_gettime(0x2 /* CLOCK_??? */, {0, 9573130}) = 0
clock_gettime(0x3 /* CLOCK_??? */, {0, 9576155}) = 0
select(5, [3 4], NULL, NULL, {0, 32000}) = 2 (in [3 4], left {0, 31999})
read(3, "\335qe2\331r#\2\361\207b\320\271\"\5t\177\314\302\347\234\3657,\306\231y\t\362lx\220"..., 64) = 64
read(4, "b!\232E\37Z\305N", 64)         = 8
clock_gettime(CLOCK_REALTIME, {1325376005, 311039632}) = 0
clock_gettime(CLOCK_REALTIME, {1325376005, 311121287}) = 0
clock_gettime(CLOCK_REALTIME, {1325376005, 311223904}) = 0
clock_gettime(CLOCK_REALTIME, {1325376005, 311311116}) = 0
clock_gettime(CLOCK_REALTIME, {1325376005, 311398989}) = 0
clock_gettime(CLOCK_REALTIME, {1325376005, 311481387}) = 0
open("rsapriv.pem", O_RDONLY)           = 5
read(5, "-----BEGIN PRIVATE KEY-----\nMIIE"..., 8191) = 1704
lseek(5, 0, SEEK_SET)                   = 0
read(5, "-----BEGIN PRIVATE KEY-----\nMIIE"..., 8191) = 1704
write(2, "pure virtual method called\n", 27pure virtual method called) = 27
write(2, "terminate called without an acti"..., 45terminate called without an active exception) = 45
rt_sigprocmask(SIG_UNBLOCK, [ABRT], NULL, 8) = 0
tgkill(1627, 1627, SIGABRT)             = 0
--- SIGABRT (Aborted) @ 0 (0) ---
+++ killed by SIGABRT +++
Aborted

Has anyone encountered the same issue? Any suggestions would be greatly appreciated. Thank you!

randombit commented 7 years ago

Sorry, I have no idea why this error would be occurring. It is not something I have encountered before. But TBH the 1.10 branch is 5+ years old and almost EOL, and almost zero development time has been spent on it for at least 3 years.

Since the factor that changes is the kernel version, perhaps the Botan binary calls some syscall that the older kernel does not support. But this guess is not supported by the strace output... maybe a backtrace via gdb would provide additional information.

noloader commented 7 years ago

@randombit,

I'm seeing the issue on Master using a Pine64 dev-board after building with Clang 3.5.0 in both debug and release builds. The library was configured with --cc=clang --cc-abi="-march=armv8-a+crc+crypto -mtune=cortex-a53". The Makefile was modified with sed -i 's|-O3|-g3 -O0|g' Makefile.

I'm guessing (and it's just a guess) it's a bad interaction with Clang, libc++ and either glibc or libstdc++. Or maybe bad code generation like @neverhub uncovered in Issue 515. Neither of @neverhub's work-arounds fixed the issue.

Here's the tail of the tests:

(gdb) r
Starting program: /home/jwalton/botan/botan-test
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/aarch64-linux-gnu/libthread_db.so.1".
Testing Botan 2.0.1 (unreleased, revision git:7eb382e4af1978841176326ed213a7a46e3cb2ad, distribution unspecified)
Starting tests rng:HMAC_DRBG with seed '149A676167836AC2'
AES-128 ran 3474 tests in 677.90 msec all ok
AES-192 ran 4050 tests in 809.25 msec all ok
AES-256 ran 4626 tests in 931.02 msec all ok
Blowfish ran 540 tests in 343.25 msec all ok
CAST-128 ran 369 tests in 68.87 msec all ok
CAST-256 ran 117 tests in 22.59 msec all ok
...
DLIES AES-256/CBC ran 84 tests in 32.23 sec all ok
DLIES AES-256/GCM ran 16 tests in 6.14 sec all ok
DLIES XOR ran 44 tests in 16.87 sec all ok
DLIES XOR ran 12 tests all ok
[New Thread 0x7fb752e1d0 (LWP 17186)]

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fb752e1d0 (LWP 17186)]
0x0000007fb782e04c in ?? () from /usr/lib/aarch64-linux-gnu/libstdc++.so.6

Here's the backtrace. I'm not truncating it; rather, gdb is not producing the full trace.

(gdb) where
#0  0x0000007fb75729e8 in __GI_raise (sig=sig@entry=0x6)
    at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1  0x0000007fb7573cf0 in __GI_abort () at abort.c:89
#2  0x0000007fb77da3fc in __gnu_cxx::__verbose_terminate_handler() ()
   from /usr/lib/aarch64-linux-gnu/libstdc++.so.6
#3  0x0000007fb77d84f4 in ?? () from /usr/lib/aarch64-linux-gnu/libstdc++.so.6
#4  0x0000007fb77d853c in std::terminate() ()
   from /usr/lib/aarch64-linux-gnu/libstdc++.so.6
#5  0x0000007fb77d9124 in __cxa_pure_virtual ()
   from /usr/lib/aarch64-linux-gnu/libstdc++.so.6
#6  0x0000007fb782dfd0 in ?? () from /usr/lib/aarch64-linux-gnu/libstdc++.so.6
#7  0x0000007fb7693e34 in start_thread (arg=0x7fb752e1d0)
    at pthread_create.c:311
#8  0x0000007fb760a4f0 in clone ()
    at ../ports/sysdeps/unix/sysv/linux/aarch64/nptl/../clone.S:96

And:

(gdb) bt full
#0  0x0000007fb75729e8 in __GI_raise (sig=sig@entry=0x6)
    at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
        _sys_result = 0x0
        pd = 0x7fb752e1d0
        pid = <optimized out>
        selftid = 0x33c1
#1  0x0000007fb7573cf0 in __GI_abort () at abort.c:89
        save_stage = 0x2
        act = {__sigaction_handler = {sa_handler = 0x0, sa_sigaction = 0x0},
          sa_mask = {__val = {0x850, 0xf7b0, 0x4054390b56da2ce2, 0x0,
              0x3ff0001e0f265546, 0x0, 0x3fedfd5d4ec89b4d, 0x0,
              0x3ef0c6f7a0b5ed8d, 0x0, 0x3ff0000000000000, 0x0,
              0x3fedf1ab54000000, 0x0, 0xbe26022be36e1450, 0x0}},
          sa_flags = 0xb787b868,
          sa_restorer = 0x7fb77da3fc <__gnu_cxx::__verbose_terminate_handler()+236>}
        sigs = {__val = {0x20, 0x0 <repeats 15 times>}}
#2  0x0000007fb77da3fc in __gnu_cxx::__verbose_terminate_handler() ()
   from /usr/lib/aarch64-linux-gnu/libstdc++.so.6
No symbol table info available.
#3  0x0000007fb77d84f4 in ?? () from /usr/lib/aarch64-linux-gnu/libstdc++.so.6
No symbol table info available.
#4  0x0000007fb77d853c in std::terminate() ()
   from /usr/lib/aarch64-linux-gnu/libstdc++.so.6
No symbol table info available.
#5  0x0000007fb77d9124 in __cxa_pure_virtual ()
   from /usr/lib/aarch64-linux-gnu/libstdc++.so.6
No symbol table info available.
#6  0x0000007fb782dfd0 in ?? () from /usr/lib/aarch64-linux-gnu/libstdc++.so.6
No symbol table info available.
#7  0x0000007fb7693e34 in start_thread (arg=0x7fb752e1d0)
    at pthread_create.c:311
        pd = 0x7fb752e1d0
        unwind_buf = {cancel_jmp_buf = {{jmp_buf = {0x7fb752e1d0,
                0x7fffffbbf8, 0x7fb76b2000, 0x0, 0x7fb76b1000, 0x7fb752e290,
                0x7fb752dae0, 0x7fb7ff56f0, 0x800000, 0x7fb76b62a0,
                0x7fb752d9c0, 0x9a81fc62ea70644a, 0x0, 0x9a81fc62ea4b8382,
                0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0},
              mask_was_saved = 0x0}}, priv = {pad = {0x0, 0x0,
              0x7fb7693d84 <start_thread>, 0x7fb752e1d0}, data = {prev = 0x0,
              cleanup = 0x0, canceltype = 0xb7693d84}}}
        not_first_call = 0x0
        pagesize_m1 = <optimized out>
        sp = <optimized out>
        freesize = <optimized out>
        __PRETTY_FUNCTION__ = "start_thread"
#8  0x0000007fb760a4f0 in clone ()
    at ../ports/sysdeps/unix/sysv/linux/aarch64/nptl/../clone.S:96
No locals.
(gdb)

computereasy commented 7 years ago

@randombit @noloader Hey there,

After a long debugging session earlier this month, I know exactly how to reproduce this "pure virtual function" error and why it happens.

Let me update more information tomorrow.

noloader commented 7 years ago

@computereasy,

I know exactly how to reproduce [the issue]... Let me update more information tomorrow.

Yes, please do. If you have a reproducer, then I can ping a couple of the GCC devs and get them involved.

computereasy commented 7 years ago

@randombit @noloader

The issue I was stuck on turned out to be quite tricky.

I was using a hardware simulator (gem5) to simulate the execution of the rsa_dec example code in Botan. For some reason, RSA's big-integer multiplication malfunctions in the simulation environment; I strongly suspect that the x86 lock instruction prefix (e.g., lock mov eax, [ebx]) is buggy in gem5.

This leads to incorrect big-integer multiplication results; for example, in RSA, p * q != n in this case. That in turn makes the following check fail (Botan version 1.10.13, the stable release):

IF_Scheme_PrivateKey::IF_Scheme_PrivateKey(RandomNumberGenerator& rng,
                                           const AlgorithmIdentifier&,
                                           const MemoryRegion<byte>& key_bits)
   {
   BER_Decoder(key_bits)
      .start_cons(SEQUENCE)
         .decode_and_check<size_t>(0, "Unknown PKCS #1 key format version")
         .decode(n)
         .decode(e)
         .decode(d)
         .decode(p)
         .decode(q)
         .decode(d1)
         .decode(d2)
         .decode(c)
      .end_cons();

   load_check(rng);   // <----- this check fails
   }

This further leads to a failure at the following check:

/*
* Run checks on a generated private key
*/
void Private_Key::gen_check(RandomNumberGenerator& rng) const
   {
   if(!check_key(rng, BOTAN_PRIVATE_KEY_STRONG_CHECKS_ON_GENERATE))
      throw Self_Test_Failure(algo_name() + " private key generation failed");   // <----- execution reaches this line
   }

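Ideally, an exception would be raised at the throw Self_Test_Failure point. However, the check runs from inside a C++ class constructor and calls a virtual function (algo_name() in the throw expression is a likely candidate) that is probably still pure virtual at that stage, because the derived class that overrides it has not been constructed yet. So instead of an exception, the C++ runtime failure "pure virtual method called / terminate called without an active exception" is triggered...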
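
The constructor behavior described above can be illustrated with a minimal sketch. KeyBase, RsaLikeKey and the helper function are hypothetical names, not Botan types. The sketch only shows that, while a base-class constructor is running, a virtual call resolves to the base-class version and not to the derived override. It deliberately uses a non-pure virtual so that it runs to completion; if name() were pure virtual in KeyBase, the same indirect call would be undefined behavior, which with the GCC runtime typically shows up as "pure virtual method called".

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for illustration; these are not Botan classes.
class KeyBase {
public:
    KeyBase() {
        // While KeyBase's constructor runs, the object's dynamic type is
        // still KeyBase, so this reaches KeyBase::name(), never an override.
        name_seen_in_ctor_ = describe();
    }
    virtual ~KeyBase() = default;

    virtual std::string name() const { return "base"; }

    // Non-virtual helper that calls a virtual function, similar in shape to
    // gen_check() calling algo_name() in the code quoted above.
    std::string describe() const { return name(); }

    const std::string& name_seen_in_ctor() const { return name_seen_in_ctor_; }

private:
    std::string name_seen_in_ctor_;
};

class RsaLikeKey : public KeyBase {
public:
    std::string name() const override { return "rsa-like"; }
};

// Returns true when the constructor saw the base version while the
// fully constructed object dispatches to the override.
bool dispatch_differs_during_construction() {
    RsaLikeKey key;
    return key.name_seen_in_ctor() == "base" && key.describe() == "rsa-like";
}
```

Calling assert(dispatch_differs_during_construction()) from a small main() should pass on a conforming compiler.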

randombit commented 7 years ago

Yes, certainly a bug in Botan 1.10 here. If invalid keys are loaded or generated, Botan potentially crashes instead of throwing an exception (depending on the compiler/runtime, probably). These mandatory self-tests were removed entirely in 2.0, so it does not affect new versions.

I don't see any real way to fix this in 1.10 without API changes, so I don't think this bug will be fixed in 1.10.
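
As a rough illustration of the kind of API change this would involve (purely a sketch, not what Botan did; 2.0 removed the mandatory self-tests instead), validation could be moved out of the constructor into a factory function that runs after the object is fully constructed. ToyKey, ToyRsaKey and load_checked are hypothetical names, and the p * q == n test is only a toy stand-in for a real key check.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <stdexcept>
#include <string>

// Hypothetical toy types for illustration; none of this is Botan API.
class ToyKey {
public:
    virtual ~ToyKey() = default;
    virtual std::string algo_name() const = 0;
    virtual bool check_key() const = 0;
};

class ToyRsaKey : public ToyKey {
public:
    // The constructor only stores values and makes no virtual calls.
    ToyRsaKey(std::uint64_t p, std::uint64_t q, std::uint64_t n)
        : p_(p), q_(q), n_(n) {}

    std::string algo_name() const override { return "ToyRSA"; }

    // Toy consistency check: the modulus must equal p * q.
    bool check_key() const override { return p_ * q_ == n_; }

private:
    std::uint64_t p_, q_, n_;
};

// Validation runs after construction has finished, so the virtual calls
// reach the final overrides and the exception propagates normally.
std::unique_ptr<ToyKey> load_checked(std::uint64_t p, std::uint64_t q,
                                     std::uint64_t n) {
    std::unique_ptr<ToyKey> key(new ToyRsaKey(p, q, n));
    if (!key->check_key())
        throw std::invalid_argument(key->algo_name() + ": invalid private key");
    return key;
}
```

With this shape, load_checked(61, 53, 3233) returns a key, while an inconsistent modulus such as 3234 results in std::invalid_argument and not an abort. The cost is that callers must go through the factory, which is why it counts as an API change.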

randombit commented 6 years ago

Closing since there is nothing further we can do: 1.10 is not changing, and in 2.x the crash should not occur.