MacOS: fails benchmarks

mouse07410 commented 1 year ago

Apple Silicon M2 running MacOS Ventura 13.3.1, Xcode-14.3, Rust-1.69.0.

$ cargo bench --features "benchmarking kyber1024"        
  Downloaded tinytemplate v1.2.1
  Downloaded ciborium-io v0.2.0
.  .  .
   Compiling pqc_kyber v0.6.0 (/Users/ur20980/src/kyber-pure)
   Compiling clap v3.2.24
   Compiling lazy_static v1.4.0
   Compiling ppv-lite86 v0.2.17
   Compiling crossbeam-deque v0.8.3
   Compiling rand_chacha v0.3.1
   Compiling rand v0.8.5
   Compiling rayon v1.7.0
   Compiling plotters v0.3.4
   Compiling ciborium v0.2.0
   Compiling tinytemplate v1.2.1
   Compiling criterion v0.4.0
    Finished bench [optimized] target(s) in 12.63s
     Running unittests src/lib.rs (target/release/deps/pqc_kyber-795659d6ce51cfa5)

running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s

     Running benches/api.rs (target/release/deps/api-52af575b14e196b1)
Gnuplot not found, using plotters backend
Keypair Generation      time:   [30.546 µs 30.712 µs 30.904 µs]
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) high mild

Benchmarking Encapsulate: Warming up for 3.0000 sthread 'main' panicked at 'index out of bounds: the len is 0 but the index is 0', src/reference/fips202.rs:27:11
stack backtrace:
   0: rust_begin_unwind
             at /rustc/84c898d65adf2f39a5a98507f1fe0ce10a2b8dbc/library/std/src/panicking.rs:579:5
   1: core::panicking::panic_fmt
             at /rustc/84c898d65adf2f39a5a98507f1fe0ce10a2b8dbc/library/core/src/panicking.rs:64:14
   2: core::panicking::panic_bounds_check
             at /rustc/84c898d65adf2f39a5a98507f1fe0ce10a2b8dbc/library/core/src/panicking.rs:159:5
   3: pqc_kyber::reference::fips202::keccak_absorb_once
   4: pqc_kyber::reference::fips202::sha3_256
   5: pqc_kyber::kem::crypto_kem_enc
   6: criterion::bencher::Bencher<M>::iter
   7: <criterion::routine::Function<M,F,T> as criterion::routine::Routine<M,T>>::warm_up
   8: criterion::routine::Routine::sample
   9: criterion::analysis::common
  10: criterion::benchmark_group::BenchmarkGroup<M>::bench_function
  11: api::main
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.

error: bench failed, to rerun pass `--bench api`

and

$ cargo bench --features "benchmarking kyber1024" --bench api 
    Finished bench [optimized] target(s) in 0.04s
     Running benches/api.rs (target/release/deps/api-52af575b14e196b1)
Keypair Generation      time:   [31.035 µs 31.092 µs 31.146 µs]
                        change: [-0.9414% -0.3577% +0.2071%] (p = 0.23 > 0.05)
                        No change in performance detected.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild

Benchmarking Encapsulate: Warming up for 3.0000 sthread 'main' panicked at 'index out of bounds: the len is 0 but the index is 0', src/reference/fips202.rs:27:11
stack backtrace:
   0: rust_begin_unwind
             at /rustc/84c898d65adf2f39a5a98507f1fe0ce10a2b8dbc/library/std/src/panicking.rs:579:5
   1: core::panicking::panic_fmt
             at /rustc/84c898d65adf2f39a5a98507f1fe0ce10a2b8dbc/library/core/src/panicking.rs:64:14
   2: core::panicking::panic_bounds_check
             at /rustc/84c898d65adf2f39a5a98507f1fe0ce10a2b8dbc/library/core/src/panicking.rs:159:5
   3: pqc_kyber::reference::fips202::keccak_absorb_once
   4: pqc_kyber::reference::fips202::sha3_256
   5: pqc_kyber::kem::crypto_kem_enc
   6: criterion::bencher::Bencher<M>::iter
   7: <criterion::routine::Function<M,F,T> as criterion::routine::Routine<M,T>>::warm_up
   8: criterion::routine::Routine::sample
   9: criterion::analysis::common
  10: criterion::benchmark_group::BenchmarkGroup<M>::bench_function
  11: api::main
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.

error: bench failed, to rerun pass `--bench api`

mberry commented 1 year ago

Have you resolved this?

Apple make it hard for anyone to troubleshoot issues such as this without their hardware. Have an MacOS emulation suite that can run these benchmarks, albeit very slowly, I can't replicate your issue emulated on x86.

Maybe someone else inside the walled garden can help you here. I'm open to donations of Apple Silicon VPS time for troubleshooting if you are willing.

mouse07410 commented 12 months ago

Have you resolved this?

No, I cannot. I hope that you could resolve it, with me running tests for you.

Maybe someone else inside the walled garden can help you here.

Well, I'm mine and your best bet here. I am "inside the walled garden", but I don't have a clue about what your code is doing, and what's the reason for your code to panic - i.e., what conditions it expected, failed to get satisfied:

running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s

     Running benches/api.rs (target/release/deps/api-b5c6e9d2b892f74a)
Keypair Generation      time:   [30.107 µs 30.175 µs 30.252 µs]
Found 2 outliers among 100 measurements (2.00%)
  1 (1.00%) high mild
  1 (1.00%) high severe

Benchmarking Encapsulate: Warming up for 3.0000 sthread 'main' panicked at 'index out of bounds: the len is 0 but the index is 0', src/reference/fips202.rs:27:11
stack backtrace:
   0: rust_begin_unwind
             at /rustc/90c541806f23a127002de5b4038be731ba1458ca/library/std/src/panicking.rs:578:5
   1: core::panicking::panic_fmt
             at /rustc/90c541806f23a127002de5b4038be731ba1458ca/library/core/src/panicking.rs:67:14
   2: core::panicking::panic_bounds_check
             at /rustc/90c541806f23a127002de5b4038be731ba1458ca/library/core/src/panicking.rs:162:5
   3: pqc_kyber::reference::fips202::keccak_absorb_once
   4: pqc_kyber::reference::fips202::sha3_256
   5: pqc_kyber::kem::crypto_kem_enc
   6: criterion::bencher::Bencher<M>::iter
   7: <criterion::routine::Function<M,F,T> as criterion::routine::Routine<M,T>>::warm_up
   8: criterion::routine::Routine::sample
   9: criterion::analysis::common
  10: criterion::benchmark_group::BenchmarkGroup<M>::bench_function
  11: api::main
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.

error: bench failed, to rerun pass `--bench api`

I'm open to donations of Apple Silicon VPS time for troubleshooting if you are willing

I'm afraid you're mistaking me for the IBM Research VP. Or do you want me to donate you my Apple Silicon laptop (I'd still have to respectfully decline)?

In short. If you're willing to offer suggestions and co-debug with me (you telling what to do, me running code and reporting back), I'm willing to try them out and (hopefully) together fix this issue. If not - well, just mark this code as buggy on Apple Silicon for reasons currently unknown.

Update

The problem seems to be related to criterion or integration with it. Because tests all seem to succeed (unless they don't exercise a certain area that benchmarking does?).

$ cargo test --release
   Compiling pqc_kyber v0.6.0 (/Users/ur20980/src/kyber-pure)
    Finished release [optimized] target(s) in 1.42s
     Running unittests src/lib.rs (target/release/deps/pqc_kyber-a9dfb805acaf04f6)

running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s

     Running tests/kat.rs (target/release/deps/kat-3f77f7a7baad8518)

running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s

     Running tests/kem.rs (target/release/deps/kem-b721734812d9cbe4)

running 8 tests
test keypair_decap_ct_wrong_size ... ok
test keypair_decap_sk_wrong_size ... ok
test keypair_encap_pk_wrong_size ... ok
test keypair_failed_randombytes ... ok
test keypair_encap_failed_randombytes ... ok
test keypair_encap_decap ... ok
test public_from_private ... ok
test keypair_encap_decap_invalid_ciphertext ... ok

test result: ok. 8 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s

     Running tests/kex.rs (target/release/deps/kex-41902445b4a8583c)

running 14 tests
test ake_failed_randombytes_client_init ... ok
test ake_invalid_client_init_ciphertext ... ok
test ake_uake_failed_randombytes_keypair ... ok
test ake_invalid_client_init_publickey ... ok
test ake_invalid_server_send_first_ciphertext ... ok
test ake_invalid_server_send_second_ciphertext ... ok
test uake_failed_randombytes_client_init ... ok
test uake_failed_randombytes_server_receive ... ok
test uake_invalid_client_init_ciphertext ... ok
test ake_valid ... ok
test uake_invalid_client_init_publickey ... ok
test ake_failed_randombytes_server_receive ... ok
test uake_valid ... ok
test uake_invalid_server_send_ciphertext ... ok

test result: ok. 14 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s

   Doc-tests pqc_kyber

running 18 tests
test src/kex.rs - kex::Ake::client_init (line 225) ... ok
test src/kex.rs - kex::Ake::new (line 216) ... ok
test src/kex.rs - kex::Uake::client_init (line 98) ... ok
test src/api.rs - api::keypair (line 12) ... ok
test src/kex.rs - kex::Uake::client_confirm (line 144) ... ok
test src/api.rs - api::Keypair::generate (line 89) ... ok
test src/api.rs - api::encapsulate (line 32) ... ok
test src/kex.rs - kex::Uake (line 45) ... ok
test src/kex.rs - kex::Ake::server_receive (line 246) ... ok
test src/api.rs - api::decapsulate (line 56) ... ok
test src/kex.rs - kex::Ake::client_confirm (line 273) ... ok
test src/kex.rs - kex::Ake (line 170) ... ok
test src/kex.rs - kex::Uake::new (line 89) ... ok
test src/kex.rs - kex::Uake::server_receive (line 119) ... ok
test src/lib.rs - (line 34) ... ok
test src/lib.rs - (line 39) ... ok
test src/lib.rs - (line 59) ... ok
test src/lib.rs - (line 91) ... ok

test result: ok. 18 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 2.18s

mberry commented 12 months ago

Can you add the full backtrace and we'll go from there?

RUST_BACKTRACE=full cargo bench --features "benchmarking kyber1024"

mouse07410 commented 12 months ago

Can you add the full backtrace and we'll go from there?

RUST_BACKTRACE=full cargo bench --features "benchmarking kyber1024"

I'm afraid, not a whole lot:

$ RUST_MACKTRACE=full cargo bench --features "benchmarking kyber1024"
   Compiling pqc_kyber v0.6.0 (/Users/ur20980/src/kyber-pure)
warning: unused import: `std::println`
 --> benches/api.rs:1:39
  |
1 | #![cfg(feature = "benchmarking")] use std::println;
  |                                       ^^^^^^^^^^^^
  |
  = note: `#[warn(unused_imports)]` on by default

warning: `pqc_kyber` (bench "api") generated 1 warning (run `cargo fix --bench "api"` to apply 1 suggestion)
    Finished bench [optimized] target(s) in 1.44s
     Running unittests src/lib.rs (target/release/deps/pqc_kyber-9da025de157ed88e)

running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s

     Running benches/api.rs (target/release/deps/api-b5c6e9d2b892f74a)
Keypair Generation      time:   [30.240 µs 30.325 µs 30.422 µs]
                        change: [+0.8577% +1.5078% +2.2168%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 11 outliers among 100 measurements (11.00%)
  5 (5.00%) high mild
  6 (6.00%) high severe

Benchmarking Encapsulate: Warming up for 3.0000 sthread 'main' panicked at 'index out of bounds: the len is 0 but the index is 0', src/reference/fips202.rs:27:11
stack backtrace:
   0: rust_begin_unwind
             at /rustc/90c541806f23a127002de5b4038be731ba1458ca/library/std/src/panicking.rs:578:5
   1: core::panicking::panic_fmt
             at /rustc/90c541806f23a127002de5b4038be731ba1458ca/library/core/src/panicking.rs:67:14
   2: core::panicking::panic_bounds_check
             at /rustc/90c541806f23a127002de5b4038be731ba1458ca/library/core/src/panicking.rs:162:5
   3: pqc_kyber::reference::fips202::keccak_absorb_once
   4: pqc_kyber::reference::fips202::sha3_256
   5: pqc_kyber::kem::crypto_kem_enc
   6: criterion::bencher::Bencher<M>::iter
   7: <criterion::routine::Function<M,F,T> as criterion::routine::Routine<M,T>>::warm_up
   8: criterion::routine::Routine::sample
   9: criterion::analysis::common
  10: criterion::benchmark_group::BenchmarkGroup<M>::bench_function
  11: api::main
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.

error: bench failed, to rerun pass `--bench api`
$

mberry commented 12 months ago

Hey so that's not the full backtrace, can you copy/paste the code in my comment and try again? There's a typo, you've got macktrace in there

mouse07410 commented 12 months ago

RUST_BACKTRACE=full cargo bench --features "benchmarking kyber1024"

Darn... Silly me... Here it goes:

$ RUST_BACKTRACE=full cargo bench --features "benchmarking kyber1024"
warning: unused import: `std::println`
 --> benches/api.rs:1:39
  |
1 | #![cfg(feature = "benchmarking")] use std::println;
  |                                       ^^^^^^^^^^^^
  |
  = note: `#[warn(unused_imports)]` on by default

warning: `pqc_kyber` (bench "api") generated 1 warning (run `cargo fix --bench "api"` to apply 1 suggestion)
    Finished bench [optimized] target(s) in 0.06s
     Running unittests src/lib.rs (target/release/deps/pqc_kyber-9da025de157ed88e)

running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s

     Running benches/api.rs (target/release/deps/api-b5c6e9d2b892f74a)
Keypair Generation      time:   [30.108 µs 30.194 µs 30.296 µs]
                        change: [-2.3143% -1.6634% -1.0566%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 8 outliers among 100 measurements (8.00%)
  4 (4.00%) high mild
  4 (4.00%) high severe

Benchmarking Encapsulate: Warming up for 3.0000 sthread 'main' panicked at 'index out of bounds: the len is 0 but the index is 0', src/reference/fips202.rs:27:11
stack backtrace:
   0:        0x102dd8af0 - std::backtrace_rs::backtrace::libunwind::trace::h0a647ce7e8dc2fab
                               at /rustc/90c541806f23a127002de5b4038be731ba1458ca/library/std/src/../../backtrace/src/backtrace/libunwind.rs:93:5
   1:        0x102dd8af0 - std::backtrace_rs::backtrace::trace_unsynchronized::hea920694a2a8ac80
                               at /rustc/90c541806f23a127002de5b4038be731ba1458ca/library/std/src/../../backtrace/src/backtrace/mod.rs:66:5
   2:        0x102dd8af0 - std::sys_common::backtrace::_print_fmt::h7b4e20c1da2ebb61
                               at /rustc/90c541806f23a127002de5b4038be731ba1458ca/library/std/src/sys_common/backtrace.rs:65:5
   3:        0x102dd8af0 - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::h819e9cbdf1a9e730
                               at /rustc/90c541806f23a127002de5b4038be731ba1458ca/library/std/src/sys_common/backtrace.rs:44:22
   4:        0x102df500c - core::fmt::write::ha5e9bf3131ecb7c0
                               at /rustc/90c541806f23a127002de5b4038be731ba1458ca/library/core/src/fmt/mod.rs:1254:17
   5:        0x102dd5238 - std::io::Write::write_fmt::h414ce9994bf17404
                               at /rustc/90c541806f23a127002de5b4038be731ba1458ca/library/std/src/io/mod.rs:1698:15
   6:        0x102dd8904 - std::sys_common::backtrace::_print::h28d98f2094da6d1d
                               at /rustc/90c541806f23a127002de5b4038be731ba1458ca/library/std/src/sys_common/backtrace.rs:47:5
   7:        0x102dd8904 - std::sys_common::backtrace::print::h8072db0bbd5bcc3d
                               at /rustc/90c541806f23a127002de5b4038be731ba1458ca/library/std/src/sys_common/backtrace.rs:34:9
   8:        0x102dd9f34 - std::panicking::default_hook::{{closure}}::h2c85c5b0c2ede151
                               at /rustc/90c541806f23a127002de5b4038be731ba1458ca/library/std/src/panicking.rs:269:22
   9:        0x102dd9cf4 - std::panicking::default_hook::hcf2f70992d02f6fe
                               at /rustc/90c541806f23a127002de5b4038be731ba1458ca/library/std/src/panicking.rs:288:9
  10:        0x102dda40c - std::panicking::rust_panic_with_hook::h023af7f90b47eb8b
                               at /rustc/90c541806f23a127002de5b4038be731ba1458ca/library/std/src/panicking.rs:691:13
  11:        0x102dda340 - std::panicking::begin_panic_handler::{{closure}}::h14283519edc1d634
                               at /rustc/90c541806f23a127002de5b4038be731ba1458ca/library/std/src/panicking.rs:582:13
  12:        0x102dd8f10 - std::sys_common::backtrace::__rust_end_short_backtrace::hc366c0b0cef5b747
                               at /rustc/90c541806f23a127002de5b4038be731ba1458ca/library/std/src/sys_common/backtrace.rs:150:18
  13:        0x102dda0d4 - rust_begin_unwind
                               at /rustc/90c541806f23a127002de5b4038be731ba1458ca/library/std/src/panicking.rs:578:5
  14:        0x102e0e2f4 - core::panicking::panic_fmt::h324f50b29db90195
                               at /rustc/90c541806f23a127002de5b4038be731ba1458ca/library/core/src/panicking.rs:67:14
  15:        0x102e0e3f0 - core::panicking::panic_bounds_check::h44029950777bf20f
                               at /rustc/90c541806f23a127002de5b4038be731ba1458ca/library/core/src/panicking.rs:162:5
  16:        0x102c9f994 - pqc_kyber::reference::fips202::keccak_absorb_once::hfa9d2412d0d26918
  17:        0x102c9f09c - pqc_kyber::reference::fips202::sha3_256::h4800c0b5e166de04
  18:        0x102c83ffc - pqc_kyber::kem::crypto_kem_enc::h9105f52213e96543
  19:        0x102c80238 - criterion::bencher::Bencher<M>::iter::h9a21d2291970648c
  20:        0x102c76448 - <criterion::routine::Function<M,F,T> as criterion::routine::Routine<M,T>>::warm_up::h06ed0708b2add3df
  21:        0x102c78434 - criterion::routine::Routine::sample::hd0cc3a147cece1e2
  22:        0x102c7cd90 - criterion::analysis::common::hf8f90fd0142b16bc
  23:        0x102c9c464 - criterion::benchmark_group::BenchmarkGroup<M>::bench_function::h9c1ecdb2dd66f724
  24:        0x102c8e574 - api::main::h24ba327322e39238
  25:        0x102c8c27c - std::sys_common::backtrace::__rust_begin_short_backtrace::hb8f066ae8cca2bb1
  26:        0x102c8c424 - std::rt::lang_start::{{closure}}::hf60817bdacc7b46c
  27:        0x102dd1164 - core::ops::function::impls::<impl core::ops::function::FnOnce<A> for &F>::call_once::h6f7eb9f266759f90
                               at /rustc/90c541806f23a127002de5b4038be731ba1458ca/library/core/src/ops/function.rs:287:13
  28:        0x102dd1164 - std::panicking::try::do_call::h54b2febb9ea02379
                               at /rustc/90c541806f23a127002de5b4038be731ba1458ca/library/std/src/panicking.rs:485:40
  29:        0x102dd1164 - std::panicking::try::h95a2f9f45aeb75ea
                               at /rustc/90c541806f23a127002de5b4038be731ba1458ca/library/std/src/panicking.rs:449:19
  30:        0x102dd1164 - std::panic::catch_unwind::h9686256fa0fc97a1
                               at /rustc/90c541806f23a127002de5b4038be731ba1458ca/library/std/src/panic.rs:140:14
  31:        0x102dd1164 - std::rt::lang_start_internal::{{closure}}::h227e8b10bc4e486b
                               at /rustc/90c541806f23a127002de5b4038be731ba1458ca/library/std/src/rt.rs:148:48
  32:        0x102dd1164 - std::panicking::try::do_call::h414d500a3ee5fa44
                               at /rustc/90c541806f23a127002de5b4038be731ba1458ca/library/std/src/panicking.rs:485:40
  33:        0x102dd1164 - std::panicking::try::h4f025820961f1c3f
                               at /rustc/90c541806f23a127002de5b4038be731ba1458ca/library/std/src/panicking.rs:449:19
  34:        0x102dd1164 - std::panic::catch_unwind::h0b71dfe3538d125d
                               at /rustc/90c541806f23a127002de5b4038be731ba1458ca/library/std/src/panic.rs:140:14
  35:        0x102dd1164 - std::rt::lang_start_internal::h8ee16b8f6c950a26
                               at /rustc/90c541806f23a127002de5b4038be731ba1458ca/library/std/src/rt.rs:148:20
  36:        0x102c8e8d8 - _main

error: bench failed, to rerun pass `--bench api`
$

mouse07410 commented 11 months ago

Ping?

mberry commented 11 months ago

I'll put aside some time to work on it this weekend.

mberry commented 11 months ago

Resolved in #93

mouse07410 commented 11 months ago

I confirm that this problem is fixed. Thanks!

Argyle-Software / kyber

MacOS: fails benchmarks #85

Update