Open drewmacrae opened 1 year ago
This was working when it was a test_rom test by default, but didn't get marked as a test_rom test when aff3a6d0f164dab4f41aad35a233a1701b6b1ff8 enabled rom tests more generally.
This test should be marked as a default "short" test, and should target only the test_rom
~Bisecting finds commit 68d16cf7c32b9edbc19828b90aa2ea9ad4833506 (HEAD) Author: Eitan Shapira eitanshapira89@gmail.com Date: Thu Oct 20 17:30:55 2022 +0300~
~Small updates to flash_ctrl_rand_ops_base_vseq~
~But that seems wrong. Checking out the parent will not make it pass, so it's breaking, and not as hermetic as I'd like. I'm going to try rebooting to see if that helps, but we may be seeing some bitstream cache dependency here.~ (This is a red herring AFAIK)
This test is definitely flaky on the cw310 I've seen
commit bd9f6d0199af9e3358289c0b3a3900921afe8d02 (HEAD)
Author: Timothy Chen <timothytim@google.com>
Date: Fri Oct 21 11:10:12 2022 -0700
both pass and fail. I wonder if there's an interaction with the sam3 in this test, though that it fails in verilator indicates that's not the only problem.
Is it flaky in verilator?
Hmm, bisecting with verilator is showing some problems with:
[e5de5869cfd24ec6d23d19b2f75e89f7d9a61b7d] [test_rom] enable icache / Ibex security features
Makes sense that this would affect the number of cycles to time the operation. (the aes block is probably still accomplishing it in constant time, but the software that times it is probably accelerated by cache hits)
@alphan @jadephilipoom If the icache is causing things not to be constant time, should this test disable the icache? Should it prime the icache by running in a loop? What's OpenTitan's stance on the icache and constant time?
@alphan @jadephilipoom If the icache is causing things not to be constant time, should this test disable the icache? Should it prime the icache by running in a loop? What's OpenTitan's stance on the icache and constant time?
I think this is the first time we have encountered this issue in tests since we enabled icache
only recently. @jadephilipoom you've probably thought about this and was expecting this but I'm not sure if the cryptolib
has any policy on this yet. Obviously, it feels like disabling icache
is the safest option. cryptolib
should probably do this in a push/pop fashion, i.e. store current config, disable, restore orig config, in order not to affect the overall performance of the system.
I would expect the icache doesn't affect the block and that the constant-time property is supposed to be of-the-block, and that the test should have the icache off, so it can assess the timing of the block behavior better. So all our constant time tests should have it off.
#15832 asks the separate question about cryptolib.
My feeling is that the cryptolib should disable icache to protect against cache-timing attacks, and the push/pop method that @alphan mentioned seems appropriate. It's fine with me to disable the icache in this test for now; once I've properly connected the AES-GCM implementation to the top-level API, I'll disable it in the top-level code and won't need to specifically disable it for the test.
@msfschaffner Hey michael, is this task marked to M3 or TEST Triage
?
this should probably be M3. This isn't a functional issue per se, nor does there need to be a new test. More of a test / driver update (where appropriate) to disable the icache prior to running crypto commands on ibex.
Sorry, looks like, I didn't update the tags after performing the triage.
Since it was mentioned in SW WG today, summarizing from above: This is not a HW issue and will be fixed in cryptolib SW by disabling icache during the ibex-based part of AES-GCM (GHASH) processing. Can thus be postponed to M2.5.3 or M3 likely.
moved to M2.5.5 which is the “cryptolib” Milestone
Hierarchy of regression failure
Chip Level
Failure Description
Looks like something broke this test in master while it is lacking coverage in CI (it was incorrectly marked as long to save resources in CI)
I00002 aes_gcm_testutils.c:86] aes_gcm_decrypt took 19221 cycles^M^M I00003 aes_gcm_testutils.c:86] aes_gcm_decrypt took 19034 cycles^M^M I00004 aes_gcm_testutils.c:86] aes_gcm_decrypt took 19041 cycles^M^M E00005 aes_gcm_timing_test.c:44] CHECK-fail: AES-GCM decryption was not constant-time for different invalid tags^M^M I00006 status.c:34] FAIL!^M^M
Steps to Reproduce
bazel test //sw/device/tests/crypto:aes_gcm_timing_test
Tests with similar or related failures
No response