lowRISC / opentitan

OpenTitan: Open source silicon root of trust
https://www.opentitan.org
Apache License 2.0

[entropy_src] Need support for testing health tests #22289

Closed vsukhoml closed 6 months ago

vsukhoml commented 7 months ago

Description

Part of FIPS certification would include operational testing to ensure that the health tests for entropy work. This is usually done by simulating a fault at the entropy source (e.g., all zeroes, or a specific pattern simulating its fault mode). The FW Override mode as currently implemented takes effect only after the health checks. One solution would be to have a similar feature before the health tests, so that firmware could demonstrate to the evaluation lab that the health checks work.

This feature is also useful for NIST entropy assessment, where raw data should be used to estimate raw entropy and configure the health check thresholds appropriately. @vogelpi , @johannheyszl
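
As a rough illustration of the kind of fault simulation described above, the following Python sketch runs a software repetition count test (RCT) over a healthy-looking stream and over a stuck-at-zero stream. The function name, the 4-bit sample width, the seed and the cutoff value are assumptions chosen for illustration; this is not the ENTROPY_SRC hardware implementation.

```python
import random


def repetition_count_test(samples, cutoff):
    """Return the index at which the RCT fails, or None if it passes.

    The test fails as soon as the same value has been observed
    `cutoff` times in a row.
    """
    run_value = None
    run_length = 0
    for index, sample in enumerate(samples):
        if sample == run_value:
            run_length += 1
        else:
            run_value = sample
            run_length = 1
        if run_length >= cutoff:
            return index
    return None


# Hypothetical parameters, for illustration only.
CUTOFF = 21
rng = random.Random(1234)
healthy = [rng.randrange(16) for _ in range(4096)]  # 4-bit samples
stuck_at_zero = [0] * 4096                           # simulated fault mode

healthy_result = repetition_count_test(healthy, CUTOFF)
stuck_result = repetition_count_test(stuck_at_zero, CUTOFF)
```

With a stuck-at-zero input the test trips on the sample that completes a run of `CUTOFF` identical values, which is the behavior a lab would want to see demonstrated.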

vsukhoml commented 7 months ago

Related issue #20953

vogelpi commented 7 months ago

@vsukhoml , I am reworking the ENTROPY_SRC documentation to clarify some of these aspects. I'll ping you in the PR for feedback.

In general, the health tests don't alter the entropy. Whether you extract the entropy in firmware override mode before or after the health tests makes no difference.

vogelpi commented 7 months ago

As for testing the health tests themselves: this is the job of DV. There is a software model of the health tests in DV that gets compared to the hardware health test outputs in the scoreboard. If there are mismatches, DV fails.

vsukhoml commented 7 months ago

The question is how you would use DV tests to convince the evaluation lab that your firmware can rely on it. The lab will do operational testing, asking you to simulate health check failures by feeding in simulated bad entropy. They video-record the process in case of questions or audits. Practically, I don't know how to use DV data for that. Is any lab involved at this stage to collect data confirming correct operation of the health checks? If not, it will be hard to use the presence of DV tests to validate the module, and we end up relying on firmware tests if a FIPS cert is needed. So it depends on the desired claims for OT.

Given raw access to entropy (unchanged, unfiltered), we can have a firmware workaround, so this is not critical and only has performance implications. But then why do we have all this HW?

johannheyszl commented 7 months ago

cc @zi-v pls to have a look too

vsukhoml commented 7 months ago

FIPS 140-3 IG on page 149, Health tests states:

The tester shall verify that all the vendor-identified known or suspected noise source failure modes are detected by the continuous health tests included within the entropy source. (See SP 800-90B Section 4.3, Requirements #1, #7, #8 and #9).

We need to be able to provide means for this.

zi-v commented 7 months ago

Since the analog samples can be extracted via the observe FIFO, the health test models can run offline if needed. However, I do agree that binding everything to HW has both cost and risk. Today we feel we can pass FIPS with the currently implemented health tests, but in the future FIPS or the lab may decide that we need to augment them with more tests. Maybe we can consider the SW bypass as the main route and think about taking out some of the hardware. Also, from our experience, FIPS-grade entropy is not massively needed and can be managed by keeping a buffer and generating bit chunks on demand, using the startup health test each time.
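
To make the "run the health test models offline" idea concrete, here is a minimal Python sketch of an adaptive proportion test (APT) applied to a list of captured samples. The sample lists stand in for data read out of the observe FIFO; the window size, cutoff and function name are assumptions for illustration and do not reflect the ENTROPY_SRC configuration.

```python
def adaptive_proportion_test(samples, window, cutoff):
    """Return the start indexes of all windows that fail the APT.

    For each non-overlapping window, the first sample is the reference
    value. The window fails if the reference value occurs at least
    `cutoff` times in the window (counting the first sample itself).
    """
    failures = []
    for start in range(0, len(samples) - window + 1, window):
        block = samples[start:start + window]
        reference = block[0]
        count = sum(1 for sample in block if sample == reference)
        if count >= cutoff:
            failures.append(start)
    return failures


# Hypothetical captured data: one well-spread window, one biased window.
WINDOW = 512
CUTOFF = 100
good_window = [i % 16 for i in range(WINDOW)]
biased_window = [5 if i % 4 != 3 else (i // 4) % 16 for i in range(WINDOW)]

failing_windows = adaptive_proportion_test(good_window + biased_window, WINDOW, CUTOFF)
```

Note that the APT only counts occurrences of the first value in each window, so a bias toward a value that does not happen to open the window is only caught in a later window.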

vsukhoml commented 7 months ago

We need to demonstrate to the lab that the HW health checks do work. How do we do this during FIPS certification of the firmware? It is a combination of process and technical issues. Can we prepare evidence as part of DV testing? I don't know.

johannheyszl commented 7 months ago

Let me try to summarize:

My guess is:

cc @moidx

vogelpi commented 7 months ago

Thanks for your summary @johannheyszl . Yes, I believe it's a big advantage of the open source project to easily share the RTL source code (including DV environment and DV model) such that labs can inspect and convince themselves that it is correct.

If absolutely needed, we could add a mux close to the RNG input and a new register for software to inject 4-bit values there. Because this is a security risk, we would harden the configuration setting and maybe, if enabled, prevent the ENTROPY_SRC from outputting anything to downstream consumers / CSRNG on the hardware interface. However, we need to be 100% sure this is absolutely needed, because it means additional work and we have an extremely tight schedule. Any additional work not already captured in the milestones puts this at risk.
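
The proposal above can be sketched as a small behavioral model in Python. Everything here is hypothetical: the class name, the field names and the gating rule are assumptions made to illustrate the idea of a 2-to-1 mux in front of the health tests, and they do not correspond to any existing ENTROPY_SRC register or signal.

```python
from dataclasses import dataclass


@dataclass
class InjectMuxModel:
    """Behavioral model of a hypothetical injection mux at the RNG input."""

    inject_enable: bool = False  # hypothetical hardened configuration bit
    inject_value: int = 0        # hypothetical 4-bit software-written register

    def health_test_input(self, rng_sample: int) -> int:
        """Select the 4-bit value that is fed to the health tests."""
        selected = self.inject_value if self.inject_enable else rng_sample
        return selected & 0xF

    def hw_output_allowed(self) -> bool:
        """Block the hardware interface to consumers while injecting."""
        return not self.inject_enable


# Normal operation: RNG samples pass through, hardware output is allowed.
normal = InjectMuxModel()
normal_input = normal.health_test_input(0xA)

# Test mode: software injects a stuck-at-zero pattern, output is blocked.
test_mode = InjectMuxModel(inject_enable=True, inject_value=0x0)
injected_input = test_mode.health_test_input(0xA)
```

Gating the hardware output whenever injection is enabled is what keeps software-chosen values from ever reaching downstream consumers in this model.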

moidx commented 7 months ago

I agree with @zi-v in that FIPS mode entropy is not massively needed at runtime.

We conflated the need to have some health checks in the hardware entropy pipeline with the definition of FIPS entropy. We now have low confidence in being able to maintain FIPS certification over time without requiring SW bypass (FW OVR) mode due to evolving certification requirements and flagged risks during integration testing.

So far our experience has been with certifying software-based health checks, so we have no prior experience with this approach. I obtained feedback from someone more familiar with FIPS certification in hardware, and their feedback pointed towards implementing test interfaces as requested in this issue. This is because the lab may want to ensure that the hardware is implemented as defined in the RTL or model. This also depends on the physical boundary definition of the FIPS module, which for us is very likely to be the IC package, as well as the targeted certification level.

@vogelpi what is your intuition in terms of complexity and area?

During issue triage we can provide feedback on whether or not we are able to absorb this change based on other priorities.

vogelpi commented 7 months ago

Thanks for your feedback @moidx . Implementation complexity is low and area as well. It's in the order of 8 FFs and 4 2-to-1 MUXes.

Compared to all the other things that were raised, this one here is probably the most feasible to implement (and verify). And I see that decisions have been taken not to implement the other feature requests. Under these conditions, I think we should be able to absorb this change.

johannheyszl commented 7 months ago

Note to add: NIST provides docs on entropy validation of which one is a sheet listing 90B Shall Statements. Rows 76, 119, 124, 125 seem relevant, but not specifically conclusive to answer the question.

vogelpi commented 7 months ago

Waiting for feedback from people with hardware FIPS experience.

vogelpi commented 7 months ago

I now got feedback from someone having experience with this. It doesn't seem necessary to have an option for doing KATs of the health tests for validation testing.

vsukhoml commented 7 months ago

@vogelpi what is needed then to do operational testing to demonstrate working health tests? It is not called a KAT, to avoid confusion. It is proof for the lab that the tests do work. It can be done with custom firmware, but it should use the same implementation of the tests. I'd check with the lab on what is enough.

vogelpi commented 7 months ago

I don't know how it's actually done in practice. But the feedback I got is that entropy source blocks usually don't have an interface to push in known entropy before the health tests. If you can check with the lab on what is needed, that would be great, thanks.

johngt commented 7 months ago

@vsukhoml / @moidx - just chasing up on @vogelpi's behalf. It would be good to confirm whether this is needed and, if so, what specifically.

vsukhoml commented 7 months ago

@johngt , let me try to clarify step by step.

FIPS 140-3 literally says: "The tester shall verify that all the vendor-identified known or suspected noise source failure modes are detected by the continuous health tests included within the entropy source. (See SP 800-90B Section 4.3, Requirements 1, 7, 8 and 9)."

From SP 800-90B, Section 4.3:

  7. The submitter shall provide documentation that specifies all entropy source health tests and their rationale. The documentation shall include a description of the health tests, source code, the rate and conditions under which each health test is performed (e.g., at power-up, continuously, or on-demand), and include rationale indicating why each test is believed to be appropriate for detecting one or more failures in the noise source.
  8. The submitter shall provide documentation of any known or suspected noise source failure modes (e.g., the noise source starts producing periodic outputs like 101…01), and shall include developer-defined continuous tests to detect those failures. These should include potential failure modes that might be caused by an attack on the device.
  9. Appropriate health tests that are tailored to the noise source should place special emphasis on the detection of misbehavior near the boundary between the nominal operating environment and abnormal conditions. This requires a thorough understanding of the operation of the noise source.

The Entropy Validation Documents part lists some templates used by the lab, specifically the Entropy Assessment Report Template v1.1, which is what the lab has to fill in and submit to NIST. Below is the table with some related requirements.

| ID | SP 800-90B Section and Location | Statement | Requirement Status |
| -- | -- | -- | -- |
| 47 | §3.2.3 Requirement 2 | If the entropy source uses a vetted conditioning component as listed in Section 3.1.5.1.1, the implementation of the component shall be tested to obtain assurance of correctness before subsequent testing of the entropy source. | Required |
| 55 | §3.2.4 Requirement 2 | Data collected from the noise source for validation testing shall be raw output values | Required |
| 58 | §3.2.4 Requirement 5 | Data shall be collected from the entropy source under validation. Any relevant version of the hardware or software updates shall be associated with the data. | Required |
| 65 | §4.3 Requirement 1 | If developer-defined health tests are used in place of any of the approved health tests, the tester shall verify that the implemented tests detect the failure conditions detected by the approved continuous health tests, as described in Section 4.4. | Required |
| 68 | §4.3 Requirement 2 | If the entropy source detects intermittent failures and allows the noise source to return to normal functioning, the designer shall provide evidence that: a) The intermittent failures handled in this way are indeed extremely likely to be intermittent failures; and b) the tests will detect a permanent failure when one occurs, and will ultimately signal an error condition to the consuming application and cease operation. | Required |
| 69 | §4.3 Requirement 2 | In the case where a persistent failure is detected, the entropy source shall not produce any outputs. | Required |
| 71 | §4.3 Requirement 4 | The entropy source's startup tests shall run the continuous health tests over at least 1024 consecutive samples. | Required |
| 72 | §4.3 Requirement 5 | The entropy source shall support on-demand testing. The on-demand tests shall include at least the same testing done by the start-up tests. | Required |
| 74 | §4.3 Requirement 6 | Health tests shall be performed on the noise source samples before any conditioning is done. | Required |
| 76 | §4.3 Requirement 7 | The documentation shall include a description of the health tests, source code, the rate and conditions under which each health test is performed (e.g., at power-up, continuously, or on-demand), and include rationale indicating why each test is believed to be appropriate for detecting one or more failures in the noise source. | Required |
| 77 | §4.3 Requirement 8 | The submitter shall provide documentation of any known or suspected noise source failure modes (e.g., the noise source starts producing periodic outputs like 101…01), and shall include developer-defined continuous tests to detect those failures. | Optional |
| 80 | §4.5 Criteria (a) | If a single value appears more than ceil(100/H) consecutive times in a row in the sequence of noise source samples, the test shall detect this with a probability of at least 99%. | Required |
| 81 | §4.5 Criteria (b) | If the noise source's behavior changes so that the probability of observing a specific sample value increases to at least P* = 2^(−H/2), then the test shall detect this change with a probability of at least 50% when examining 50,000 consecutive samples from this degraded source. | Required |
| 89 | Additional Comment 1 | In compliance with SP 800-90B, vendors shall provide access to the raw outputs of the entropy source. | Required |
| 97 | Resolution 1 | For Section 2.2.1, the vendor shall justify why all processing occurring within the digitization process does not conceal noise source failures from the health tests or obscure the statistical properties of the underlying raw noise output from this digitization process. | Required |
| 98 | Resolution 1 | The tester shall provide a detailed description of all digitization processes used within the noise source and describe the format of the raw data that was tested. (See SP 800-90B Section 3.2.2 Requirement 3, and Section 4.3 Requirements 1, 6, 7, 8, and 9). | Required |
| 115 | Resolution 10 | Combining the outputs of the noise source copies under this provision shall be considered part of the digitization process, and so Resolution 1 shall apply. | Required |
| 119 | Resolution 14 | The tester shall verify that all the vendor-identified known or suspected noise source failure modes are detected by the continuous health tests included within the entropy source. (See SP 800-90B Section 4.3, Requirements 1, 7, 8 and 9). | Optional |
| 123 | Resolution 18 | For Section 4.5, when using simulation to argue that the developer-provided health test satisfies the requirements of Section 4.5, the developer shall specify how the data used within this simulation was created. | Required only for developer defined health tests. |
| 124 | Resolution 18 | To fulfill the Section 4.5 requirements using simulation, at least 1 million rounds of simulation shall be used for each simulated health test, and there shall be sufficient simulation rounds so that at least five health test failures are observed for each health test. | Required only for developer defined health tests. |
| 125 | Additional Comment 3 | The tester shall verify that each conditioning component’s implementation is fully consistent with the component’s design. This verification shall be performed by means of either running a computerized test developed for testing just the conditioning component (separate from the statistical testing of the noise source) or by the code review. The lab shall describe in the Entropy Test Report submitted to the CMVP the chosen method for verifying the correctness of each conditioning component’s implementation. | Required |

So, I can't say from all that what would satisfy lab & NIST. They mention simulation for APT/RCT tests: <If simulation is used as proof, specify how the data used within this simulation was created. At least 1 million rounds of simulation are needed for each simulated health test, and there must be sufficient simulation rounds so that at least five health test failures are observed for each health test.>

But in practice I don't know what exactly needs to be simulated, or how to prove that the implementation's behavior matches the simulation. Say, NIST mandates requirement 65 for a developer's alternative to APT/RCT, but it also requires on-demand tests. We need to engage with the lab.
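
For reference, the arithmetic behind the quoted rows 80, 81 and 124 can be written out as a short Python sketch. The function names and the example entropy estimate H = 2.0 bits per sample are assumptions for illustration; the formulas and thresholds are the ones quoted in the table above.

```python
import math


def criteria_a_run_length(entropy_per_sample):
    """Run length ceil(100/H) from the quoted Section 4.5 criteria (a)."""
    return math.ceil(100 / entropy_per_sample)


def criteria_b_probability(entropy_per_sample):
    """Degraded-source probability P* = 2^(-H/2) from criteria (b)."""
    return 2 ** (-entropy_per_sample / 2)


def simulation_is_sufficient(rounds, observed_failures):
    """Check the quoted simulation bounds: >= 1 million rounds, >= 5 failures."""
    return rounds >= 1_000_000 and observed_failures >= 5


# Hypothetical entropy estimate, for illustration only.
H = 2.0
run_length = criteria_a_run_length(H)
p_star = criteria_b_probability(H)
```

For the assumed H = 2.0, criteria (a) concerns runs longer than 50 identical samples and criteria (b) concerns a value whose probability rises to at least 0.5.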

tonyd5656 commented 7 months ago

I was asked to provide my perspective as someone who is going through a FIPS 140-3 certification currently.

Please confirm my understanding of the issue:

vsukhoml commented 7 months ago

@tonyd5656 ,

johannheyszl commented 7 months ago

Thx @tonyd5656 your second bullet states the question correctly IMO.

tonyd5656 commented 7 months ago

My comments are based on my interpretation of the requirements and experience with the lab my company has used. In the end the determination of whether or not OpenTitan meets all the requirements will be made by the certification lab used and CMVP.

That being said, the "testing" for a FIPS certification includes both the actual testing of the module and "testing" of the documentation. It seems to me that the FIPS IG D.K Interpretation of SP 800-90B Requirements # 14 could be satisfied solely by documentation.

However, there are test requirements for the module. Specifically, the tester needs to be able to cause all the defined errors and, if that is not possible, the vendor needs to provide a rationale as to why the error cannot be induced.

From ISO/IEC 24759:

TE03.07.02: The tester shall cause the cryptographic module to enter each of the following states: a state performing manual SSP entry; a self-test state performing pre-operational self-tests; a state performing software/firmware loading; a state performing zeroisation; an error state;

If it is not possible for the tester to cause an error then the vendor shall provide rationale to the tester why this test cannot be performed. In such case, the tester shall follow alternative procedures allowed by the validation authority to ensure that all data via the data output interface is inhibited.

Based on my understanding there are two options if the ability to inject data into the health tests is not implemented:

  1. Provide a rationale as to why the error cannot be induced (e.g., it introduces an unacceptable security risk).
  2. The document does not define exactly what it means to cause an error so it seems like it could be acceptable to cause the error in a different way (e.g., directly setting the health test status to fail).

Also to note, the "simulation" mentioned in the Entropy Assessment Template is with respect to providing proof that Developer-Defined Alternative health tests meet the two criteria. Proof that the Developer-Defined Alternative health tests meet the criteria is only required if they replace the APT/RCT health tests. This isn't the case for OpenTitan, so this should not need to be provided.

vsukhoml commented 7 months ago

So, I checked with the lab and got the following answer:

For the NIST ESV certification there is no requirement to demonstrate APT/RCT health checks via operational testing. There is no requirement to force failures similar to what is done as part of the FIPS 140-3 module testing. It's mostly a paper exercise where the author justifies the design and cutoff values for the tests in the context of the entropy source. The rationale would address what happens in a failure scenario and then a source code review can demonstrate the design is consistent.

However, BSI entropy testing requirements are different, and should be taken into consideration if needed.

johannheyszl commented 7 months ago

Thanks @tonyd5656 and @vsukhoml !

vogelpi commented 6 months ago

Thanks both for your feedback @tonyd5656 and @vsukhoml , it's very much appreciated!

So to conclude, we don't need to add such an interface to test the health tests for SP 800-90B. And for the NIST cryptographic module validation program (assuming the error cases are also relevant for the ENTROPY_SRC and not just the crypto blocks), it should be sufficient to produce health test failures, e.g. by configuring aggressive health test thresholds.
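
The idea of forcing a failure through thresholds alone can be illustrated with a small Python sketch: the same healthy-looking samples pass a repetition count test at a nominal cutoff and fail it at an aggressively low one. The sample values, both cutoff values and the function names are assumptions for illustration and are not ENTROPY_SRC register settings.

```python
from itertools import groupby


def longest_run(samples):
    """Length of the longest run of identical consecutive samples."""
    return max((len(list(group)) for _, group in groupby(samples)), default=0)


def rct_fails(samples, cutoff):
    """A repetition count test fails once a run reaches the cutoff."""
    return longest_run(samples) >= cutoff


# Hypothetical 4-bit samples from a source that is behaving normally.
samples = [3, 7, 7, 1, 12, 12, 12, 4]

NOMINAL_CUTOFF = 21     # assumed production-style threshold
AGGRESSIVE_CUTOFF = 2   # assumed threshold chosen only to force a failure

fails_nominal = rct_fails(samples, NOMINAL_CUTOFF)
fails_aggressive = rct_fails(samples, AGGRESSIVE_CUTOFF)
```

This exercises the failure handling path without touching the noise source input, which is why no extra injection interface is needed for it.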

I am thus suggesting to close this issue without implementing design changes. But before we do that let's also check for BSI requirements regarding this.

vsukhoml commented 6 months ago

As for BSI requirements, I have little experience. From what I found, a lot is based on the stochastic model, though there are places with HW requirements, and in P.14 below I found a requirement which seems to be directly related to this issue.

Developer evidence for the evaluation of a physical true random number generator: P.5. Description of the PTRNG in terms of the total failure test (tot test) of the raw random entropy signal. The total failure test shall consider the physical principle of the entropy source.

P.6 Description of the PTRNG in terms of the online test(s) of the raw random numbers.

P.14 Demonstrate how the total failure test of the entropy source together with the online test of the raw random signals protect the PTRNG from tampering.

P2.f) to be specified by the applicant in addition to C.1(i)-(iii) and P1.f)(iv)-(vi): As of evaluation level E2, C1.(ii.b) requires the applicant to provide proof that the statistical tests described in P2.i)(vii) have been performed and to submit the test results. If the digitised noise signal sequence does not meet criterion P2.d)(vii) or if the digitised noise signal sequence cannot be tested, the applicant must specify alternative mechanisms and demonstrate their effectiveness. (See P2.d) ”Alternative criteria for P2.d)(vii); type 1” and ”Alternative criteria for P2.d)(vii); type 2”.)

P2.f)(xi) Proof that the tests specified in P2.i)(xii) have been performed and provision of the test results. P2.f)(xii) Proof that P2.d)(xiii) is met. Moreover, the consequences of the noise alarm must be described (shutdown of the noise source, intensive tests on the noise source, logging, etc.). If the noise source is operated again following a noise alarm, it must be ensured that the digitised noise signals do not have any unacceptable statistical weaknesses.

Update of the AIS 20 / 31, Slide 14: verification that the online test and the total failure test are effective

A Proposal for Functionality Classes for Random Number Generators Version 2.35 - DRAFT:

  1. [PTG.2.4] The start-up test shall be applied when the RNG is started after the TOE has been powered up, reset, rebooted, etc. or after the operation of the RNG has been stopped (e.g., to reduce the power consumption of the TOE). The start-up test shall detect a total failure of the physical noise source and severe statistical weaknesses; cf. Subsect. 4.5.5. The start-up test might apply the online test, possibly with different evaluation rules; cf. Subsect. 4.5.5.
  2. [PTG.2.5] When the PTRNG is in operation, the online test shall detect if requirement PTG.2.3 (or PTG.2.1, or PTG.2.2) is violated. If a defect occurs, it should usually affect the requirement PTG.2.3. This cannot (or at least not reliably) be achieved by blackbox testing without considering the nature of the physical noise source. Instead, the online test shall be tailored to the stochastic model and its effectiveness shall be proven on the basis of the stochastic model.
  3. [PTG.2.5] The online test may be applied continuously, at regular (short) intervals, or upon specified internal events. The analysis shall take into account the calling scheme of the online test in the verification of its suitability. The applicant shall specify the consequences of a noise alarm. This is also a subject of the evaluation. For general considerations, further explanations, and examples we refer to Subsect. 4.5.3.
  4. [PTG.2.6] A total failure of the physical noise source implies that without intervention requirement PTG.2.3 would drastically be violated (e.g. because the next raw random number bits have no entropy at all or at best very low entropy). If the internal random numbers are buffered before they are output, then this feature can relax the detection and reaction time. The effectiveness of the total failure test shall be proven on the basis of a substantiated failure analysis of the physical noise source and the impact of the algorithmic post-processing on the entropy (cf. par. 285). The total failure test may include statistical tests, but other solutions (voltage sensors etc.) may be acceptable as well. For general considerations, further explanations, and examples we refer to Subsect. 4.5.4.
  5. [PTG.2.5, PTG.2.6] If the total failure test and / or the online test are not part of the TOE but are to be implemented later as an external security measure, then the applicant must submit an accurate specification of the online test and / or of the total failure test as well as a reference implementation. The tasks concerning the verification that PTG.2.5 and / or PTG.2.6 are fulfilled remain unaffected. The specification of the tests shall be part of the user manual (guidance documents). The online test of the final PTRNG implementation shall exactly fulfill the specification of the user manual (to be checked later in a composite evaluation) in order to be PTG.2-compliant.
  6. [PTG.2.7] The statistical test suites Trrn and Tirn shall be applied under representative environmental conditions (cf. par. 308). Depending on the PTRNG design, the developer or evaluator may apply further statistical tests. For the functionality class PTG.2, the importance of comprehensive statistical tests is incomparably higher than for the classes DRG.2, DRG.3, DRG.4, and PTG.3, because the raw random numbers may be biased or have short-term dependencies.

vogelpi commented 6 months ago

Thanks @vsukhoml for your input. After studying this, my understanding of the BSI requirements is that:

So to summarize: also BSI doesn't seem to require such an interface. I suggest to close this issue. FYI @johannheyszl

vsukhoml commented 6 months ago

So, right now it is not high priority. For the software path, we can implement health checks in software and prove whatever is required about them if the hardware health checks don't satisfy certification requirements.

I'd propose turning it into a feature request for a future version. Certification requirements for entropy sources have been updated frequently in recent years, and the requirement to be able to directly validate health checks is rather reasonable.

johannheyszl commented 6 months ago

Thanks all.

The entire topic would not be a 100% show-stopper since a FW path is possible (as @vsukhoml pointed out).

FIPS does not seem to require the additional data input to HC feature.

BSI: thanks @vsukhoml for providing references. The first doc, which is listed on the BSI page here, also states: "P.11 Demonstrate the correct implementation of the deterministic part. • The correct implementation could be demonstrated by means of known answer tests." Hence, it is not mandatory.

Summary: I think chances are high that we are correct to assume we will not need the feature at this moment in time. Thanks everybody for the help!

Outcome: I think we can relabel as improvement for future releases @vogelpi @vsukhoml

vogelpi commented 6 months ago

Thanks for your additional input @vsukhoml and @johannheyszl . I've now created an issue to keep track of all potential features for future releases here where I've also linked this issue.

I am now closing this one as we track the item / feature in the newly created issue.