lowRISC / opentitan

OpenTitan: Open source silicon root of trust
https://www.opentitan.org
Apache License 2.0
2.57k stars 770 forks source link

Explicit X's in case statements/ternary statements #366

Closed martin-lueker closed 4 years ago

martin-lueker commented 5 years ago

In commenting on a PR with ternary statements, I noticed an OpenTitan practice that I find alarming: Setting variables to X in the case of unhandled defaults. This is also recommended in the style guide: https://github.com/lowRISC/style-guides/blob/master/VerilogCodingStyle.md#dont-cares-xs

Now this is great in simulation, don't cares propagate through the whole system and cause assertions to fire everywhere.

However in a real device (ASIC or FPGA, doesn't matter) a silently uncaught X can cause a disaster! Furthermore in silicon, cosmic ray events, or external tampering (!) can cause states to fall outside their usual allowed enumeration values, and these are exactly the kind of events that one should catch in the defaults.

If there are defaults that we think should "never happen" shouldn't we explicitly catch them and raise an exception or an alarm?

eunchan commented 5 years ago

I remember we had a such discussion before. @cdgori can comment again here. I am an opposite of assigning unknown value in the default statement in case even it let the compiler optimizes the conditions. Currently I am explicitly assigning the default value in the default claus. Also considering adding an alert if the IP is crypto related.

msfschaffner commented 5 years ago

I totally agree with @eunchan . Although there are design guidelines that (if you follow them properly) can make X'es work well in an SV codebase, I am not a big fan of that. I typically always provide a default assignment that catches parasitic states, and try to explicitly assign a known default value in cases where one could use X'es. It is true that some tools could take advantage of X'es and optimize some of the logic, but I am not entirely convinced it really makes a big difference (unless it is a very special part of the design like a really big LUT or so). I've also seen X'es break toolflows that involve logic equivalency checking, hence I advocate a design guideline that discourages the use of X'es, if possible.

tjaychen commented 5 years ago

adding to what @eunchan said, our discussions with @cdgori have mostly centered around making our design behavior very deterministic to ensure there are no 'don't care' areas where malicious logic or unexpected behavior can hide.

There's a fair amount of industry literature for security design that talks about unspecified space being used to hide trojans or in general act as a "boot strap" step for further attacks. Chris can you comment more? Should this be a something we should try to remove from our style guide?

sjgitty commented 5 years ago

Good issue, @martin-lueker , thanks for raising. This style-guide recommendation was reviewed long ago prior to many folks' joining, and warrants a broader discussion, especially in light of security concerns. The rationale for the recommendation comes from a verification mindset of sniffing out problems early - X's tend to propagate and thus violations of expected case statement norms are caught early. But there are other ways of doing this, moreover, there are different ways of thinking about the problem when the mindset is these can be caused by security attacks.

I recommend we let this issue fall out of a more broad discussion of how to handle unknowns / unexpected states in the scenario of hardware attack. Perhaps you and @cdgori and other interested parties could discuss the topic and propose guidance for both the style guide, and other design best practices?

I'll create a new topic on that, and link this one here to be resolve after that one has resolved.

cdgori commented 5 years ago

My view on best-practice was actually to do an "`ifdef simulation" construct around those X assignments (I did find that the tainting you get from the X's in simulation makes things tip over faster / gives easier debug, typically), possibly with an assert/warning issued in the simulation for triggering an unexpected default, and then make sure that on the other side of the ifdef there was ALWAYS a defined safe value for synthesis/implementation, because you do not want to rely on the implementation map of X->optimal value, it's can end up doing things you don't want (e.g. "the cheapest boolean thing was to expose this sensitive value inadvertently" - ugh). In these cases there isn't really such a thing as "don't care" space, for all the reasons @eunchan / @tjaychen / @sjgitty / @martin-lueker enumerated.

This approach is generally compatible with LEC, since the main function of LEC is to check the synthesis/other physical optimization stages for issues. Obviously, you have to be really careful about potential sim/synth mismatches. It generally commits you to doing at least some amount of gate-level simulation too, which we can discuss, but I would advocate for that too, since problem replication in silicon may depend on having that available as a tool.

As to @martin-lueker 's question about how to handle tripping over these boundaries: in some cases it might be appropriate to raise an alert if we wander into the don't-care space, in other cases, it's better to enforce back to "normal" space silently, since the adversary would like to find the contour of the space. I don't think it's possible to make a general statement about what to do, because any choice has some undesirable implications and it depends on what threat is judged to be more concerning (often different on a situation-by-situation basis).

martin-lueker commented 5 years ago

I like the `ifdef idea, @cdgori. This wouldn't be hard to use in cases where you have explicit case statements, like the expansion we discussed in PR #358. Though, as you pointed out it seems that equivalent ternary expressions are considered quite convenient for many reasons. In those cases should we perhaps have a macro with two arguments: to expand to some safe fallback in synthesis and 'bX in simulation?

tjaychen commented 5 years ago

so it sounds like overall..what we are saying is...

I have a feeling ifdefs across the design may not be very widely accepted...would it make sense to add an assertion instead of x propagation requirement for all these cases? Ie, for all cases where we would have done x propagation per the current style guide, we instead add an assertion

tjaychen commented 5 years ago

oops, sorry i did not notice Martin's last comment. Martin's suggestion makes sense to me as well, but i feel like an assert would be even faster detection.

martin-lueker commented 5 years ago

@tjaychen, just to add to your summary, though

so it sounds like overall..what we are saying is...

  • For production behavior, it is more desirable to have deterministic and well known behavior
  • For simulation / debug purposes, we would like something easy to spot.

I think there may be a third possibility:

If we choose to do a macro, we could give it an extra bit output, or perhaps somehow assign an extra signal. Again details TBD. The important part is that we keep the flexibility to record or not record the illegal state transition.
Chris would that be reasonable?

martin-lueker commented 5 years ago

oops, sorry i did not notice Martin's last comment. Martin's suggestion makes sense to me as well, but i feel like an assert would be even faster detection.

Hi @tjaychen, Sounds ok to me if we can make it work. Though can you give an example of how you might use an assert at the end of a long chain of ternary expressions? (This is one syntax that I think @cdgori and @vogelpi want to use instead of explicit cases).

tjaychen commented 5 years ago

ah that's fair, i was only thinking of FSMs. For the case you mentioned, a macro may work better. Using an assert there would be pretty inefficient i imagine..since we'd have to re-create all the conditions, or at least find the inverse of all the conditions that didn't match.

On Mon, Oct 7, 2019 at 5:10 PM martin-lueker notifications@github.com wrote:

oops, sorry i did not notice Martin's last comment. Martin's suggestion makes sense to me as well, but i feel like an assert would be even faster detection.

Hi @tjaychen https://github.com/tjaychen, Sounds ok to me if we can make it work. Though can you give an example of how you might use an assert at the end of a long chains of ternary expressions? (This is one syntax that I think @cdgori https://github.com/cdgori and @vogelpi https://github.com/vogelpi want to use instead of explicit cases).

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/lowRISC/opentitan/issues/366?email_source=notifications&email_token=AAH2RSQDX72M46YALERGHSTQNPFVXA5CNFSM4I6ITBS2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEASGONI#issuecomment-539256629, or mute the thread https://github.com/notifications/unsubscribe-auth/AAH2RSRBFUHPIDB6UW43UQDQNPFVXANCNFSM4I6ITBSQ .

vogelpi commented 5 years ago

Thanks @martin-lueker for starting this issue, and everyone for sharing your thoughts.

We don't have a strong opinion here about whether we should keep driving 'X or not. But it is important to discuss it.

It looks like we should stop doing that. As for how to handle the default cases, using assertions for FSMs but resolve the state to something defined sounds reasonable. For ternary statements, it's probably best to mandate that the default statement is defined (no 'X propagation/detection in simulation).

We are against using ifdefs and macros as it introduces additional dependencies in the code base which may hinder IP reuse. In general, it would be important for us to have a coding style guide that also (but not only) works for security-focused projects.

GregAC commented 5 years ago

I'd worry about no Xs at all as in general I feel they are a very useful tool for DV. When you're assigning X to something you're saying the design shouldn't care what this signal is as it should be ignoring its value. So I'd argue in some cases at least an explicit X can help expose the 'undefined' portion of the design during verification and hence reduce the areas that undefined behavior can be used for an attack.

For example the zero register in RISC-V is always 0, writes to it are effectively discarded. Let's say our decoder explicitly sets the write register address to 0 when it's not needed (rather than X) but there's a bug when the write enable gets set for instructions that aren't writing a register. This bug passes by unnoticed as the write address is always set to 0 so the write gets discarded. In hardware it turns out an attacker can manage to fiddle with the write address via external tampering and uses this as part of an exploit (or perhaps there's a very rare bug in the decoder we failed to spot in DV where in some instances it actually sets something other than 0 as the register write address when it doesn't want a register write, no external tampering required). Had we driven X as the write address in this case we would have seen early that the register write was getting asserted when it shouldn't be and fixed it (assuming we had an assertion checking write_addr != X when write_en == 1).

Though on the other side the optimizer deciding some sensitive value is the best thing to drive where it sees an X is clearly a concern and exactly the kind of thing that could happen in my above decoder example (lots of stuff going in and out of the decoder so driving outputs of it to X could maybe see something sensitive go out).

I'd agree that for FSMs having it transition to some known state when it's outside of the defined states (rather than the state becoming X) seems prudent. Along with an assertion that the state is always one of the valid values.

While I'm not a fan of macros all over the place achieving the more effective bug finding you can get with Xs whilst also avoiding the optimizer exposes sensitive value or adds fun undefined behavior may not be possible without it.

Continuing with my example if the decoder had a line like this:

rf_wr_addr = X_VALUE(5, d0);

Which would become

rf_wr_addr = 5'bx;

or

rf_wr_addr = 5'd0;

Depending on whether some name (say ALLOW_XS) was defined would allow us to achieve both.

It would work with ternary expressions too:

foo = a ? something :
      b ? something_else :
          X_VALUE(5, d0);
mwbranstad commented 5 years ago

My general experience and all ip-for-sale companies' code that I have reviewed do not propagate X's. That said, I could see the short term benefit of seeing X's in the wave as a quick debug aid. However, if this case is well defined, an assert is better in the long run, particularly for regressions. Good points to consider have been brought up in this issue, such as how does LEC like X's (I would think poorly), how do synthesizers like X's (for FSMs, optimization is generally minimal). Main concern in general is making sure synthesis results match simulation results. Probably need to ask the vendor this question. I am not sure I saw anything on the topic of coverage in this thread. Assuming a leg that has an X will never be covered. This would have to also be "discounted" in the coverage review. Design for security is new to me, so that if these unspecified cases are a concern, then perhaps a more extreme approach should be taken. One I can think of is that if that X case is taken , then a status bit is set (perhaps this is an hw alert). This case still may be uncovered, but it would seem to be more secure since firmware can react.

tjaychen commented 5 years ago

Thanks for the detailed example Greg! I think what you're describing is very similar to what Chris and Martin are describing. To Mark's point, I do think we probably want this kind of macro approach because then we can specify for non-simulation specific cases, we always pick a known value which is more friendly to EDA tools and security.

Greg / Martin Do you guys think we should document the overall proposal as a PR to the style guide for wider review?

On Tue, Oct 8, 2019 at 8:18 AM mwbranstad notifications@github.com wrote:

My general experience and all ip-for-sale companies' code that I have reviewed do not propagate X's. That said, I could see the short term benefit of seeing X's in the wave as a quick debug aid. However, if this case is well defined, an assert is better in the long run, particularly for regressions. Good points to consider have been brought up in this issue, such as how does LEC like X's (I would think poorly), how do synthesizers like X's (for FSMs, optimization is generally minimal). Main concern in general is making sure synthesis results match simulation results. Probably need to ask the vendor this question. I am not sure I saw anything on the topic of coverage in this thread. Assuming a leg that has an X will never be covered. This would have to also be "discounted" in the coverage review. Design for security is new to me, so that if these unspecified cases are a concern, then perhaps a more extreme approach should be taken. One I can think of is that if that X case is taken , then a status bit is set (perhaps this is an hw alert). This case still may be uncovered, but it would seem to be more secure since firmware can react.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/lowRISC/opentitan/issues/366?email_source=notifications&email_token=AAH2RSR2CYLHKNUNBGHMICLQNSQDJA5CNFSM4I6ITBS2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEAURM3Q#issuecomment-539563630, or mute the thread https://github.com/notifications/unsubscribe-auth/AAH2RSRK43THGJV3RSCZU2TQNSQDJANCNFSM4I6ITBSQ .

martin-lueker commented 5 years ago

Hi @tjaychen, @GregAC Sure I can take a crack at the style guide for that.
Though first, @vogelpi, do you have any input on how we can best achieve your code reusability goal. Are we currently really not using any macros in the RTL at all? (If we are using even one, then adding even a second one won't change anything).

-Martin

asb commented 5 years ago

Though first, @vogelpi, do you have any input on how we can best achieve your code reusability goal. Are we currently really not using any macros in the RTL at all? (If we are using even one, then adding even a second one won't change anything).

That statement came from a lunchtime conversation over at the lowRISC offices. I think I might summarise the outcome a little bit differently (and probably also colour it with my personal view!). I'm not overly concerned about relying on a few macros, it's fairly inevitable. I think the issue we all agreed on is that if macro usage gets too pervasive and/or they become too "magic" it's a barrier for understanding and contributing to the design, and it can start to feel like you're writing in a custom SystemVerilog-derived language. I'm not saying this particular macro is a step too far in that respect, but I think this general issue is one we should be wary of.

martin-lueker commented 5 years ago

Thanks @asb for your feedback. I'll make a PR with:

asb commented 5 years ago

I think that's a good way to move the conversation forwards - a concrete proposal will garner more feedback and help us come to a proper conclusion. Thanks for taking this up Martin.

GregAC commented 5 years ago
* @GregAC: Would it be possible to get an example X_VALUE macro from you?

Maybe something along these lines:

`ifdef ALLOW_X_VALUES
`define X_VALUE(safe_val) ('x)
`else
`define X_VALUE(safe_val) (safe_val)
`endif

Using it like this:

logic [7:0] foo;
assign foo = `X_VALUE(8'hAB)

In my initial example I was thinking we should pass an explicit size so the x can be given an explicit size, i.e. when we have ALLOW_X_VALUES it turns into

assign foo = 8'bx;

However I think 'x should suffice so we get

assign foo = 'x;

This also allows safe_val to be pretty much anything rather than constraining it (in my previous example size' would have been concatenated onto the safe_val).

cdgori commented 5 years ago

I sort of kicked a bit of this hornets nest and then got off to some other things, just coming back to it now, apologies for leaving it alone for almost a week without contributing feedback.

I like the macro proposal, and really appreciate the prior detailed decoder example from Greg Chadwick. I too prefer to have the X's during the DV period for exactly the reasons contemplated by that example, where the "safe" but defined value could hide another subtle bug/issue, and the existence of X's on the waveforms give a good clue that something bad is happening, and accelerate our ability to catch this (via assertions or otherwise) and resolve it.

I do want to be very very clear though that I feel that under no circumstances should we be using an X to give tools more optimization freedom (e.g. neither as a case assignment value nor selection-case value). The risk of having something sensitive unintentionally propagate out in a subtle/unexpected manner is far too high to allow this for the marginal optimization that could happen. If we don't have it already noted as something to study carefully in final signoff code review, it needs to be (basically any case statement needs to be looked at, along with any nested-if-then-else or ternary operation that has an equivalent of a default.) For the actual silicon implementation team, it's also advisable to study the resource sharing reports from synthesis to make sure the tool didn't try to get too clever and co-mingle sensitive and non-sensitive operations/values.

As far as concrete proposals (above and beyond the macro), I believe my previous projects required fully specified case statements/items (even if it made them overly verbose) AND a default, where the default case item included a "safe"/X macro value along with a simulation assert on it if for some reason it was reached - I think this is sort of similar to the comments from Martin Leuker above. We want to know immediately/urgently if something was reached that was not expected to be reached by definition/specification, since it means something is amiss/not understood.

Lastly, I think I mentioned some time ago to the Googlers that in previous designs I worked on, the equivalent of the safe_val part of Greg's proposed macro was actually a compile-time/elaboration-time constant for us. (Our flow sort of uniquified that macro construct via another pre-processor so there was a distinct version of the macro for every individual invocation/usage, and therefore every safe_val was distinct, randomly but predictably generated, and presumably unrelated to any values or case-items in the functional code paths.) I don't think we need to go to that length here, but it's worth mentioning how far down the rabbit hole you can go to try to isolate things.

On Thu, Oct 10, 2019 at 8:32 AM Greg Chadwick notifications@github.com wrote:

  • @GregAC: Would it be possible to get an example X_VALUE macro from you?

Maybe something along these lines:

ifdef ALLOW_X_VALUES define X_VALUE(safe_val) ('x) else define X_VALUE(safe_val) (safe_val) `endif

Using it like this:

logic [7:0] foo; assign foo = `X_VALUE(8'hAB)

In my initial example I was thinking we should pass an explicit size so the x can be given an explicit size, i.e. when we have ALLOW_X_VALUES it turns into

assign foo = 8'bx;

However I think 'x should suffice so we get

assign foo = 'x;

This also allows safe_val to be pretty much anything rather than constraining it (in my previous example size' would have been concatenated onto the safe_val).

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/lowRISC/opentitan/issues/366?email_source=notifications&email_token=AJBZS7HQRODDINWJYZJSAWDQN5DIDA5CNFSM4I6ITBS2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEA4Y5NY#issuecomment-540642999, or unsubscribe https://github.com/notifications/unsubscribe-auth/AJBZS7HET36UEXU2TPMO7NDQN5DIDANCNFSM4I6ITBSQ .

taylor-bsg commented 5 years ago

In the coding guidelines used for BaseJump STL, BlackParrot and BaseJump Manycore, we use the macro approach proposed here and clarify that ‘X is for simulation and not for synthesis. This has served us well.

The write up is under the heading

“Use don’t cares (X) only in simulation; not in synthesis”

In:

https://docs.google.com/document/d/1xA5XUzBtz_D6aSyIBQUwFk_kSUdckrfxa2uzGjMgmCU/

msfschaffner commented 5 years ago

Welcome to OpenTitan @taylor-bsg :), and thanks for pointing us to your style guide. We will take that into consideration.

@martin-lueker what's the status on your side on this issue?

mwbranstad commented 5 years ago

Hi Michael, I have a question about an unrelated issue regarding the prim_lfsr, which shows you as author. Is this email an ok forum for you to discuss?

From: Michael Schaffner notifications@github.com Sent: Wednesday, November 6, 2019 5:25 PM To: lowRISC/opentitan opentitan@noreply.github.com Cc: Mark Branstad Mark.Branstad@wdc.com; Assign assign@noreply.github.com Subject: Re: [lowRISC/opentitan] Explicit X's in case statements/ternary statements (#366)

Welcome to OpenTitan @taylor-bsghttps://github.com/taylor-bsg :), and thanks for pointing us to your style guide. We will take that into consideration.

@martin-luekerhttps://github.com/martin-lueker what's the status on your side on this issue?

— You are receiving this because you were assigned. Reply to this email directly, view it on GitHubhttps://github.com/lowRISC/opentitan/issues/366?email_source=notifications&email_token=AMTDD3K3FPQCI7JJSKZHF53QSNG5LA5CNFSM4I6ITBS2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEDIK6TA#issuecomment-550547276, or unsubscribehttps://github.com/notifications/unsubscribe-auth/AMTDD3LK6GY4C27DUYGBYRTQSNG5LANCNFSM4I6ITBSQ.

msfschaffner commented 5 years ago

Hi @mwbranstad,

I'll be happy to answer your question regarding to prim_lfsr, but since this is unrelated to this GitHub issue, I would like to ask you to open up a separate issue on GitHub, and assign it to me and label it as a question. You can do that by clicking here: https://github.com/lowRISC/opentitan/issues/new

martin-lueker commented 5 years ago

Hi @msfschaffner, @taylor-bsg I fully agree with you @taylor-bsg, and that is exactly where we are going with this macro. The status is that this fell on the back burner in the last couple weeks. I should have time very soon for it though. I will send an update when I make the PR's (however there will probably be two, one for the style update, and then after that is approved, and then one for deployment of the macro.). -Martin

tjaychen commented 5 years ago

@martin-lueker actually @msfschaffner and I spoke with some of our other colleagues yesterday, and due to our style requirement of requiring unique_case, this may not be quite as straightforward. I am going to summarize this quickly with @msfschaffner and report back. We will let you know soon.

mwbranstad commented 4 years ago

Hi Michael, I think I opened a github issue as your instructed. Did you get it? Since many of these processes are new to me, it is very likely I did not do something properly.

From: Michael Schaffner notifications@github.com Sent: Wednesday, November 6, 2019 7:28 PM To: lowRISC/opentitan opentitan@noreply.github.com Cc: Mark Branstad Mark.Branstad@wdc.com; Mention mention@noreply.github.com Subject: Re: [lowRISC/opentitan] Explicit X's in case statements/ternary statements (#366)

Hi @mwbranstadhttps://github.com/mwbranstad,

I'll be happy to answer your question regarding to prim_lfsr, but since this is unrelated to this GitHub issue, I would like to ask you to open up a separate issue on GitHub, and assign it to me and label it as a question. You can do that by clicking here: https://github.com/lowRISC/opentitan/issues/new

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHubhttps://github.com/lowRISC/opentitan/issues/366?email_source=notifications&email_token=AMTDD3PC6FC6YCUOXJYYKLDQSNVJ3A5CNFSM4I6ITBS2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEDISNFI#issuecomment-550577813, or unsubscribehttps://github.com/notifications/unsubscribe-auth/AMTDD3PSCHQLPRCCMW37N7LQSNVJ3ANCNFSM4I6ITBSQ.

sjgitty commented 4 years ago

@msfschaffner is this closeable now that we've executed on the style and coding changes?

msfschaffner commented 4 years ago

Yes, I'll close this as soon as the coding style is merged.

msfschaffner commented 4 years ago

For reference: https://github.com/lowRISC/style-guides/pull/12