lowRISC / opentitan

OpenTitan: Open source silicon root of trust
https://www.opentitan.org
Apache License 2.0
2.58k stars 776 forks source link

Xcelium default XPROP setting not correct #9661

Closed nikola-miladinovic closed 2 years ago

nikola-miladinovic commented 2 years ago

Please investigate settings for XPROP at Xcelium. There is an infinite loop using Xcelium with enabled XPROP switch (default value) which is not seen using VCS. Solution for this was adding switch -xfile and xfile itself in which i put SCOPE tb.dut F. Xfile switch overrides xprop "xmelab: *W,XPOPT: If both -XFILE and -XPROP options are present, -XPROP will be ignored." Branch which can be used only for observation is [flash_ctrl_multi_ral] (https://github.com/nikola-miladinovic/opentitan/tree/flash_ctrl_multi_ral). Command which can be used for trying out: ./$YOUR_REPO/util/dvsim/dvsim.py hw/ip/flash_ctrl/dv/flash_ctrl_sim_cfg.hjson -i flash_ctrl_csr_hw_reset --tool xcelium --fixed-seed 1 --build-opts "-xfile /$YOUR_REPO/opentitan/hw/ip/flash_ctrl/dv/tools/xcelium/xfile" - please try with and without last part --build-opts -xfile.....

weicaiyang commented 2 years ago

@nikola-miladinovic the XPROP setting may not be the issue, since the same setting is enabled in several blocks. Can you dig a bit further to find out if there is some X that causes the infinite loop?

@rasmus-madsen is there any option in xcelium that can help to debug infinite loop?

rasmus-madsen commented 2 years ago

@nikola-miladinovic as @weicaiyang says this does not seem to be a problem with the xprop, but a design bug which is triggered by propagation.

I will see if I can get some hints to as what is going on. in the mean time you can add the option --xprop-off to run without propagation.

@weicaiyang can you confirm if this fails with vcs also, I do not have access to an xprop license for vcs so I can not run VCS with xprop enabled.

nikola-miladinovic commented 2 years ago

Hi guys, simulation is working and test is passing on VCS. When i override settings with xfile, it also works on Xcelium. I still believe that problem is settings for Xcelium.

rasmus-madsen commented 2 years ago

@nikola-miladinovic using the -xfile option effectively disables xpropagation without a configuration file. (as D is default mode) so you should use --xprop-off instead

nikola-miladinovic commented 2 years ago

@rasmus-madsen : -xfile option together with xfile (in this case xfile has only one line SCOPE tb.dut F) overwrites xprop default command and XPROP is enabled (I can see this at build.log). XPROP is mandatory according to documentation and we should not use --xprop-off. @sriyerg : please state if we are missing something here.

rasmus-madsen commented 2 years ago

@nikola-miladinovic all I am saying is using -xfile without an argument is the same as using --xprop-off but using --xprop-off is better - since it is obvious that propagation is not active. -xfile can give the illusion that some propagation is still happening. hence the reason not to use it.

I am currently debugging the propagation.

nikola-miladinovic commented 2 years ago

@rasmus-madsen : Ok, thanks for clarifying things.

rasmus-madsen commented 2 years ago

@nikola-miladinovic I have gotten an additional piece of the puzzle. there is definitely a combinatorial loop in the code. it looks like it is ´flash_ctrl_lcmgr.sv´ and we are actually not running the same XPROP configuration. for VCS we are only propagating on the DUT. from xprop.cfg

merge = xmerge;

// Turn on xprop for dut only
instance { tb.dut } { xpropOn };

@sriyerg is this on purpose?`I think we want to propagate X' to the lower level modules also inside the IP? either way I think we want to make this match for both simulators

weicaiyang commented 2 years ago

@nikola-miladinovic I have gotten an additional piece of the puzzle. there is definitely a combinatorial loop in the code. it looks like it is ´flash_ctrl_lcmgr.sv´ and we are actually not running the same XPROP configuration. for VCS we are only propagating on the DUT. from xprop.cfg

merge = xmerge;

// Turn on xprop for dut only
instance { tb.dut } { xpropOn };

@sriyerg is this on purpose?`I think we want to propagate X' to the lower level modules also inside the IP? either way I think we want to make this match for both simulators

@rasmus-madsen dut includes all the RTL. I think this should include lower level modules as well.

I found this about using instance for xprop from VCS user guide.

instance_item Apply the specified Xprop mode to all instances in the list and recursively to all the sub-instances.

nikola-miladinovic commented 2 years ago

Seems that my settings on Xcelium SCOPE tb.dut F also checks only dut since i am missing (actual line in xfile should be SCOPE tb.dut F). adding @tjaychen to conversation due to combinatorial loop in the code at ´flash_ctrl_lcmgr.sv´.

rasmus-madsen commented 2 years ago

@nikola-miladinovic so I am now convinced the problem is in the flash_ctrl_lcmgr.

a temporary fix would be to add .--build-opts="-delay_trigger" to xcelium runs. this will schedule the combinatorial process after the active region in the schedule and hence won't retrigger. unfortunately this can reorder things where a race-condition is present (anywhere we have #1ps or similar)

@tjaychen verilator reports several combinatorial loops in the flash controller - as far as I can tell none of them related to the xprop issue, but they should be sorted out I think!

rasmus-madsen commented 2 years ago

@tjaychen @nikola-miladinovic I have root caused the issue and xcelium is definitely correct in this case, the problem is there is a combinatorial loop between the two always_comb processes. where rma_wipe_req is set in the first and evaluated in the second and rma_wipe_done is evaluated in the first but set in the second. written in short this is the loop


always_comb begin
rma_wipe_req = 0;
  if(rma_wipe_done) begin
    rma_wipe_req = 1;
  end
end

always_comb begin
  rma_wipe_done = 0;
  if (rma_wipe_req) begin
    rma_wipe_done = 1;
  end
end

this does not cause any problems when there is no Xpropagation. but as soon as you introduce Xpropagation you introduce glitches on both variables ie. rma_wipe_req 0->X->0 the x will trigger the second process which will then trigger the first process etc.

this is obviously a simplification - but it explains the issue well. (the state variable is also x which causes all the case statements to be evaluated)

I will create a PR with a temporary patch - but these huge combinatorial processes should be split into smaller ones

I am a bit baffled that VCS does not catch this - it must be making assumptions that we do not want it to! this really shows the value of using multiple tools

tjaychen commented 2 years ago

@tjaychen @nikola-miladinovic I have root caused the issue and xcelium is definitely correct in this case, the problem is there is a combinatorial loop between the two always_comb processes. where rma_wipe_req is set in the first and evaluated in the second and rma_wipe_done is evaluated in the first but set in the second. written in short this is the loop


always_comb begin
rma_wipe_req = 0;
  if(rma_wipe_done) begin
    rma_wipe_req = 1;
  end
end

always_comb begin
  rma_wipe_done = 0;
  if (rma_wipe_req) begin
    rma_wipe_done = 1;
  end
end

this does not cause any problems when there is no Xpropagation. but as soon as you introduce Xpropagation you introduce glitches on both variables ie. rma_wipe_req 0->X->0 the x will trigger the second process which will then trigger the first process etc.

this is obviously a simplification - but it explains the issue well. (the state variable is also x which causes all the case statements to be evaluated)

I will create a PR with a temporary patch - but these huge combinatorial processes should be split into smaller ones

I am a bit baffled that VCS does not catch this - it must be making assumptions that we do not want it to! this really shows the value of using multiple tools

hmm... i'm not really convinced there is a loop. Especially since our other lint tools (both the ones we use for sign-off, and the one Nuvoton uses) has identified it as such. I also cannot identify at the moment, any area of the code where what you're saying is happening. But I'll have a closer look.

tjaychen commented 2 years ago

actually..the fix you guys put in..pretty much very clearly states there isn't a real combinational loop, but a tool one. Since you're literally just taking one signal and unconditionally assigning it to another... from a real circuit perspective there is no difference between the signals. Ie, synthesis would map them to the same thing.

So we're creating an artificial break for the tool, not a real thing. I'm going to revert this change.

tjaychen commented 2 years ago

i double checked with ascentlint once more, there are no identified combo loops within flash controller. We need to take this back with the vendors again, and ask for more details.

rswarbrick commented 2 years ago

@tjaychen: This is a genuine loop at the SystemVerilog level, but will never cause problems in actual synthesised code (so presumably lint tools don't bother warning about it). For a simplified version, consider the following:

logic i_am_x; // this gets a value from somewhere else
logic a;
logic b;

always_comb begin
  a = 0;
  case (i_am_x)
    1'b1: a = b;
    default: a = 0;
  endcase
end

always_comb begin
  b = 0;
  case (i_am_x)
    1'b1: b = a;
    default: b = 0;
  endcase
end

What happens if i_am_x is x? The first process runs, setting a to zero and then setting it to x (because the case statement is switching on an x). But the change in a triggers the second process, because the outputs of that process depend on a. But now the same happens: b gets set to zero and then to x again. But that triggers the first process... Oh no! A simulator should loop here: it's actually mandated in the spec.

As far as synthesis is concerned, this code is equivalent to either of the to the following re-workings (I'm just rewriting the first process). They will not trigger the loop in simulation.

// Option 1: Don't set a before the case statement
always_comb begin
  case (i_am_x)
    1'b1: a = b;
    default: a = 0;
  endcase
end

// Option 2: Use an assign statement instead
assign a = i_am_x & b;

Rasmus's fix is even hackier. He's done the equivalent of this:

logic i_am_x; // this gets a value from somewhere else
logic a, aa;
logic b;

always_comb begin
  a = 0;
  case (i_am_x)
    1'b1: a = b;
    default: a = 0;
  endcase
  aa = a;
end

always_comb begin
  b = 0;
  case (i_am_x)
    1'b1: b = aa;
    default: b = 0;
  endcase
end

Note that, unlike a, the aa signal doesn't glitch between zero and x, which means that the second process doesn't get triggered spuriously.

tjaychen commented 2 years ago
logic i_am_x; // this gets a value from somewhere else
logic a;
logic b;

always_comb begin
  a = 0;
  case (i_am_x)
    1'b1: a = b;
    default: a = 0;
  endcase
end

always_comb begin
  b = 0;
  case (i_am_x)
    1'b1: b = a;
    default: b = 0;
  endcase
end

i don't really understand what you are arguing about. The first example you used, that IS a combinational loop in the design and the tool WILL flag it. I plastered that example directly into lint, and here are the results.

The construction you are showing, even simplified, should not exist in flash_ctrl right now, otherwise it would have already been flagged by lint tools everywhere or even fpga synthesis. If the tools are not flagging it, we have a much bigger issue.

E   COMBO_LOOP:   flash_ctrl_lcmgr.sv:710   'a' is in a combinational loop with following points:                                                 New                            

E   COMBO_LOOP:   flash_ctrl_lcmgr.sv:710   'a' driven in module 'flash_ctrl_lcmgr' by 'b' at flash_ctrl_lcmgr.sv:710   flash_ctrl_lcmgr.sv:710   New                            

E   COMBO_LOOP:   flash_ctrl_lcmgr.sv:710   'b' driven in module 'flash_ctrl_lcmgr' by 'a' at flash_ctrl_lcmgr.sv:718   flash_ctrl_lcmgr.sv:718   New     
weicaiyang commented 2 years ago

sorry, late to the party. Looks like this issue is similar to this one #1548 that we met 1.5 years ago. We ended up to change the code to avoid it. I think it's more like a tool bug, but Cadence doesn't believe so. They provided an option -delay_trigger. Below is the description from Cadence.

About -delay_trigger:

  • The ‘-delay_trigger’ command line option filters out many of these glitches by performing an extra step before executing a newly triggered always-block. It checks to see if any variable in the event control expression actually has a different value from when the block was last executed.
  • As zero delay loop mainly due to RTL coding style or use of always@ which will implicitly take all control and RHS as sensitivity list. i.e. assignment in one always@ can trigger another always@* and vice versa in the same time frame.
  • Default behavior refers to LRM compliant and in ideal situation code should not have any zero delay glitches.
  • Apart from LRM compliant, Switch is also not enabled by default due to impact on performance.
  • It is recommended to develop the code in such a way that this zero-delay loops are not being formed.
  • delay_trigger is an optional switch to the situation where code can’t be touched.

This is not a tool bug but valid behaviour. "delay_trigger" switch helps to avoid combinational loops and we recommend to use it for such purpose. As I emphasised earlier, its not xprop specific at all. Its a general behaviour and even in normal simulations (non xprop) you can encounter that. In fact, we have discussions going on internally to make this switch default. Moreover synthesis will show the same behaviour as the "delay trigger" switch. So, there should not be any issue as such.

If we believe our code is fine, we can consider to use this switch. @rasmus-madsen did you use this switch in some other projects?

tjaychen commented 2 years ago

@weicaiyang actually by adding an assignment outside of the always block, the problem goes away (if you follow my latest PR). It would be good if someone can double check. @nikola-miladinovic said it was fine, but I couldn't tell by his comment if he meant with or without xprop.

My point in this case is that there is no real design combo loop. Especially since rma_wipe_req is affected only by a state variable, so I really don't understand where the "loop" is coming from. If there's a real loop, and our tools are not flagging it, we are in real trouble.

The example that @rswarbrick showed is a real loop, so that's not the problem here.

weicaiyang commented 2 years ago

@weicaiyang actually by adding an assignment outside of the always block, the problem goes away (if you follow my latest PR). It would be good if someone can double check. @nikola-miladinovic said it was fine, but I couldn't tell by his comment if he meant with or without xprop.

My point in this case is that there is no real design combo loop. Especially since rma_wipe_req is affected only by a state variable, so I really don't understand where the "loop" is coming from. If there's a real loop, and our tools are not flagging it, we are in real trouble.

The example that @rswarbrick showed is a real loop, so that's not the problem here.

I agree this is not a real loop. We can change the code to avoid it like you did in the PR. If you prefer not to change the code, then we can consider using the switch Cadence provides

tjaychen commented 2 years ago

no changing is fine... i just wish we could identify exactly what the problem is. And maybe update our style guide to not do something in the future. Because I honestly can't tell right now.

On Tue, Dec 21, 2021 at 12:24 PM weicaiyang @.***> wrote:

@weicaiyang https://github.com/weicaiyang actually by adding an assignment outside of the always block, the problem goes away (if you follow my latest PR). It would be good if someone can double check. @nikola-miladinovic https://github.com/nikola-miladinovic said it was fine, but I couldn't tell by his comment if he meant with or without xprop.

My point in this case is that there is no real design combo loop. Especially since rma_wipe_req is affected only by a state variable, so I really don't understand where the "loop" is coming from. If there's a real loop, and our tools are not flagging it, we are in real trouble.

The example that @rswarbrick https://github.com/rswarbrick showed is a real loop, so that's not the problem here.

I agree this is not a real loop. We can change the code to avoid it like you did in the PR. If you prefer not to change the code, then we can consider using the switch Cadence provides

— Reply to this email directly, view it on GitHub https://github.com/lowRISC/opentitan/issues/9661#issuecomment-999071043, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAH2RSVZCKWHU665SQVU4C3USDPAZANCNFSM5J56XBAQ . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

You are receiving this because you were mentioned.Message ID: @.***>

rswarbrick commented 2 years ago

I'm not sure you're quite right in your diagnosis. In the version that was failing, there are processes starting on line 272 (let's call that one "process A") and on line 576 (let's call that one "process B").

Process A writes rma_wipe_req with the "set it to zero, and then set it to something else in one branch of a case statement" logic. If state_q happens to be X then every time process A runs, rma_wipe_req will be set to zero and then set to X.

Process B depends on rma_wipe_req, so is triggered by process A. It writes rma_wipe_done, first to zero and then to something that depends on rma_state_q. Again, this is X, so you get a switch between zero and X.

Process A depends on rma_wipe_done. Infinite loop.

I think this is exactly the same as the example I described above. Maybe AscentLint spots one and not the other, but Xcelium is completely correct to hang if both state_q and rma_state_q are X.

rswarbrick commented 2 years ago

Sorry, I didn't make it completely clear what I was replying to. @tjaychen: Please can you have a look at the code and check whether I'm talking nonsense? My super-simple example above was supposed to be a distillation of what was going on.

I think there are two possibilities: (1) I'm missing something and the code doesn't work the way I thought. (Totally possible!) (2) AscentLint isn't spotting every such loop.

tjaychen commented 2 years ago

There is no "real" combo loop here. There may be one as interpreted by the simulator, but I don't agree with it. There is one very BIG difference between the example you used and the code in flash_ctrl_lcmgr

always_comb begin
  a = 0;
  case (i_am_x)
    1'b1: a = b;
    default: a = 0;
  endcase
end

In the example you used, a is dependent on b. However in the flash code, a is directly assigned to 1 whenever it is in that particular state. The a = 0 above is a pretty typical way of telling the synthesizer that "in all conditions other than the one I specify, please set a to 0" (you can also find this in our style guide).

always_comb begin
  a = 0;
  case (i_am_x)
    2'b1: a = 1;
    2'b2: //something
    default: a = 0;
  endcase
end

The way the linter and synthesis tools interpret this then, is to basically create the wire a = (i_am_x == 1'b1).
So in the design, the value of rma_wipe_req is directly controlled by the value of the flop, it is not actually controlled by the value of rma_wipe_done. rma_wipe_done controls the transition of the state to another value, which on the next cycle will then change the value of rma_wipe_req. In the real world, there is no circular path from done to req here.

So yes, it is possible for the simulator to interpret the way you are describing, but that is contrary to the actual intent of the design, My last point remains this... My latest PR, the ONLY change I made, is put rma_wipe_req through a buffer, and put the output of that buffer into the second always block. This functionally does not change ANYTHING. If there were a real loop before, there is still one now.. a buffer doesn't change anything.

What this means is again, once synthesized, there is no loop. This is coming because of the way the simulator is handling things. I am inclined to say it's wrong, but others can disagree.

What I am trying to emphasize is that there is no real loop here, and if there were, we would have a very big problem with our linters not being able to catch them. I consider this a tool issue because the simulated behavior is not the same as the synthesized behavior, and I would align with the latter.

tjaychen commented 2 years ago

lastly, i doubt whatever i typed here has convinced anyone. If you all are that convinced i'm wrong, the linter is wrong, the synthesis results are wrong, let's just do whatever you guys prefer to move things along.

Have a look at #9771 and let me know if you are okay with that fix. If not, we also cannot use any of the proposed fixes here, so we'll need to look for something else.

for option 1, it is not a good idea to avoid early assignments before the case. Other synthesis results have shown this can accidentally tell the tool that the value in certain states don't matter. And I do not want to make default assignments in every state (and yes this is not covered by the default state).

for option 2, it is bad style in my view to use the state variable outside of the FSM, because later when someone makes changes it's easy to forget things outside of the FSM.

// Option 1: Don't set a before the case statement
always_comb begin
  case (i_am_x)
    1'b1: a = b;
    default: a = 0;
  endcase
end

// Option 2: Use an assign statement instead
assign a = i_am_x & b;
rasmus-madsen commented 2 years ago

if we believe our code is fine, we can consider to use this switch

Well the tool tells us it is not fine - why are we working so hard to modify the tool behavior when we can fix the code, and have the tools agree on the behavior?

I think the fact that we have two tools that agree is a big gain, and we should try to avoid adding special options, or workarounds on the tool itself if at all possible to fix in the code.

that being said I also agree with Rupert's comment - this is an real loop. unless I am missing something obvious! Cadence FAE confirms this is flagged on purpose. Maybe we just disagree on what "real" means in the "real" combinatorial loop claim?

either way in previous situations, we have updated the code to work for both tools rather than trying to force one tool into a specific behavior.

have we activated propagation in ascent? I took a quick look and I don't see it but I might be looking in the wrong place. I think it is default off - which could be the explanation to why this is not caught. @weicaiyang maybe you can check that?

.@rasmus-madsen did you use this switch in some other projects?

I have never run into these problems before - but I can read that the option potentially changes code behavior as some things are pushed to a different scheduling stage that cannot retrigger events.

tjaychen commented 2 years ago

i agree we don't need a switch. However, i disagree that is a real loop. As I'm trying to explain here, this is NOT a loop in terms of real circuits. It might be from the perspective of the simulator, but that is my opinion an oddity, and again, this would not synthesize into a real loop.

Like i tried to explain, if by adding a buffer the "loop" goes away, then it is not REAL in the synthesized sense, but it may be for the simulator. We can put an assignment statement for that. But again, I 100% disagree this is a real loop, otherwise lint would have flagged it.

tjaychen commented 2 years ago

summarizing a bit, I think what we're saying now is..

  1. This is a real loop for cadence, based on how that simulator works.
  2. This (i believe) is not a real loop for synthesis.

The first we can work around very easily, although it sounds like maybe we should add some style guide updates to not directly crisscross signals between FSMs without going through some kind of assign element (my guess is if we through ports it also doesn't matter, since this kind of behavior definitely exists elsewhere).

The second I will double check via other tools, but I'm fairly certain.

Does that sound okay to everyone?

rasmus-madsen commented 2 years ago

@tjaychen @rswarbrick so after giving this some more thought I think there is a semantical difference here - which is why we disagree!

I think we all agree that this is a simulation issue - not a synthesis issue. and the reason for this is depending on the scheduling - of which there is none in real hardware.

so the loop here is probably better named an EVENT LOOP rather than a combinatorial loop. because it is the event schedule that loops due to the x.

I don't think that changes a whole lot about the solution space - but it might make us agree more on what the problem is. I still believe that xcelium is correct about flagging this as an issue. as the code in the case of X is ambiguous. but I think @tjaychen is correct in claiming this is not truly combinatioral.

in any case, I still believe the correct solution would be to update the code to remove any ambiguities.

tjaychen commented 2 years ago

yes agreed I will update the code. (please see my latest comment there, I will also reword the PR to remove mention of "tool" issues).

I also think per @rswarbrick examples earlier, we need to come up with some kind of style guide suggestion to head these things off in the future. Can you guys think of any general description that would capture this well?

This sort of thing where one FSM triggers another, which then sends a "done" back to advance the state is actually quite common, so I don't know exactly how to word this. Should we say that if two FSM's in the same module are talking to each other, we should have temporary variables (assign statements) to breakup the signals?

rasmus-madsen commented 2 years ago

not sure what the general recommendation would be. but I would suggest using assigns or setting the other branch leg of an if. the a=0 if (b) a = 1

is causing the event loop - so rewriting this to if(b) a = 1 else a = 0

would also fix the issue.

lastly splitting the processes into smaller independent ones would also solve this.

@rswarbrick any preferences or suggestions?

rswarbrick commented 2 years ago

Yeah, I'm not sure about a good style suggestion here. In the case of this code, the best thing is probably to split out the rma_wipe_req signal, since there's nice way to write it without needing another case statement. I can definitely think of situations where it's not so obvious how to do this nicely. :-(

@tjaychen: I completely agree that this is a spurious silliness caused by an interaction between X propagation rules and the SystemVerilog scheduling semantics. It's a bit unfortunate that the language we're using to describe our hardware has this footgun built in. But it does. And many others! So I guess we need to treat this like we code for linters: write code in a constrained subset of the language that doesn't trigger any of the over-eager lint checks, or overly-stupid simulator schedulers. Doing so without tool support is almost impossible so we should expect to run into this sort of thing whenever we run code with a new tool. Such is life!

Finally, I'm still a little bit surprised that Verilator doesn't spit out UNOPTFLAT warnings about this construct, since there is a genuine loop in the front-end RTL (even though it will never cause a problem post-synthesis). I might try to make a smaller example and raise a bug report in the new year.

nikola-miladinovic commented 2 years ago

@weicaiyang @rasmus-madsen Please do not forget to create and mention here VCS issue for XPROP settings. If I understood correctly, we should test XPROP inside all instances of DUT, not only at file DUT itself.

weicaiyang commented 2 years ago

@weicaiyang @rasmus-madsen Please do not forget to create and mention here VCS issue for XPROP settings. If I understood correctly, we should test XPROP inside all instances of DUT, not only at file DUT itself.

@nikola-miladinovic as I replied earlier, we already enabled all instances of DUT to test with xprop for VCS. VCS doesn't interpret this as a dead loop. It behaves more align with synthesis tools.

tjaychen commented 2 years ago

coming back to this a bit late, but i will update per the suggestions and make it clearer in the code what was the problem we encountered.

cdgori commented 2 years ago

For whatever it is worth, this has been a major divergence point between VCS and $(Cadence sim) for many many years (10+ at least), because I have encountered this exact problem in the past (two interlocked always statements each using the set-output-to-default-at-the-top-of-the-block idiom). VCS either recognizes this construction or does some kind of loop-squashing here.

I also agree with Tim that it's not a real loop, X isn't a real value in hardware. But we do need to be able to sim for a great many reasons so Xprop analysis and fixing becomes a reality. If you want to say "avoid this idiom" in the RTL style I would be ok with it - I can make arguments for and against having all outputs assigned a known value at each terminal branch of conditionals in a block. Most of the arguments against are now caught by tools (the most common one is when someone forgets to assign one output and latch inference is detected), so I'm mostly in favor of avoiding the default-at-the-top idiom these days.