steveicarus / iverilog

Icarus Verilog
https://steveicarus.github.io/iverilog/
GNU General Public License v2.0
2.85k stars 530 forks source link

Ternary operator with constants to $display() #827

Closed bpringlemeir closed 1 year ago

bpringlemeir commented 1 year ago

Below is a test case.

module test();
   reg expected = 1;
   initial begin
      $display("Const Positive %s", 1 ? "yes" : "NO");
      $display("Const Negative %s", !1 ? "yes" : "NO");
      $display("Var Positive %s", expected ? "yes" : "NO");
      $display("Var Negative %s", !expected ? "yes" : "NO");
      $finish;
   end
endmodule

I am using the WSL2 with Ubuntu 20.04 release of the iverilog and vvp.

$ iverilog -V | head -n 1
Icarus Verilog version 10.3 (stable) ()
Unable to get version from "/usr/lib/x86_64-linux-gnu/ivl/ivlpp -V"
Unable to get version from "/usr/lib/x86_64-linux-gnu/ivl/ivl -V -C"/tmp/ivrlh337f554a" \
    -C"/usr/lib/x86_64-linux-gnu/ivl/vvp.conf""
$ iverilog -Wall -g2005-sv test.v
$ vvp a.out
Const Positive yes
Const Negative
Var Positive yes
Var Negative  NO

The Const Negative line does not supply the string 'NO'. If the constants are verilog defines, then the use case becomes more apparent.

martinwhitaker commented 1 year ago

Note that version 10.3 is very old and no longer supported. But this bug is still present in the master branch.

bpringlemeir commented 1 year ago

I don't need to have it for 10.3, I have a work around (assign the constant to a register; ok for a test bench). I was just reporting in case it was still present.

bpringlemeir commented 1 year ago

I reproduced with 22.04 version 11.0 release on stock Linux. I recompile with 12 development master and reproduced.

The VVP files all have,

%vpi_call/w 2 5 "$display", "Const Negative %s", "\000NO" {0 0 0};

Is it something to do with the parsing of "NO" or why would a NUL character be added by an optimization pass?

If I edit the file and remove the '\000', then "NO" is printed. It is not a VVP issue; but probably you knew that.

larsclausen commented 1 year ago

The rules for ternary operators are that both arguments are expanded to the width of the widest argument. In the non-optimized case a temporary variable that is a vector gets generated. E.g.

reg [23:0] v;
if (expected) v = "Yes"; else v = "No";
$display(..., v);

When assigning a string literal to a vector that is wider it gets left padded with 0s. $display then ignores those extra 0. This is the correct behavior.

The confusion now happens with the optimization pass that removes the ternary operator and replaces it with a constant. Then the code generation pass thinks it is still a string literal and generates a string with the leading '\000', which the cases the string to be ignored.

I'm not 100% sure what the right fix is, but maybe best is to remove the string literal flag from a constant when it is width expanded. Since e.g. the following works as expected.

$display("Const Negative %s", 0 ? "yes" : 16'h4e4f);
caryr commented 1 year ago

I'm not sure removing the literal attribute is correct and in this context the string is now three characters wide so dropping the NULL is not correct either. Maybe the display code needs to be updated to account for leading NULLs, of course how do we handle someone inserting a NULL as the first character? There is certainly some subtlety in this bug. The easiest fix is to make "No", " No" (e.i. No) to avoid padding with NULL, but I don't think that is something we can actually do internal to Icarus, but is perfectly okay to do in your code.

@martinwhitaker may have already started looking at the details, but someone is going to need to look at the standard closely to know exactly how this should behave in all contexts. The name brand simulator I use for work does work properly.

larsclausen commented 1 year ago

I've been scrolling through the standard earlier and it doesn't really say what to do with null-bytes in string literals, or at least I could not find it. The only thing it says is that leading null-bytes should be ignored by the %s format specifier. But whether that means something like $display("%s", "\000No") should work is not clear to me. But it doesn't say that it should not work either. Question what should e.g. $display("%x", "\000N\000o"); print? Is this 4 bytes, 2 bytes or 1 byte wide?

caryr commented 1 year ago

If it's not clear then we have to fall back to what other simulators produce. For this case I get "Const Negative No" so the other simulator is treating this as the string "No", but having a width of three so is padding with a space. I need to try the more degenerate cases to see if their implementation is always consistent.

martinwhitaker commented 1 year ago

I tested this in a number of other simulators:

$display(":%x:%s:", "\000a\000b", "\000a\000b");

and got variously

:00610062:ab:
:00610062:    :
:00::
:00610062: a b:
:00610062:a\000b:
:00610062: a b:

Verilog has no concept of a string termination character, so I think the a and b should always be printed. The Verilog standard states that leading zeros are never printed, so I think a space before the a is wrong. What occurs between the a and the b depends on how you choose to handle non-printable characters - I would accept any of ab, a b, or a\000b.

Icarus currently outputs

:::

which I think is completely wrong.

larsclausen commented 1 year ago

I think the space before the a is still right, as long as %s is used due to the signal size based padding. If %0s is used the space should disappear.

From a consistency point of view it makes sense to treat string literals the same as integer literals.

I'd vote for

$display(":%x:%s:%0s:", "\000a\000b", "\000a\000b", "\000a\000b"); // -> ":00610062: a b:a b:"
$display(":%x:%s:%0s:", 32'h00610062,  32'h00610062,  32'h00610062); // -> ":00610062: a b:a b:"
martinwhitaker commented 1 year ago

Yes, I thought about the default field width after writing my last comment. Your suggestion gets my vote too.

caryr commented 1 year ago

I have been thinking about this and I also agree. Make sure to also check %0x which should trim the leading zeros like %0s :610062:

larsclausen commented 1 year ago

Looking at this again. There is nothing in the LRM that null-bytes in the middle should be replaced with a space. But there is a rule for casting vectors to string type.

A string literal assigned to a string variable is converted according to the following steps:

  • All "\0" characters in the string literal are ignored (i.e., removed from the string).

For consistency it makes sense to follow this as well when formatting vectors as strings so that e.g. $display("%s", string'(32h'00610062)) and $display("%s", 32h'00610062) output the same value.

larsclausen commented 1 year ago

Another maybe interesting question is what should the following print?

$display($bits(""));
$display(":%s:%0s"", "", "");

We have one test that assumes :::, but the LRM says "" is the same as 8'h0, so it should be : ::

larsclausen commented 1 year ago

And one more problem. The vlog95 backend strips leading null-bytes in string literals. Which changes the behavior when e.g. passing the value to %s. @caryr you added this, do you remember if there was a specific reason?