Closed bpringlemeir closed 1 year ago
Note that version 10.3 is very old and no longer supported. But this bug is still present in the master branch.
I don't need to have it for 10.3, I have a work around (assign the constant to a register; ok for a test bench). I was just reporting in case it was still present.
I reproduced with 22.04 version 11.0 release on stock Linux. I recompile with 12 development master and reproduced.
The VVP files all have,
%vpi_call/w 2 5 "$display", "Const Negative %s", "\000NO" {0 0 0};
Is it something to do with the parsing of "NO" or why would a NUL character be added by an optimization pass?
If I edit the file and remove the '\000', then "NO" is printed. It is not a VVP issue; but probably you knew that.
The rules for ternary operators are that both arguments are expanded to the width of the widest argument. In the non-optimized case a temporary variable that is a vector gets generated. E.g.
reg [23:0] v;
if (expected) v = "Yes"; else v = "No";
$display(..., v);
When assigning a string literal to a vector that is wider it gets left padded with 0s. $display then ignores those extra 0. This is the correct behavior.
The confusion now happens with the optimization pass that removes the ternary operator and replaces it with a constant. Then the code generation pass thinks it is still a string literal and generates a string with the leading '\000', which the cases the string to be ignored.
I'm not 100% sure what the right fix is, but maybe best is to remove the string literal flag from a constant when it is width expanded. Since e.g. the following works as expected.
$display("Const Negative %s", 0 ? "yes" : 16'h4e4f);
I'm not sure removing the literal attribute is correct and in this context the string is now three characters wide so dropping the NULL is not correct either. Maybe the display code needs to be updated to account for leading NULLs, of course how do we handle someone inserting a NULL as the first character? There is certainly some subtlety in this bug. The easiest fix is to make "No", " No" (e.i.
@martinwhitaker may have already started looking at the details, but someone is going to need to look at the standard closely to know exactly how this should behave in all contexts. The name brand simulator I use for work does work properly.
I've been scrolling through the standard earlier and it doesn't really say what to do with null-bytes in string literals, or at least I could not find it. The only thing it says is that leading null-bytes should be ignored by the %s format specifier. But whether that means something like $display("%s", "\000No")
should work is not clear to me. But it doesn't say that it should not work either. Question what should e.g. $display("%x", "\000N\000o");
print? Is this 4 bytes, 2 bytes or 1 byte wide?
If it's not clear then we have to fall back to what other simulators produce. For this case I get "Const Negative No" so the other simulator is treating this as the string "No", but having a width of three so is padding with a space. I need to try the more degenerate cases to see if their implementation is always consistent.
I tested this in a number of other simulators:
$display(":%x:%s:", "\000a\000b", "\000a\000b");
and got variously
:00610062:ab:
:00610062: :
:00::
:00610062: a b:
:00610062:a\000b:
:00610062: a b:
Verilog has no concept of a string termination character, so I think the a
and b
should always be printed. The Verilog standard states that leading zeros are never printed, so I think a space before the a
is wrong. What occurs between the a
and the b
depends on how you choose to handle non-printable characters - I would accept any of ab
, a b
, or a\000b
.
Icarus currently outputs
:::
which I think is completely wrong.
I think the space before the a is still right, as long as %s
is used due to the signal size based padding. If %0s
is used the space should disappear.
From a consistency point of view it makes sense to treat string literals the same as integer literals.
I'd vote for
%0s
%s
output with spaces to the width of the constant%s
or %0s
with spaces$display(":%x:%s:%0s:", "\000a\000b", "\000a\000b", "\000a\000b"); // -> ":00610062: a b:a b:"
$display(":%x:%s:%0s:", 32'h00610062, 32'h00610062, 32'h00610062); // -> ":00610062: a b:a b:"
Yes, I thought about the default field width after writing my last comment. Your suggestion gets my vote too.
I have been thinking about this and I also agree. Make sure to also check %0x which should trim the leading zeros like %0s :610062:
Looking at this again. There is nothing in the LRM that null-bytes in the middle should be replaced with a space. But there is a rule for casting vectors to string type.
A string literal assigned to a string variable is converted according to the following steps:
- All "\0" characters in the string literal are ignored (i.e., removed from the string).
For consistency it makes sense to follow this as well when formatting vectors as strings so that e.g.
$display("%s", string'(32h'00610062))
and $display("%s", 32h'00610062)
output the same value.
Another maybe interesting question is what should the following print?
$display($bits(""));
$display(":%s:%0s"", "", "");
We have one test that assumes :::
, but the LRM says "" is the same as 8'h0, so it should be : ::
And one more problem. The vlog95 backend strips leading null-bytes in string literals. Which changes the behavior when e.g. passing the value to %s
. @caryr you added this, do you remember if there was a specific reason?
Below is a test case.
I am using the WSL2 with Ubuntu 20.04 release of the iverilog and vvp.
The Const Negative line does not supply the string 'NO'. If the constants are verilog defines, then the use case becomes more apparent.