Open svick opened 1 year ago
@svick I assume that branchless IL would be generally faster but I don't know how strong this assumption is. It would be helpful if we had specific benchmark data points for the proposed optimizations.
Tagging @jjonescz in case he's interested to pick this up (was doing other similar optimizations recently).
Note the value stored in bool? is not guaranteed to be 1,0 or null, but comparisons in C# must produce 1 or 0, I think.
comparisons in C# must produce 1 or 0
@AlekseyTs they definitely don't today. For example b == true
is lowered to just b
. And so this results in 2
:
Unsafe.BitCast<bool, byte>(Unsafe.BitCast<byte, bool>(2) == true)
The standard says for example that bool
equality operator produces true
/false
[12.12.5 Boolean equality operators]. Where false
is specified as 0 and true
is different than false
[8.3.9 The Bool type].
they definitely don't today. For example b == true is lowered to just b. And so this results in 2
An example doesn't really make it right. What is the value used by the language for true
literal? In my mind this is the value of true
. I think this should be brought to LDM for a decision.
What is the value used by the language for
true
literal?
The standard says its unspecified but different than 0
[8.3.9 The Bool type]. That looks quite clear to me that it can be any number.
That looks quite clear to me that it can be any number.
"Unspecified" and "any number" are not the same thing. My interpretation, it is a specific number, but its value is implementation dependent. The only thing we can definitely say, that the underlying value of true
is not 0.
This, for example, prints 0 and 0.
bool b = true;
System.Console.WriteLine(Unsafe.BitCast<bool, byte>(Unsafe.BitCast<byte, bool>(2) == b));
b = false;
System.Console.WriteLine(Unsafe.BitCast<bool, byte>(Unsafe.BitCast<byte, bool>(2) == b));
I will not be surprised if LDM decides that it doesn't care about scenarios where a boolean underlying value doesn't match true
or false
. However, from my point of view, language specification doesn't explicitly say that.
This, for example, prints 0 and 0.
Interesting.
So I see these options:
bool
s that are not represented as 1 or 0. If we get them, the behavior is undefined. Hence bool b = true; Unsafe.BitCast<byte, bool>(2) == b
is false
but Unsafe.BitCast<byte, bool>(2) == true
is true
and that's fine. Probably need to update the spec to say that.
bool
outputs are always 1 or 0. For example, b == true
could not be lowered to b
. Seems weird if we allow the inconsistency described above to then try hard to ensure the output is represented in bounds.bool
outputs can be anything. Then existing behavior and the proposed codegen optimization is fine.bool
s that are not represented as 1 or 0. That would require more IL to be emitted for existing scenarios, I think. For example, b1 == b2
probably cannot simply use ceq
to compare those booleans anymore.
bool
outputs are well-defined true
or false
, where true
can be any byte except 0. We can keep doing some optimizations like lowering b == true
to b
. This seems to fit the spec as written today.bool
outputs are always 1 or 0.The history and rationale on how we handle true
values that aren't exactly true
is long. Probably one of the more detailed discussions is in #24652. That is ~5 years old now but a lot of the rationale still holds. The TLDR of that is that C# cares about the bool
values true
(whatever that value is) and false
(defined as zero). Every other bool
value results in undefined behavior.
"Unspecified" and "any number" are not the same thing. My interpretation, it is a specific number, but its value is implementation dependent. The only thing we can definitely say, that the underlying value of true is not 0.
I agree this is the correct interpretation of the spec. At the same time the spec seems needlessly vague here. If the numeric value for true
is anything other than 1
then it would massively subvert the expectations of the ecosystem. The compiler docs already state that true
must be 1
(in retrospec though at the time we wrote that doc we should've changed the actual spec).
This area is difficult enough to navigate at times and having this vague definition for true
isn't making it any easier. Think we should follow up with LDM here and get the spec changed to say true
is 1
.
I will not be surprised if LDM decides that it doesn't care about scenarios where a boolean underlying value doesn't match true or false.
I agree. Historically LDM has let the compiler break behavior for undefined values of bool
. This came up quite a bit during pattern matching. In part because exhaustiveness rules were changed to only recognize true
and false
. That resulted in breaking changes that we accepted. Pattern matching also changed how we code generated certain comparisons and that broke behavior for undefined values of bool
(having trouble finding the exact bugs at the moment but remember this happening) and we were comfortable with the breaks.
Think there are two actions we need from LDM:
true
is 1
... cause any other value doesn't really workbool
. Think we should follow up with LDM here and get the spec changed to say true is 1.
Yes please.
The behavior of null conditional operator (?.
) and methods that return bool values must also be considered. For example:
static class C
{
static bool M1(object o) => nb?.F() == true;
static bool M1b(object o) => nb?.F() is true;
static bool F(this object o) => true;
}
While == true
seems to generate optimal IL, is true
comparison results in IL that unnecessarily involves a bool? conversion.
Currently, the IL generated for operations on a
bool? nb;
likenb == true
andnb is true
is suboptimal. And since the runtime does not seem to understand the semantics ofNullable<T>
, this results in suboptimal machine code being executed.Namely:
nb == true
generates branchless IL (see https://github.com/dotnet/roslyn/issues/17261) equivalent tonb.GetValueOrDefault() == true & nb.HasValue
, when it could be justnb.GetValueOrDefault()
.nb is true
generates branched IL equivalent tonb.HasValue ? nb.GetValueOrDefault() : false
, when it could also be justnb.GetValueOrDefault()
.nb is false
generates branched IL equivalent tonb.HasValue ? !nb.GetValueOrDefault() : false
when it could be the branchless!nb.GetValueOrDefault() & nb.HasValue
.nb ?? true
generates branched IL equivalent tonb.HasValue ? nb.GetValueOrDefault() : true
, when it could be the branchlessnb.GetValueOrDefault() | !nb.HasValue
.Note that
nb ?? false
already produces the optimal code ofnb.GetValueOrDefault()
.Also note that the suggested code does not contain comparisons with
true
, which also result in unnecessary IL and machine code.SharpLab link.