dotnet / vblang

The home for design of the Visual Basic .NET programming language and runtime library.
290 stars 64 forks source link

Boolean literals operations can differ from Boolean variable operations #100

Open Bill-McC opened 7 years ago

Bill-McC commented 7 years ago

Possible bug (or by design?)

        Dim boolFactory As BooleanFactory

        Dim operand1 = boolFactory.GetBool(1)
        Dim operand2 = boolFactory.GetBool(2)

        Dim result As Boolean

        result = operand1  ' result is True
        result = operand2  ' result is True

        result = operand1 = True  ' result is True
        result = operand2 = True  ' result is True

        result = operand1 = operand2 ' result is False 

        result = operand1 And operand2 ' result is False
        result = operand1 And True ' result is True
        result = operand2 And True 'result is False

        result = operand1 AndAlso operand2 'result = True
<StructLayout(LayoutKind.Explicit)>
Public Structure BooleanFactory
    <FieldOffset(0)> Public value As Int32
    <FieldOffset(0)> Public bool As Boolean

    Public Function GetBool(ByVal seed As Int32) As Boolean
        Me.value = seed
        Return Me.bool
    End Function
End Structure
rskar-git commented 7 years ago

Looks to me that (generally reasonable) shortcuts were made in regards to Boolean. That's the problem with unions (as effected by the <FieldOffset(0)>), you're likely to get these sort of crazy edge cases.

Here's my naive interpretation of things:

(1) A "valid" True is represented by having the least significant bit set to 1 and all other bits (as they may be) set to 0 (False all 0's, of course).

(2) However, sometimes non-zero is as good as True (an "invalid" True) for some operations, for sake of convenience or performance.

(3) Hence a Quickwatch on result may report as True even for an "invalid" True. That may depend on how Boolean.ToString() is actually implemented.

(4) result = operand2 = True might seem to work if the compiler optimizes on x = True and x = False situations, and simply tests on whether x is non-zero or zero.

(5) result = operand1 = operand2: I wonder if this became either: (1) a Structure to Structure equality test, where True only if every corresponding bit matches; or (2) more likely, an Integer equality test, which works for all "valid" True and False. A more generalized equality test also good for "invalid" True would probably be a little more compute intensive.

(6) result = operand1 And operand2: Interesting, but not surprising, if a bit-wise And was done Integer-style. Comes up False because operand1 has an image of an Integer of 1; and operand2, image of 2. I'd guess a more generalized And for Boolean also good for "invalid" True would probably be a little more compute intensive.

(7) result = operand1 And True and result = operand2 And True: Huh, I guess True simply is transformed to a "valid" True, and then an Integer-style bit-wise And gets done.

(8) result = operand1 AndAlso operand2: Since short-circuit logic involves jumps, conditional branches on either zero or non-zero are likely done. So not surprised that "invalid" True works here.

For fun, I did the C# equivalent, with surprising results:

BooleanFactory boolFactory = new BooleanFactory();
var operand1 = boolFactory.GetBool(1);
var operand2 = boolFactory.GetBool(2);
bool result;

result = operand1;              // result is True
result = operand2;              // result is True

result = operand1 == true;      // result is True
result = operand2 == true;      // result is True

result = operand1 == operand2;  // result is False 

result = operand1 & operand2;   // result is False
result = operand1 & true;       // result is True
result = operand2 & true;       // result is False in VB, but True in C#!

result = operand1 && operand2;  // result = True in VB, but False in C#!

[StructLayout(LayoutKind.Explicit)]
public struct BooleanFactory
{
    [FieldOffset(0)] Int32 value;
    [FieldOffset(0)] bool bool_;

    public bool GetBool(Int32 seed)
    {
        this.value = seed;
        return this.bool_;
    }

}

On the basis of the above, maybe a compiler optimization for x And True and x And False situations may be helpful here?

OTOH, maybe you shouldn't be doing what you're doing with this BooleanFactory thingy.

Bill-McC commented 7 years ago

Absolutely shouldn't be doing what I am with the BooleanFactory, but neither compiler or runtime is stopping me ;)

re : (1) So we have introduced a concept of "valid" or invalid Booleans (is that a quantum Boolean ?) .

(2) & (3). Yes, typically comparison is done to zero (aka False), and Boolean's ToString compares to zero returning "False" otherwise "True". No concept of invalid.

(4) Yes, compiler will optimize equality with True to non equality with zero. n.b: it only does this with the literal True. So a If bool = True Then is compiled to equality test with zero, equivalent of If bool <> False Then

(5) compiler just does a c.eq on the values on the stack which is an integer equality test

(6) again compiler does bitwise And on the integer values on the stack. Not sure of the point of bitwise And's on Booleans

(7) & (8) Yep.

Re C# code. The & operator in C# is defined as doing a logical and if both operands are Booleans. I would have expected Vb's And to do the same. I think there the C# code is correct. But the C# && result seems wrong to me. I would have expected that to be the same as doing the two comparisons to True.

Nukepayload2 commented 7 years ago

It's not a bug. Because Boolean is not a blittable type.

https://docs.microsoft.com/en-us/dotnet/framework/interop/blittable-and-non-blittable-types

Convert an Int32 value to a Boolean value with FieldOffsetAttribute or C# unsafe pointers may produce a corrupted Boolean value.

The internal implementation of the Boolean structure maybe like this:

Public Structure BclBoolean
    Private value As Byte, alignment1 As Byte, alignment2 As UShort

    Public Shared Operator =(value1 As BclBoolean, value2 As BclBoolean) As Boolean
        Return value1.value = value2.value
    End Operator

    Public Shared Operator <>(value1 As BclBoolean, value2 As BclBoolean) As Boolean
        Return value1.value <> value2.value
    End Operator

    Public Shared ReadOnly [False] As BclBoolean
    Public Shared ReadOnly [True] As New BclBoolean With {.value = 1}

    Public Overrides Function ToString() As String
        Return If(value = 0, "False", "True")
    End Function
End Structure
rskar-git commented 7 years ago

@Nukepayload2, thanks for the blittable reference!

@Bill-McC, if you check out in that reference the conversion done for non-blittable Boolean:

System.Boolean: Converts to a 1, 2, or 4-byte value with true as 1 or -1.

If you'd rather, we could at least say there is an Official True (the technically true True, the best kind of True!), as evidenced by the above reference; and I'll say nothing more about "valid"/"invalid" whatever - no need to get all quantum here.

Why bit-wise And for Boolean? I'm certain bit-wise is super easy and fast for the CPU to do. Done that way for performance.

"The & operator in C# is defined as doing a logical and if both operands are Booleans." No, not really. It got result = operand1 & operand2; wrong too (or do you accept False as the "true" answer? :).

Bill-McC commented 7 years ago

Not sure I get the relevance of blittable or non blittable types: to me that's to do with interop conversions, hence the Boolean is converted to -1 for COM/OLE interop, otherwise 1. Similarly strings are converted to BStr's or null terminated strings. When we work with strings, we don't need to check if it is null terminated to avoid buffer overrun in .NET, likewise we shouldn't be worrying about whether or not a Boolean's value is other than 0. With strings in fact we can have null characters.

Yes, the C#, result = operand1 & operand2; should be returning true. It got the & true optimisations correct, and did a logical and as per the language spec, but failed with two variables. The VB And True is particularly bad as not only does it do a bitwise, but does a bitwise with 1.

There's definitely an issue of optimisation, but if that is the case then the language spec(s) should reflect that, not claim a logical operation is being done when a bitwise one is.