twinbasic / lang-design

Language Design for twinBASIC
MIT License
11 stars 1 forks source link

Handling Nulls (and possibly invalid values, too) #11

Open bclothier opened 3 years ago

bclothier commented 3 years ago

Is your feature request related to a problem? Please describe. This is a issue that's dear to me, probably because I work with Access & SQL Server and am always annoyed that I don't have a proper nullable type system to work with and find it clumsy to work with nulls. Variant is the only data type capable of storing a Null value but then it goes on to muddy the water with storing any other data types, which is contrary to how database engines expect their data to be. Therefore, this code is dangerous:

Dim SomeDate As Date

SomeDate = Me.SomeBoundDateControl.Value

The SomeBoundControl is surely bound to a column that can only contain a date. Yet that same column could potentially contain a Null. The Value property of an Access control is a Variant data type. Thus I could do:

Me.SomeBoundDateControl.Value = "Feb 2 1980"
Me.SomeBoundDateControl.Value = 1 'Oh, hi, 1899 called!
Me.SomeBoundDateControl.Value = ""
Me.SomeBoundDateControl.Value = vbEmpty 'That is not the null you're looking for

What utter nonsense! And this also can be a potential runtime error or worse will just coerce to some valid value which is almost never what one wants.

Within Access, we can guard against nulls using Nz() function or IsNull() function:

Dim SomeDate As Date

SomeDate = Nz(Me.SomeBoundDateControl.Value, 0)

But because it's Variant, this becomes woefully inadequate:

Dim SomeDate As Date

SomeDate = Nz(Me.SomeUNBoundControlContainingUserInput.Value, 0

Nothing's stopping the user from typing in an empty string or nonsense like "Feb 31 2021". Therefore, it's common to guard against invalid inputs by something like the following:

For Date data:

If IsDate(...) Then

For String data:

If Len(... & vbNullString) Then

For any integer data (may be too strict):

If Not ... Like "*[!0-9]*" Then

Or (may be too loose)

If IsNumeric(...) Then

All those are horrid. Why must we go through the mental gymnastic of checking that a Variant is a proper data type and otherwise convert it into the desired data type? The default behavior of raising an error is also horrid because now we have error-as-the-control which is also a bad coding pattern to have.

Describe the solution you'd like We need a way of properly constraining the allowable values. As an example, I should be able to do this:

Dim SomeDate As Date?

SomeDate = Me.SomeBoundControl.Value

If IsNull(SomeDate) Then
  'No valid date was entered
Else
  'We have a valid date
End if

The Date? would indicate that it can contain any valid Date value OR Null. Not strings. Not integers. Not floats. Not some hairball from your crazy aunt Bea's even crazier cat.

I'm not sure how this could be implemented. If we want a full compatibility with VB* clients, the tB compiler could emit additional instructions to do a safe coercion of the invalid value into Null so that the Variant will only be of either VT_NULL or VT_DATE in this situation. That results in less cluttered code.

However, that may end up doing too much and possibly violating the principle of least astonishment. Alternatively the tB compiler could emit instruction to raise an error if it cannot fit either VT_NULL or VT_DATE so there's no surprises when garbage input is given and Null is gotten back.

Yet another approach is to treat this as a compile-time inspection only. There would be nothing special for the VB* consumers but for tB code, it would be easy to find suspicious casts and then mark them to indicate that they should only contain a certain data type or Null.

Also, there is a complementary approach to borrow from .NET (mainly C#). We can do this instead:

SomeDate = Forms("SomeForm")?.Controls("SomeControl)")?.Value

The ?. operator indicates that if the object is null, the Value should return null as well, which helps to prevent runtime error when the form SomeForm or the control SomeControl are not open or otherwise available for consumption, thus preventing a runtime error. That eliminates the need for an On Error Resume Next + On Error GoTo 0 block around the dangerous access, which again helps with the code readability.

And there's also this operator:

SomeDate = Me.SomeBoundControl.Value ?? Date()

which indicates that if it's Null (or maybe invalid value, too?), Date() is provided as the default. This is like Access' Nz() function but that wouldn't require everyone to reference the Access object library just to have that functionality.

Those 2 operators can be used in conjunction with the nullable types which would then immensely simplify the process in handling values without having to get into ugly mechanics of handling each data type differently.

mansellan commented 3 years ago

There's also ??=, which assigns the RHS if and only if the LHS is null, and ?[], which provides a value from an indexed collection if and only if the collection is not null (which would be ?() in VB-style syntax).

mwolfe02 commented 3 years ago

Yes, @bclothier . A thousand times, yes. This is the compile-time type safety that is sorely missing from VBA.

I share @mansellan's concerns with bloatage, but this could go a long way towards making it simpler to write data-heavy code (such as in MS Access).

Kr00l commented 3 years ago

How about including the MS Access Nz function into the VBA library of tB ?

Below function should be 100% compatible. (returning Empty if Value is null)

Public Function Nz(ByRef Value As Variant, Optional ByRef ValueIfNull As Variant = Empty) As Variant
    If IsNull(Value) Then Nz = ValueIfNull Else Nz = Value
End Function
WaynePhillipsEA commented 3 years ago

Yes, I think that would be a good addition.

mburns08109 commented 3 years ago

How about including the MS Access Nz function into the VBA library of tB ?

Below function should be 100% compatible. (returning Empty if Value is null)

Public Function Nz(ByRef Value As Variant, Optional ByRef ValueIfNull As Variant = Empty) As Variant
    If IsNull(Value) Then Nz = ValueIfNull Else Nz = Value
End Function

But - in backwards compatibility terms - that would not exactly be 100% backwards compatible with the Access NZ() function at present, would it? I mean, in places where Access's NZ() would throw an error, this function may not throw the same errors, would it?

mansellan commented 1 year ago

I've been thinking some more about this. There are many ways of representing an absence of something in VBx, many of them relate to Variants:

One massive headache here is when it comes to method invocation. If you accept a Variant that you're expecting (but not sure) contains an object, you have two choices:

  1. Try and call the method you're expecting to be there, and handle any error if the Variant contains something else (or nothing at all)
  2. Test for every possible nothingness condition before making the call

Neither are optimal from a boilerplate or performance point of view.

To be clear, this is a problem in many languages. Tony Hoare, the inventor of the null reference, called it his billion dollar mistake (I would argue that, by now and adjusted for inflation, it's many trillions of dollars in 2023).

Describe the solution you'd like Everything @bclothier noted in the OP. Particularly, a set of robust null operators, of which I'd argue the most important is ?.. This would mean: "Do I have an object reference? If so, continue to evaluate this line of code. Otherwise, bail out".

As he stated, that would mean that it's safe to write this:

SomeDate = Forms("SomeForm")?.Controls("SomeControl")?.Value

without checking two different points for failures, or accepting and handling the failure as an error.

mansellan commented 1 year ago

Note, this doesn't protect from a situation where the wrong type of object is supplied. But in practice, that's going to be far less likely, and thus fine to treat as a runtime error. As a C# developer, I see Variant as roughly equivalent to dynamic in C#.