AssemblyScript / assemblyscript

A TypeScript-like language for WebAssembly.
https://www.assemblyscript.org
Apache License 2.0

improving nullability flow checking #1973

Open trusktr opened 3 years ago

trusktr commented 3 years ago

From https://github.com/AssemblyScript/assemblyscript/issues/1972#issuecomment-878599169

dcodeIO commented 3 years ago

Can you provide some context? In particular, where would this be safe, since without a runtime check we'd lose type safety and as such would introduce potential hazards?

jtenner commented 3 years ago

Hey does this already work with --no-assert?

Context: I've already done a pointer check and don't need to perform the actual assertion

trusktr commented 3 years ago

@willemneal What was the idea here? Maybe it is not possible to remove the runtime aspect?

willemneal commented 3 years ago

Consider:

function stringOrNull(s: string | null): string | null { return s; }

let s = stringOrNull("hello")!; // The type system should know this is non-null, but the `!` confirms it.
let n = stringOrNull(null)!; // Not safe

With the proposed option for !, the first case above is safe, while the second isn't. However, this is also true for TypeScript. The idea here is that by default ! performs a runtime check to ensure that it's safe, but in cases where the author knows better than the compiler, an option to skip that check can help.
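To illustrate the distinction, here is a minimal TypeScript sketch, not AssemblyScript's actual implementation. The helper names checkedNonNull and uncheckedNonNull are hypothetical: the first models the default behavior of ! described above (a runtime check), the second models the proposed unchecked variant, which simply trusts the author.

```typescript
function stringOrNull(s: string | null): string | null {
  return s;
}

// Hypothetical model of the default `!`: verify at runtime, then narrow.
function checkedNonNull<T>(value: T | null): T {
  if (value === null) {
    throw new Error("unexpected null");
  }
  return value;
}

// Hypothetical model of the proposed unchecked `!`: narrow the type only.
// No check is performed, so a wrong assumption goes unnoticed here.
function uncheckedNonNull<T>(value: T | null): T {
  return value as T;
}

const s: string = checkedNonNull(stringOrNull("hello")); // passes the check
const n: string = uncheckedNonNull(stringOrNull(null)); // typed as string, but holds null
```

In TypeScript the second call only defers the failure to a later property access; what the equivalent mistake would cost in AssemblyScript is the safety question raised in this thread.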

As for @jtenner's point, that would work, but it might be the case that the author wants bounds checks or other assertions but not this particular one.

dcodeIO commented 3 years ago

The difference to TS here is that if an author is wrong, they won't be greeted with a "Cannot do X on null" runtime error but with a potential hazard corrupting memory, leading to undefined behavior, a broken runtime or silent security issues.

willemneal commented 3 years ago

True, but it's not the default semantics and the author is responsible.

And it is possible currently to make the following function to achieve the same result.

@inline
function isNotNull<T>(t: T): NonNullable<T> { return t; }

let s = isNotNull(stringOrNull("hello")); 
// or
let s = <string>stringOrNull("hello");

So really the change is syntactic sugar for the above approaches.

dcodeIO commented 3 years ago

Wait, are you saying that the use of NonNullable<T> is unsafe?

Edit: No, it's not, that results in

  ERROR TS2322: Type '~lib/string/String | null' is not assignable to type '~lib/string/String'.

   function isNotNull<T>(t: T): NonNullable<T> { return t; }

and the second inserts a runtime check.

willemneal commented 3 years ago

Sorry it's

   function isNotNull<T>(t: T): NonNullable<T> { return <NonNullable<T>>t; }

and I updated my previous comment to remove the !.

And it's only unsafe if you use it in a situation where it could be null.

dcodeIO commented 3 years ago

That's similar to <string>stringOrNull("hello"). The cast there inserts a runtime check as well to be safe.

willemneal commented 3 years ago

Oh I didn't know that casting resulted in a runtime check.

Here is NonNullable used in a safe place, so that the next generic call does not have null in its type: https://github.com/gagdiez/serial-as/blob/4a397fdd5e1bc6d303136f03912641cf00254d21/borsh/assembly/serializer.ts#L37

dcodeIO commented 3 years ago

In this case the compiler knows that t is never null and can omit the runtime check.

willemneal commented 3 years ago

Technically it shouldn't check in the first example since it should narrow the return type.

dcodeIO commented 3 years ago

TS doesn't do that, though:

function stringOrNull(s: string | null): string | null { return s; }
stringOrNull("asd").length; // Object is possibly 'null'.(2531)

willemneal commented 3 years ago

Ah shoot but it should!

I hate null and always have. Perhaps we should look into multivalue return for a standard Option type.

jtenner commented 3 years ago

Yeah once multivalues are supported, Option would be amazing

MaxGraey commented 3 years ago

With Option you get exactly the same thing. Instead of str! you would get str.unwrap(), instead of str || "" you would get str.unwrap_or(""), etc.
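The correspondence can be sketched in TypeScript as follows. The Option class here is a hypothetical, minimal illustration written for this comparison; it is not an existing AssemblyScript or TypeScript API, and unwrapOr is the camelCase counterpart of Rust's unwrap_or.

```typescript
// Hypothetical minimal Option type, for illustration only.
class Option<T> {
  private readonly value: T | null;

  private constructor(value: T | null) {
    this.value = value;
  }

  static some<T>(value: T): Option<T> {
    return new Option<T>(value);
  }

  static none<T>(): Option<T> {
    return new Option<T>(null);
  }

  // Counterpart of `str!`: fails at runtime when the Option is empty.
  unwrap(): T {
    if (this.value === null) {
      throw new Error("called unwrap() on an empty Option");
    }
    return this.value;
  }

  // Counterpart of `str || ""`: falls back to a default when empty.
  unwrapOr(fallback: T): T {
    return this.value === null ? fallback : this.value;
  }
}

const present = Option.some("hello");
const absent = Option.none<string>();

const a = present.unwrap();    // "hello"
const b = absent.unwrapOr(""); // ""
```

Either way the caller still has to decide what happens in the empty case; only the spelling changes.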

MaxGraey commented 3 years ago

Ah shoot but it should!

How? Same result in Rust:

fn string_or_null(s: Option<&str>) -> Option<&str> {
    s
}

fn main() {
    let len = string_or_null(Some("hello")).len();
}

output:

  |
6 |     let len = string_or_null(Some("hello")).len();
  |                                             ^^^ method not found in `Option<&str>`

So you should do something like this:

let len = string_or_null(Some("hello")).unwrap_or("").len();

I don't understand how a monadic Option can help with this.

willemneal commented 3 years ago

There is no null! And it will work for any type. With multivalue we can also handle Result types so that str.unwrap would return the error. This style of error handling is great for AS since we don't have exceptions.

MaxGraey commented 3 years ago

There is no null! And it will work for any type. With multivalue we can also handle Result types so that str.unwrap would return the error. This style of error handling is great for AS since we don't have exceptions.

No, monadic Result and Option are useful only in languages where they are really built in and which also have pattern matching.

I see a real solution for this: a Hindley-Milner type system, as used in functional languages. In that case the solution is quite clear:

function stringOrNull(s) { return s; }

stringOrNull("someStr").length; // ok due to call site infers as stringOrNull<string>(s: string)

let str: string | null = null;
stringOrNull(str).length; // Compile Error due to call site infers as stringOrNull<string | null>(s: string | null)

See this example in Hegel for example
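TypeScript does not use Hindley-Milner inference, but a similar call-site effect can be approximated today with an explicit type parameter. The sketch below is only an approximation of the idea above; passThrough and lengthOfMaybe are hypothetical names, and the rejected call is left as a comment because it does not compile under strictNullChecks.

```typescript
// Generic identity: T is inferred separately at every call site.
function passThrough<T>(s: T): T {
  return s;
}

// T is inferred from the argument as a non-nullable string type,
// so .length is allowed without any assertion.
const len = passThrough("someStr").length;

function lengthOfMaybe(str: string | null): number {
  // Here T is inferred as string | null, so this would be a compile error:
  // return passThrough(str).length;
  const result = passThrough(str);
  return result === null ? 0 : result.length;
}
```

The difference from the Hegel example is that the author has to write the type parameter; a Hindley-Milner style checker would infer it for the unannotated function.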

willemneal commented 3 years ago

No, monadic Result and Option are useful only in languages where they are really built in and which also have pattern matching.

That's a bold statement.

MaxGraey commented 3 years ago

No, monadic Result and Option are useful only in languages where they are really built in and which also have pattern matching.

That's a bold statement.

Just compare Maybe in Haskell with Rust's Option: https://www.futurelearn.com/info/courses/functional-programming-haskell/0/steps/27247

Even OCaml is less verbose than Rust: https://ocaml.org/api/Option.html

willemneal commented 3 years ago

I'd still say it's useful even if it's more verbose.

MaxGraey commented 3 years ago

I'd still say it's useful even if it's more verbose.

Why? Because it's done in Rust?

MaxGraey commented 3 years ago

Monadic Option and Result are very important concepts in pure functional languages because they are the only approach to handling and isolating side effects. But Rust is not a pure functional language and doesn't even have built-in monads. On the other hand, it has no exceptions or nullability because they don't fit its paradigm. In my opinion, monadic Option / Result in Rust look very unnatural and are therefore verbose.

willemneal commented 3 years ago

Rust also lets you do stringOrNull(str)?.length, which returns a result with an error.

MaxGraey commented 3 years ago

Rust also lets you do stringOrNull(str)?.length, which returns a result with an error.

TypeScript also supports this.

willemneal commented 3 years ago

True, but it has exceptions so we'd need to either support that or use a result type.

trusktr commented 3 years ago

True, but it has exceptions so we'd need to either support that or use a result type.

Or just replace anything that would return undefined with null and it would work in most cases. For example, map.get('key that does not exist') would return null instead of throwing a runtime error.

I think I would be okay with null instead of errors where TS would otherwise have undefined. In my own JS/TS code, I always check !foo or foo == null or foo != null, and hence in these scenarios the difference between undefined and null never matters. I think writing code any other way is asking for hazards.

So if we turn all undefined results into null, I think it would be good.


All of the following examples work in TypeScript, but I have labeled them with ERROR or OK to show which ones work or don't work in AssemblyScript:

// All these should have the same final result as in TypeScript.
for (let node: Node | null = el.childNodes[0]!.firstChild; node; node = node.nextSibling) i++ // COMPILE ERROR
for (let node: Node | null = el.childNodes[0]!.firstChild; node; node = node!.nextSibling) i++ // OK
for (let node: Node | null = el.childNodes[0]!.firstChild; node!; node = node.nextSibling) i++ // COMPILE ERROR
for (let node: Node | null = el.childNodes[0]!.firstChild; node!; node = node!.nextSibling) i++ // RUNTIME ERROR
for (let node: Node | null = el.childNodes[0]!.firstChild; node != null; node = node.nextSibling) i++ // COMPILE ERROR
for (let node: Node | null = el.childNodes[0]!.firstChild; node != null; node = node!.nextSibling) i++ // OK
//
// These while-loop variants should all work the same too.
let node: Node | null = el.childNodes[0]!.firstChild
while (node) { node = node.nextSibling; i++ } // COMPILE ERROR
while (node) { node = node!.nextSibling; i++ } // OK
while (node!) { node = node.nextSibling; i++ } // COMPILE ERROR
while (node!) { node = node!.nextSibling; i++ } // RUNTIME ERROR
while (node != null) { node = node.nextSibling; i++ } // COMPILE ERROR
while (node != null) { node = node!.nextSibling; i++ } // OK

where firstChild and nextSibling are nullable types (Node | null).
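For reference, the TypeScript behavior these examples rely on can be reproduced without a DOM. The following is a self-contained sketch; ListNode, countSiblings and countSiblingsWhile are hypothetical names standing in for Node and the loops above.

```typescript
// Hypothetical stand-in for a DOM-like node; only the field used here.
interface ListNode {
  nextSibling: ListNode | null;
}

function countSiblings(first: ListNode | null): number {
  let i = 0;
  // The loop condition narrows node to ListNode for the body and the
  // update expression, so node.nextSibling needs no `!`.
  for (let node: ListNode | null = first; node; node = node.nextSibling) i++;
  return i;
}

function countSiblingsWhile(first: ListNode | null): number {
  let i = 0;
  let node: ListNode | null = first;
  // The explicit null comparison narrows node in the same way.
  while (node != null) {
    node = node.nextSibling;
    i++;
  }
  return i;
}

// A chain of three nodes to walk.
const third: ListNode = { nextSibling: null };
const second: ListNode = { nextSibling: third };
const firstNode: ListNode = { nextSibling: second };
```

Both functions type-check in TypeScript with strictNullChecks enabled, which is the flow-based narrowing this issue asks AssemblyScript to match.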

trusktr commented 3 years ago

See this example in Hegel for example

Yeah, Hegel is really nice in this regard. @JSMonk Can we convince you to join AssemblyScript efforts? :smiley:

willemneal commented 3 years ago

Or just replace anything that would return undefined with null and it would work in most cases. For example, map.get('key that does not exist') would return null instead of throwing a runtime error.

Except for primitives.

trusktr commented 3 years ago

I forgot about that. That would be inconsistent (e.g. a Map with primitive values throws while a Map with object values returns null), but maybe it would be ok?

Would making primitives nullable incur runtime overhead (wrapping the actual primitives in nullable wrappers)? Or maybe it wouldn't have overhead if the type checker gains nullable awareness like Hegel's?

MaxGraey commented 3 years ago

See this example in Hegel for example

Yeah, Hegel is really nice in this regard. @JSMonk Can we convince you to join AssemblyScript efforts? 😃

Yes, we have been in the process of discussing the collaboration with @JSMonk for some time now. Stay tuned!

willemneal commented 3 years ago

If a primitive doesn't exist in a map then the type checker can't help, since it can't be known until runtime. What I'm suggesting with the multivalue would solve it. You would get back something like (value, isNull), which would allow all types to be nullable.
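As a rough sketch of that shape in TypeScript, a tuple can stand in for a multi-value return. The lookup function below is hypothetical and only illustrates the (value, isNull) idea; it is not an AssemblyScript API, and the placeholder value 0 is an assumption of this sketch.

```typescript
// Hypothetical lookup returning a (value, isNull) pair instead of null.
// The tuple stands in for a WebAssembly multi-value return.
function lookup(map: Map<string, number>, key: string): [number, boolean] {
  const found = map.get(key);
  if (found === undefined) {
    return [0, true]; // placeholder value; callers must inspect isNull first
  }
  return [found, false];
}

const scores = new Map<string, number>([["a", 0]]);

const [hit, hitIsNull] = lookup(scores, "a");   // stored value 0, isNull = false
const [miss, missIsNull] = lookup(scores, "b"); // placeholder 0, isNull = true
```

The flag is what distinguishes a stored 0 from a missing entry, which a primitive return value alone cannot express.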

MaxGraey commented 3 years ago

Regarding exceptions vs. the Result monad: when you need to return several different error types from one function, Result is really uncomfortable, being either non-performant (Box<dyn error::Error>) or verbose (a lot of boilerplate): https://doc.rust-lang.org/rust-by-example/error/multiple_error_types.html

willemneal commented 3 years ago

?. in TS only evaluates to undefined; it does not return early.

function foo(val?: any): string | undefined {
  let x = val?.func();
  return x?.charAt(0);
}

If val is undefined or null, then x becomes undefined.

Whereas if it were used as in Rust, the function

function foo(val: Option<any>): Result<string, Error> {
  let x = val?.func();
  return x.charAt(0);
}

would return with the error if val was None.

So the semantics compared to TS are different. And I understand the difficulties in multiple error types, but it still seems the quickest path we have for error handling.
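The two behaviors can be compared side by side in plain TypeScript. This is a sketch under stated assumptions: the Result type and the functions firstChar, firstCharOptional and firstCharUpper are hypothetical, and the early return that Rust's ? performs implicitly is written out by hand.

```typescript
// Hypothetical Result type modelled as a discriminated union.
type Result<T, E> = { ok: true; value: T } | { ok: false; error: E };

function firstChar(s: string | null): Result<string, Error> {
  if (s === null) {
    return { ok: false, error: new Error("value is null") };
  }
  return { ok: true, value: s.charAt(0) };
}

// TS-style `?.`: the expression evaluates to undefined and execution continues.
function firstCharOptional(s: string | null): string | undefined {
  const c = s?.charAt(0);
  return c;
}

// Rust-style `?`, spelled out: on error, return from the function early.
function firstCharUpper(s: string | null): Result<string, Error> {
  const r = firstChar(s);
  if (r.ok === false) {
    return r; // this is the step Rust's `?` would perform implicitly
  }
  return { ok: true, value: r.value.toUpperCase() };
}
```

Without language support, every fallible call needs the explicit check shown in firstCharUpper, which is the verbosity concern raised earlier in the thread.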

Perhaps the topic of error handling deserves a new issue or discussion.

dcodeIO commented 3 years ago

May I suggest opening a new issue, with concrete actionable items in its description, and a clear title?

Perhaps the topic of error handling deserves a new issue or discussion.

No, it needs exception handling, not Rust primitives.