asoffer / Icarus

An experimental general-purpose programming language
Apache License 2.0
9 stars 2 forks source link

Values yielded from scopes are not type-checked correctly #84

Open perimosocordiae opened 2 years ago

perimosocordiae commented 2 years ago

Minimal example:

-- ::= import "core.ic"
for (0, 5 as u32) do [i: i64] {}

Error message (when built with -c dbg):

[140270949840712 ir/builder.cc:382 ApplyImplicitCasts] Unreachable code-path.
(u32, casting implicitly to, [](i64))
*** SIGABRT received at time=1638412917 on cpu 7 ***
PC: @     0x7f93602d2808  (unknown)  pthread_kill
    @          0x192c980         64  absl::WriteFailureInfo()
    @          0x192c664        224  absl::AbslFailureSignalHandler()
    @     0x7f936027e520  (unknown)  (unknown)
Aborted (core dumped)

It looks like we're trying to cast the u32 value to a slice of i64 for some reason? Very strange.

Also: any ideas why the stacktrace provided by absl is so awful here?

perimosocordiae commented 2 years ago

On the other hand, this code works when I didn't expect it to:

-- ::= import "core.ic"
io ::= import "io.ic"

for (125, 130) do [i: i8] {
  io.Print(i, !\n)
}

Output:

125
126
127
-128
-127
perimosocordiae commented 2 years ago

One more fun error with scope shenanigans:

-- ::= import "core.ic"
for (0, 5) do [i: *i8] {}

Error message:

[140398500611912 compiler/emit/scope_node.cc:231 EmitToBuffer] Not yet implemented.
(*(i8))
*** SIGABRT received at time=1638413290 on cpu 1 ***
PC: @     0x7fb112cb7808  (unknown)  pthread_kill
    @          0x192c980         64  absl::WriteFailureInfo()
    @          0x192c664        224  absl::AbslFailureSignalHandler()
    @     0x7fb112c63520  (unknown)  (unknown)
Aborted (core dumped)
asoffer commented 2 years ago

Thanks for finding these. The crashes are obviously bugs, but the non-crash is too. The cast of the integer 130 to i8 should be a compile-time error. As currently implemented, the range type backing the for scope holds two i64s. We shouldn't allow the do block to operate on anything other than an i64. The casts here are almost certainly invoking UB.

perimosocordiae commented 2 years ago

One more for the pile. File name is "repro.ic":

file ::= import "file.ic"
file.With("repro.ic") open [f: *file.File] {}

Error message:

[140543188440904 compiler/emit/scope_node.cc:231 EmitToBuffer] Not yet implemented.
(*(struct.47190064))
*** SIGABRT received at time=1638551694 on cpu 3 ***
PC: @     0x7fd2c2dc8808  (unknown)  pthread_kill
    @          0x1933800         64  absl::WriteFailureInfo()
    @          0x19334e4        224  absl::AbslFailureSignalHandler()
    @     0x7fd2c2d74520  (unknown)  (unknown)

The correct scope state is a plain file.File, not a pointer to a File. This should be caught at type verification time, similar to the i8 example above.

perimosocordiae commented 2 years ago

Update now that scopes have been overhauled. The first code snippet produces a nice error message:

Error:
Expression cannot be called with the given arguments.

  2 | for (0, 5 as u32) do [i: i64] {}
  * scope (low: i64, high: i64) -- Parameter at index 1 cannot accept an argument of type `u32`

The second snippet (where we use i8 as the yielded value) still has the same behavior where the underlying i64 gets unsafely casted.

The third and fourth snippets (with [i: *i8] and [f: *file.File]) now run without any visible errors. This is really strange. I'm going to rename this issue to reflect the current state of things a bit better.

asoffer commented 2 years ago

The i8 casting behavior is still a bug. The pointer snippets are technically correct right now, because we allow implicit casts from T to *T. This feels strange when thinking of *T as a C/C++-like pointer, but if we think of them more like references, this is entirely reasonable. It's just "pass by reference" semantics.

The more I explore scopes, the more I'm realizing that they're a fancy way to pass closures to another function (they're actually a bit more powerful than that). And so it's reasonable to think of a block itself as some sort of callable that gets passed into a scope and then invoked by the scope. The invoking scope doesn't need to care about the precise types of the parameters but only that the arguments it passes can bind. This allows us to do things like having generic blocks or having multiple blocks with the same name but different parameters (effectively an overload set). Imagine a variant visitation API:

visit (my_variant) as [n: i64] { io.Print("Found an integer") }
                   as [x: f64] { io.Print("Found a floating-point number") }
                   as [x: ~`T] { io.Print("Found something else") }

All said, I think this would feel a lot more familiar if instead of *, we used &. I don't know whether or not we should actually do that, but if you imagine reading it as for (0, 10) do [n: &i64] { ... }, this will probably feel a lot more reasonable.

perimosocordiae commented 2 years ago

I suppose the larger design question is whether we want to distinguish between nullable and non-nullable references. It does feel a bit strange to use *T for both.