asoffer / Icarus

An experimental general-purpose programming language
Apache License 2.0

Overloaded indexing doesn't support mutation #98

Open perimosocordiae opened 2 years ago

perimosocordiae commented 2 years ago

Now that we can use __index__ for user types, it seems natural to support indexed assignment as well.

foo ::= struct {
  x: i64
}

__index__ ::= (self: *foo, idx: i64) => self.x

f := foo.{ x = 3 }
f[0]
f[0] = 1

Results in the following error:

Error in repro.ic:
Assigning to a non-reference expression:

  10 | f[0] = 1

[139632588666696 compiler/emit/index.cc:112 EmitRef] Unreachable code-path.
(struct.57625264)
*** SIGABRT received at time=1639544198 on cpu 0 ***
PC: @     0x7efebee2a808  (unknown)  pthread_kill
    @          0x1923ef0         64  absl::WriteFailureInfo()
    @          0x1923bd4        224  absl::AbslFailureSignalHandler()
    @     0x7efebedd6520  (unknown)  (unknown)
Aborted (core dumped)

I also tried changing __index__ to return a pointer to i64, but that resulted in one additional error along with the results above:

Error in repro.ic:
Cannot assign a value of type `integer` to a reference of type `*(i64)`:

  10 | f[0] = 1

asoffer commented 2 years ago

Agreed. The underlying issue is that the language doesn't have references, only pointers.

asoffer commented 2 years ago

I wonder if it's as simple as this: Assignments take a left-hand side of type *T and right-hand side of type T.

Because we already have implicit conversions from T to *T, for primitive types, something like:

n: i64
n = 3

would be correctly interpreted as implicitly casting the left-hand side to the address of n and assigning 3 into that address.

When I describe this to myself in terms of pointers, it feels very strange. But rethinking it in terms of references (here I'm using the C++ terminology of "pointers" and "references"), it seems to make a lot more sense. Effectively, in C++ a user-defined operator= is a function accepting a pointer/reference to T and a T, and we rely on the implicit conversion from T to reference-to-T (when the value category matches).
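For illustration, the proposed rule can be sketched in C++. This is only a model of the idea, not Icarus's implementation: `assign` and `assign_demo` are hypothetical names, and the explicit `&n` stands in for the implicit T-to-*T conversion the compiler would insert.

```cpp
#include <cassert>

// Hypothetical model of the proposed rule: assignment is a function whose
// left-hand side has type *T and whose right-hand side has type T.
template <typename T>
void assign(T* lhs, T rhs) {
  *lhs = rhs;
}

// Models `n = 3`. The explicit `&n` plays the role of the implicit
// conversion from T to *T described above.
inline long long assign_demo() {
  long long n = 0;
  assign(&n, 3LL);
  assert(n == 3);
  return n;
}
```

Under this reading, a user-defined indexed assignment would only need `__index__` to produce something convertible to `*T`.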

I'm not sure if there's something about the syntax we're using for pointers that makes this feel weirder, or it's just my bias coming from C++. I'm also not entirely sure this approach is sound, though I haven't found any holes in it yet.

Thoughts?

wrhall commented 2 years ago

What happens if the lhs is const or doesn't have a reference type that can be assigned to? (What are examples of such a thing?)

What happens if you try to take a reference of an expression?

E.g.:

2+2 = 5

asoffer commented 2 years ago

The implicit cast to reference works the same as C++: the compiler tracks the value category and the cast is only allowed for lvalues.
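As a rough C++ analogy (a sketch, not Icarus code), the rule can be checked at compile time. `Assign` and `lvalue_demo` are hypothetical names; the point is that a non-const lvalue reference parameter accepts a named variable but rejects a temporary such as the result of `2+2`.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical assignment functor whose left-hand side must be an lvalue.
struct Assign {
  void operator()(int& lhs, int rhs) const { lhs = rhs; }
};

// A named variable (lvalue) can bind to the reference parameter...
static_assert(std::is_invocable_v<Assign, int&, int>);
// ...but a temporary, such as the value of `2+2`, cannot.
static_assert(!std::is_invocable_v<Assign, int, int>);

inline int lvalue_demo() {
  int n = 0;
  Assign{}(n, 5);  // fine: n is an lvalue
  assert(n == 5);
  return n;
}
```

So `2+2 = 5` would be rejected before any assignment logic runs, because the left-hand side never becomes a reference.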

There is still an issue where we want references to implicitly dereference so that c := s[0] copies the string's first character. Or maybe that doesn't always happen? C++ always dereferences. I believe Rust doesn't. Rust's approach is certainly more confusing at first glance (the rules are objectively more complex), but the borrow checker alleviates this. So maybe in c := s[0], c is a reference to a character, but the borrow checker will let you know if you accidentally assign through c.
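To make the C++ side of that comparison concrete, here is a small sketch (the function name `copy_vs_reference_demo` is made up for illustration). With plain `auto`, the indexed character is copied; the reference is kept only when it is requested explicitly.

```cpp
#include <cassert>
#include <string>

inline std::string copy_vs_reference_demo() {
  std::string s = "abc";

  auto c = s[0];   // deduces char: the reference is dereferenced and copied
  c = 'x';         // modifies only the copy
  assert(s == "abc");

  auto& r = s[0];  // explicitly keeps the reference
  r = 'y';         // assigns through to the string
  assert(s == "ybc");

  return s;
}
```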

Also strange: with this approach, pointers/references can be rebound via &p = new_value.

perimosocordiae commented 2 years ago

Ideally we'd have a way to trigger user-defined logic when index-assigning. The example that comes to mind is a sparse matrix type, where you might need to do some non-trivial work on assignment:

__index__ ::= (self: *sparse_array, idx: i64) -> f64 {
  if (self'in_sparsity_pattern(idx)) {
    return self'explicit_value(idx)
  }
  return 0
}

__index_assign__ ::= (self: *sparse_array, idx: i64, val: f64) -> () {
  if (self'in_sparsity_pattern(idx)) {
    self'set_explicit_value(idx, val)
  } else {
    self'update_sparsity(idx, val)
  }
}
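For comparison, C++ has no separate index-assign operator, so the usual way to get the same effect is a proxy object returned from `operator[]`. The sketch below is an analogy only: `SparseArray`, `Proxy`, `get`, `set`, and `sparse_demo` are hypothetical names, and the sparsity pattern is simplified to "whatever keys are present in a map".

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Hypothetical sparse array: entries that are not stored read as 0.
class SparseArray {
 public:
  // Proxy returned by operator[]; it routes reads and writes to
  // separate user-defined logic.
  class Proxy {
   public:
    Proxy(SparseArray& array, std::int64_t idx) : array_(array), idx_(idx) {}
    // Read path, analogous to __index__.
    operator double() const { return array_.get(idx_); }
    // Write path, analogous to __index_assign__.
    Proxy& operator=(double val) {
      array_.set(idx_, val);
      return *this;
    }

   private:
    SparseArray& array_;
    std::int64_t idx_;
  };

  Proxy operator[](std::int64_t idx) { return Proxy(*this, idx); }

  double get(std::int64_t idx) const {
    auto it = values_.find(idx);
    return it == values_.end() ? 0.0 : it->second;
  }

  // Simplification: storing a value also extends the sparsity pattern.
  void set(std::int64_t idx, double val) { values_[idx] = val; }

 private:
  std::map<std::int64_t, double> values_;
};

inline double sparse_demo() {
  SparseArray a;
  a[5] = 2.5;             // runs the write path
  double missing = a[3];  // runs the read path; nothing stored at 3
  double stored = a[5];
  assert(missing == 0.0);
  return missing + stored;
}
```

The proxy works, but it is more machinery than a dedicated `__index_assign__` hook would need, which is one argument in favor of the separate function proposed above.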