jfecher / ante

A safe, easy systems language
http://antelang.org
MIT License
1.9k stars 79 forks source link

Pointer projections #106

Closed skaller closed 2 years ago

skaller commented 2 years ago

If I understand correctly from an extremely brief reading, if x is a struct with field f then f x = x.f so that f acts as a value projection. If you have a pointer px to x then f px = px.f also fetches the value, and I guess similarly to C++ operator-> this applies recursively. But you also need what I call pointer projections so that px &. f is a pointer to f (for some operator &.) and now you lose the invariant f x = x.f. In Felix, that invariant is enforced globally in the parser, and px.f = f px is a pointer to the f component. This obeys the commutative diagram you expect, so deref (px.f) = (deref px).f noting that the two f are distinct overloads: the first is a pointer projection and the second a value projection (and I left out the other sides of the square which do address taking). The real value of this is that since projections are functions, they can be composed, so you can drill down into nested products using pointer projections and deref at the end, which is very much more efficient than successive value projections if your product is addressable.

jfecher commented 2 years ago

Ante already has what you call pointer projections, the operator is .& versus the normal .. Given foo: ref { bar: Bar, .. } then we have foo.bar : Bar and foo.&bar : ref Bar.

This is subject to change in the future though. Perhaps it should be the case . is polymorphic to support the syntax familiar to most users foo.bar := foo2.bar without .&.

skaller commented 2 years ago

Sometimes you have to be regular rather than familiar, since familiar things may come from old languages that are broken. In Felix the invariant f x == x . f is enforced by the parser; that is, . is not an operator at all. It's just syntactic sugar. This has interesting consequences. For a tuples: (a,b) . 0 is sugar for applying the 0'th projection so is, unsurprisingly, the value a. But since . is just sugar you can write this like 0 (a,b) and it works too. The actual projection is proj 0 of (A * B) where a:A and b:B, which is a first class function A * B -> A.

I have an intrinsic _deref and a library function deref and the parser maps unary * to deref. Also, there is no assignment operation. If you write a = b; it desugars to &a <- b which desugars to storeat (&a, b) which is defined in the library to call intrinsic _storeat. So there are no lvalues (except variables). In particular a . field = 1; is an error. You have to write &a . field <- 1;. Not familiar .. too bad. Eliminating lvalues is a major simplification.

jfecher commented 2 years ago

What is the proposal to change here?

Ante doesn't currently have lvalues either, the lhs of := must be a ref type to be assigned to.

That being said, lvalues exist for a reason and can help usability. Any minor bump removed for learning ante is important considering how many foreign concepts it will have to most users. I don't think eliminating them is a major simplification as they are orthogonal to most other features and aren't too complex to begin with. It is for these reasons I'm not ruling out implementing them one day.

skaller commented 2 years ago

Well the only thing would be to make infix dot (.) a universal reverse application. I use px *. field as sugar for (*px) . field so you can get a value from a a pointer to a product with field field. And x &. field would then be a pointer to the field of object x, which has to be a variable (reference in your case).

So this is just an idea, it's just notation.

jfecher commented 2 years ago

Ah - related: I left out another part of . is that it automatically dereferences as well, like it does in rust via the Deref trait (though no such trait exists in ante currently at least). So in foo.bar foo can be a Foo, or a ref Foo. It is not recursive yet but could be in the future.

skaller commented 2 years ago

On reference types: I do not think these work. The rule for a monokinded type system is that type constructors TYPE->TYPE must be functors. If you can write a := b where a,b are reference types, you already have an lvalue/rvalue distinction which is fine if it's syntactic sugar as in C, but makes your type system unsound if it's like references in C++.

I think you can fix this with a kinding system, but I'm not sure: lvalue/rvalue distinction, if intrinsic in your type system, is a kinding distinction. Otherwise, parametric polymorphism can't work. If T is a type, then T* is a distinct type. You have both introduction (by address taking) and elimination (by dereferencing) you can get from T* back to T.

If you write ref T for a reference to T, what does ref (ref T) mean?

jfecher commented 2 years ago

ref (ref t) is a reference to a reference to t

skaller commented 2 years ago

Ok, so that's parametric. So what is the rule for a := b? Do you have two cases like: a: ref t, b:t or a:ref t, b: ref t?

jfecher commented 2 years ago

Right now := only has one type: ref a -> a -> unit. (1) The lhs may have sugar in the future like in rust, but the rhs should never have split cases regarding ref t or t.

(1) This exact type is temporary as once I implement effects this will likely use an IO effect or special mutation effect to mark it as impure.

skaller commented 2 years ago

Argghh. Sorry. Just found a bit in the manual that says you have to deref a reference. In other words what you call a reference is what everyone else calls a pointer. Your refs aren't references at all, they're pointers.

jfecher commented 2 years ago

These are still references. It sounds like you're not familiar with ocaml or sml which use this system as well. Rust is another such lang with a similar system where references are non-null pointers with lifetime tracking. C++ sticks out as a language where references are sugar that will refer to the value rather than reference depending on the context when you use a variable. I wouldn't say this is the standard though.

skaller commented 2 years ago

Ocaml doesn't have references like SML, it has mutable record fields, references are derived from them. But the situation there is quite different because Ocaml boxes everything. C++ references act as you say but they're not sugar, they're actual types. My confusion is because you use the word reference and also pointer. Also, another language I looked at had C++ kind of references. My compiler is written in Ocaml. I don't know Rust or Haskell well enough to write programs in either but I know roughly how Rust tracks lifetimes of borrowed references (and people find it confusing). My email is at internode dot on dot net drop me a line so we can chat if you like.