basilTeam / basil

Fast and flexible language exploring partial evaluation, context-sensitive parsing, and metaprogramming. Compiles JIT or AOT to native code.
BSD 3-Clause "New" or "Revised" License

Add raw pointers and pointer types. #30

Open elucent opened 3 years ago

elucent commented 3 years ago

Basil compiles directly to native code, so it should be relatively straightforward to support unsafe pointer types and pointer arithmetic, at least from a codegen perspective. While the primary means of passing around reference types should be through garbage-collected pointers, supporting raw pointers would allow Basil to more easily express low-level function prototypes - useful for interacting with foreign functions.

I propose we introduce a new primitive type kind, "Pointer", that has a single parameter - the type it points to. We'll tentatively denote the type of a pointer to some type T as T ptr. Pointer types can be coerced generically to pointer types that point to a generic type like Any or a type variable. Besides this, pointer types support no other implicit coercions.
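To make the coercion rule above concrete, here is a minimal sketch of how a type checker might model it. It is written in Python purely for illustration and is not Basil's actual implementation: the class names (`Named`, `TypeVar`, `Pointer`) and the `can_coerce` helper are assumptions made up for this example, and coercions between non-pointer types are deliberately left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Named:
    """A concrete named type such as Int or Any (hypothetical model)."""
    name: str


@dataclass(frozen=True)
class TypeVar:
    """An unresolved type variable (hypothetical model)."""
    name: str


@dataclass(frozen=True)
class Pointer:
    """The proposed primitive kind: a single parameter, the pointee type."""
    pointee: object


ANY = Named("Any")
INT = Named("Int")
FLOAT = Named("Float")


def is_generic(t):
    """True for the generic targets named in the proposal: Any or a type variable."""
    return t == ANY or isinstance(t, TypeVar)


def can_coerce(src, dst):
    """Implicit coercion check covering only the pointer rule.

    Identical types always match. A pointer may coerce to a pointer whose
    pointee is generic. Nothing else is allowed; in particular this sketch
    does not look through nested pointers.
    """
    if src == dst:
        return True
    if isinstance(src, Pointer) and isinstance(dst, Pointer):
        return is_generic(dst.pointee)
    return False


int_ptr = Pointer(INT)
print(can_coerce(int_ptr, Pointer(ANY)))           # Int ptr -> Any ptr
print(can_coerce(int_ptr, Pointer(TypeVar("T"))))  # Int ptr -> T ptr
print(can_coerce(int_ptr, Pointer(FLOAT)))         # Int ptr -> Float ptr
```

Under this reading, Int ptr coerces to Any ptr or to a pointer to a type variable, while Int ptr to Float ptr is rejected.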

Pointers should, at minimum, support a few primary operations:

A few open questions:

dumblob commented 3 years ago

Generally, references (i.e. disguised pointers with more restricted semantics) are a necessity for a general-purpose language. I think raw pointers (i.e. the unrestricted semantics: pointer arithmetic etc.) should definitely be disallowed by default, as Rust, V, and other modern languages do, and only allowed in some sort of unsafe { } block or other visually explicit marker.

A few open questions:

  • Should we introduce new syntax for dereference and address-of operations? We could replicate C-style &val and *ptr syntax with a few new tokens. One less-invasive alternative would be Zig-style dot syntax: val.& and ptr.* would be easily expressed in the current Basil semantics.

By default (say, outside of unsafe { } blocks) a reference (pointer) should be fully indistinguishable from a non-reference (non-pointer) value. Many languages, both newer and older, have shown that marking every use explicitly is unnecessary, because safe built-in statements and operations behave on the surface the same as they do with non-reference values. Making it explicit that we want a reference in just one place (e.g. in the function argument definition) seems to be more than enough.

  • Perhaps we could add some easier pointer arithmetic instructions than converting to and from Int? Maybe it could be type-based: ptr + Int could add the size of an Int to the address contained in ptr.

Yep, why not. But only in the unsafe { } block. Otherwise compile-time error :wink:.
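One reading of the type-based arithmetic suggested above is C-style scaling: adding n to a pointer advances the address by n times the size of the pointee. The sketch below illustrates that reading in Python using the standard ctypes module. The `offset_address` helper is a made-up name for this example, and none of this reflects how Basil would actually implement it.

```python
import ctypes


def offset_address(address, pointee_ctype, count):
    """Typed pointer addition: advance by `count` elements, not `count` bytes."""
    return address + count * ctypes.sizeof(pointee_ctype)


# Four 64-bit integers laid out contiguously, standing in for an Int array.
buf = (ctypes.c_int64 * 4)(10, 20, 30, 40)
base = ctypes.addressof(buf)

# "ptr + 2" under the scaled reading: two elements, i.e. 16 bytes, past base.
third = offset_address(base, ctypes.c_int64, 2)
value = ctypes.c_int64.from_address(third).value
print(third - base, value)  # 16 30
```

The alternative is to keep arithmetic byte-based and require explicit conversion to and from Int, which is more verbose but leaves the size computation to the user.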

elucent commented 3 years ago

I'm kind of morally opposed to unsafe as a language feature - if we add a perfectly functional feature that is often the best solution to a problem, why actively discourage its use? I don't think it fits Basil's theme of flexibility to strike down useful features as "undesirable"...and deal with that in no way other than to make that feature more annoying. It's a little more justifiable in Rust due to their static analysis, but Basil is garbage collected! So we don't need to limit ourselves in order to get memory-safe allocations.

If it wasn't clear though, the predominant approach towards reference semantics and memory management will be through safe, garbage-collected reference types - I've created a new issue for those, and the intent is that they'll be the recommended kind of pointer type for most workloads.

dumblob commented 3 years ago

Ah, ok. This reminds me of Nim's references.

Now it's clear that it should be easy to tell from the source code whether we're dealing with "dangerous pointers" or "safe pointers". That could be enough for me, since a linter or some compiler option (akin to -Werror -Wall -Wextra or perhaps -Wpointer) could be made to fail compilation of sources that use raw pointers :wink:.
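As a rough illustration of the linter idea, the sketch below walks a set of declared types and reports any that contain a raw pointer, optionally turning that into a hard failure. Everything here is hypothetical: the `Pointer` and `Ref` classes, the `check_declarations` function, and the `forbid_raw_pointers` option are invented for the example and do not correspond to any existing Basil tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Named:
    """A plain named type such as Int (hypothetical model)."""
    name: str


@dataclass(frozen=True)
class Pointer:
    """A raw, unsafe pointer type, written "T ptr" in the proposal."""
    pointee: object


@dataclass(frozen=True)
class Ref:
    """A safe, garbage-collected reference type (hypothetical model)."""
    target: object


def contains_raw_pointer(t):
    """True if a raw pointer appears anywhere inside the type."""
    if isinstance(t, Pointer):
        return True
    if isinstance(t, Ref):
        return contains_raw_pointer(t.target)
    return False


def check_declarations(decls, forbid_raw_pointers=False):
    """Return names whose types use raw pointers; raise if they are forbidden."""
    offenders = [name for name, t in decls.items() if contains_raw_pointer(t)]
    if forbid_raw_pointers and offenders:
        raise ValueError("raw pointers used in: " + ", ".join(offenders))
    return offenders


decls = {
    "buf": Pointer(Named("Int")),   # raw pointer: flagged
    "node": Ref(Named("Node")),     # safe reference: allowed
    "count": Named("Int"),          # plain value: allowed
}
print(check_declarations(decls))    # ['buf']
```

Because raw pointers and safe references are distinct type kinds, the check needs no extra annotations in the source; it only inspects the types.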