Whiley / RFCs

Request for Comment (RFC) proposals for substantial changes to the Whiley language.
3 stars 2 forks source link

Statically Sized Types Addendum #100

Open DavePearce opened 2 years ago

DavePearce commented 2 years ago

(see #99 , #98 , #97, #75)

This basically wraps up a bunch of outstanding issues surrounding statically sized types in Whiley, and should result in an update to RFC#0003. Some terminology:

This semantic is actually largely implemented in Whiley already, though it needs clarification around the terminology. There are also some interesting cases to consider. For example:

string str = "hello world"

Should this compile? Certainly it currently does. However, if we assume that "hello world" is statically allocated then its natural type is &string. Still, its a constant so perhaps we can do the type conversion based on the target type? That means the above compiles, as does this:

&string str = "hello world"

The difference is that the former allocates a new string on the heap, whilst the latter simply provides a reference to the static string. In Rust terminology, &string corresponds to &str whilst string on its own corresponds with String. The only real difference is that we're missing lifetimes. The intention is that lifetimes would be handled at verification time.

Pointer Size

Many languages (e.g. Rust) support a special type usize, however I propose not to support that in Whiley. I believe it causes more problems than it solves. Instead, we will modify the syntax for array types from T[] to T[U] (e.g. int32[uint64]). The type within the braces clarifies the type used to store the length of the array. We could pick a suitable default for types where this is not specified (i.e. uint64), though that may not be necessary in the first instance.

Type Bounds

The following should not compile under this scheme:

method m<T>(&T p, T x):
    *p = x

The reason is that we cannot be sure T is statically sized. Like Rust, we probably want a way to express that T must be statically sized. A similar issue is around methods which operate on int-like things. I'm not sure what's a good example here, but we could have indexOf<T,U>(T[U] items, T item) -> (U|null r) where we want to specify that U is an (unsigned) integral type.

Foreign Function Interface

Another important question is how types in C are represented. For example, does int32 * correspond with int32[?] or &(int32[?]). I think logically, its the latter. In particular, int32[?] is kind of a useless type (see #75). We cannot copy it, and I don't think we could even assign to whilst preserving value semantics.

Lifetimes

An important question is how lifetimes are managed. Clearly, I don't want to have lifetimes in the language proper, and rather want them deferred to verification. Therefore, we assume the verifier is equipped with logic to reason about lifetimes. Consider this:

method m() -> &int32:
   int32 x = 1
   return &x

This would fail verification because the verifier can easily tell there is a problem. Here's a harder case:

method n(&int32 p, &int32 q) -> (&int32 r):
    return p

method m(&int32 px) -> &int32:
   int32 y = 1
   return n(px,&y)

This is actually safe, but will fail verification. That's reasonable, since the specification of n() allows it to return q (even though it doesn't right now). We can fix this by strengthening our specification:

method n(&int32 p, &int32 q) -> (&int32 r)
ensures r == p:
    return p

Now, the verifier has enough information to see that this is safe.