Whiley / WhileyCompiler

The Whiley Compiler (WyC)
http://whiley.org
Apache License 2.0

Support for Statically Sized Types #707

Closed DavePearce closed 7 years ago

DavePearce commented 7 years ago

(related to #654)

Introduction

An important consideration when compiling Whiley programs for different platforms is that of dynamically versus statically sized types. Roughly speaking, the former correspond to those which require dynamically allocated memory, whilst the latter correspond to those which can be entirely allocated on the stack (for example). In particular, statically sized types have a fixed size at compile time.

At the moment, the definition of what constitutes a dynamically sized type versus a statically sized type is somewhat unclear. Essentially, any type which contains one of the following is dynamically sized (each case is illustrated below):

  1. Unbounded Integers. Any type containing an int which is not specifically bounded (above and below) is dynamically sized.
  2. Unbounded Arrays. Any type containing an array whose length is not bounded by a compile-time constant is dynamically sized.
  3. Open Records. Any use of an open record is dynamically sized.
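
For example, each of the following is dynamically sized (a sketch; the type names are purely illustrative):

// 1. Contains an unbounded integer: int itself has no fixed bit width.
type Counter is { int count }

// 2. Contains an unbounded array: the length of items is not bounded by any compile-time constant.
type Buffer is { int[] items }

// 3. An open record: values may carry arbitrarily many additional fields.
type OpenPoint is { int x, int y, ... }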

In contrast, here are some statically sized examples:

type Point is { int x, int y } where 0 <= x && x <= 255 && 0 <= y && y <= 255
type Vec is { int len, int[] data } where 0 <= len && len <= |data| && |data| <= 100

The latter is interesting because the array itself is still variable-length, but the record as a whole is known to occupy a fixed amount of space in memory.

Proposal

The primary issue with the current status quo is that the definition of a statically sized type is somewhat awkward. To address this issue, the proposal here is to introduce specific statically sized integer types, written int:n and uint:n for signed and unsigned integers representable in n bits, along with a platform-dependent size type usize.

Here, the type uint:8 would be the generic definition of a byte. Likewise, a type such as int:8[?] is useful for describing (for example) raw pointer types in C, as this corresponds directly to int8_t*.
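
To make the intended semantics concrete: for verification purposes the fixed-width forms behave just like today's constrained integers (a sketch; the type names are illustrative only):

// Equivalent of the proposed uint:8 (unsigned, 8 bits).
type u8_like is (int x) where 0 <= x && x <= 255

// Equivalent of the proposed int:8 (signed, 8 bits, two's complement).
type i8_like is (int x) where -128 <= x && x <= 127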

Subtyping

An important issue is how subtyping between such types will be supported. In fact, the proposal is that, from the perspective of subtyping, all integer types are equivalent to int. This leads to some potentially precarious situations:

function f(uint:8 x) -> (uint:4 r):
    return x

This program appears to suffer from a potential loss of precision. In many languages (e.g. Java, Rust, etc), such implicit coercions are not permitted for this reason. However, in Whiley, we can note that the above is not a concern because it will not verify. Indeed, it is roughly equivalent to this:

function f(int x) -> (int r)
requires 0 <= x && x <= 255
ensures 0 <= r && r <= 15:
    return x

This program will certainly fail verification. However, we could resolve this as follows:

function f(uint:8 x) -> (uint:4 r)
requires x <= 15:
    return x

In this case the program would now verify and, indeed, we see there is no loss of precision.
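
For example, a call site would then have to establish that precondition (a sketch using the proposed syntax; g is a hypothetical caller):

// The call verifies because 9 satisfies uint:8 and the precondition x <= 15
// of f above, so the result also satisfies uint:4.
function g() -> (uint:4 r):
    return f(9)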

Challenges

Platform-specific size types. An interesting challenge is the interaction between the type usize and other integer types. The general concern is that certain expressions (e.g. arr[i]) will not verify on some platforms (e.g. if i has type int:64 on a platform where usize is uint:32). However, it seems likely that this won't be such a big problem in general. Consider the example in more detail:

function get(int:8[] arr, int:64 i) -> (int r):
    return arr[i]

This exhibits the general problem. However, the fact is that it won't verify either. In order for it to verify, we need a precondition such as 0 <= i && i < |arr|, in which case there is no problem.
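
For instance, adding that precondition gives a version which should verify (a sketch using the proposed syntax):

function get(int:8[] arr, int:64 i) -> (int r)
requires 0 <= i && i < |arr|:
    return arr[i]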

Bitwise Operations. An interesting additional question is whether or not statically sized integer types should support bitwise operations. This would be convenient, since it would mean that uint:8 can replace byte. But the problem is that this presumably leads to undefined behaviour for larger types (e.g. uint:32) due to uncertainty over endianness? Java handles this by treating bitwise operators as operators on integers, not bits.
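
As a sketch of what that might look like if uint:8 did carry the bitwise operators (hypothetical; in current Whiley only byte supports them, and the function is illustrative only):

// Masking off the low nibble; the result is always in 0..15, so uint:4 holds.
function lowNibble(uint:8 b) -> (uint:4 r):
    return b & 15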

bambinella commented 7 years ago

I feel that maybe it should be sufficient to have a regular "low < x and x < high" constraint on the type and only provide backend hints for representation?

That would be number of bytes, memory alignment, and byte order. The backend should know what the FFI calls require and change representation based on performance characteristics. So it could be left open for optimization.
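
In other words, something like this sketch, where the representation is left entirely to the backend (the type name and bound are just illustrative):

// An ordinary constrained integer; width, alignment and byte order would be
// backend decisions, possibly guided by separate hints.
type u32 is (int x) where 0 <= x && x <= 4294967295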

Unless you intend to extend Whiley with the equivalent of reinterpret_cast, which probably would be a bad idea.

Although I guess aliasing from pointers returned from FFI calls is an issue.

DavePearce commented 7 years ago

I feel that maybe it should be sufficient to have a regular "low < x and x < high" constraint on the type and only provide backend hints for representation?

Whilst that was my initial plan, I've eventually decided I want something more "clear cut". I don't want people to be asking the question "will this be compiled down to a u32?". I want them to know exactly when it will and when it will not.

Unless you intend to extend Whiley with the equivalent of reinterpret_cast, which probably would be a bad idea.

This won't be a problem. The compiler will handle all coercions seamlessly as necessary.

bambinella commented 7 years ago

Whilst that was my initial plan, I've eventually decided I want something more "clear cut". I don't want people to be asking the question "will this be compiled down to a u32?". I want them to know exactly when it will and when it will not.

Ok, well I disagree, as I don't think it is semantic and it is therefore more a hint than a language feature. I also think it should be something parametric, e.g. "uint<32>", so a library type, where u32 is just an alias for uint<32>.

While having u32, u8, etc. might make sense when you look at a single CPU, it doesn't add up when you look at many. For instance, some older DSPs have 32 bit alignment and 24 bit representation. So a more general approach that works across the spectrum would be better.

I have lots of notes on ways to approach this, but too much and incoherent to put as a comment on github...

But you also need to add alignment requirements to types in order to support SIMD. Which is a bit tricky in terms of syntax.

DavePearce commented 7 years ago

Hey,

I also think it should be something parametric, e.g. "uint<32>", so a library type, where u32 is just an alias for uint<32>.

Hmmm, that could possibly work. Basically, I would provide a guarantee that uint<32> compiles down to a u32 ... hmmmm.

Not sure what I think. Ultimately, I need the compiler internals to have a notion of a fixed-width type (e.g. int:8) so I might as well expose that to the user for now. Maybe I'll later retract that, we'll see. It's hard to decide ahead of time what the real pros and cons are.

DavePearce commented 7 years ago

Hey,

For instance, some older DSPs have 32 bit alignment and 24 bit representation. So a more general approach that works across the spectrum would be better.

Right, then the type int:24 works well here.

No, a type like this:

type dspint is (int:32 val) where DSP_LOWER <= val && val <= DSP_UPPER

That works better. The key is that we're giving clear instructions to the compiler: compile this using 32 bits of space, but restrict the permitted values.
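
Spelled out for the 24-bit case (a sketch; it assumes Whiley constant declarations, and the values are just the two's complement 24-bit bounds for the names used above):

constant DSP_LOWER is -8388608  // -2^23
constant DSP_UPPER is 8388607   //  2^23 - 1

type dspint is (int:32 val) where DSP_LOWER <= val && val <= DSP_UPPER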

bambinella commented 7 years ago

On Sun, May 7, 2017 at 11:05 AM, David Pearce notifications@github.com wrote:

Hey,

For instance, some older DSPs have 32 bit alignment and 24 bit representation. So a more general approach that works across the spectrum would be better.

Right, then the type int:24 works well here.

Well, they might repeat the most significant bit in the upper 8 bits... But this might be a thing from the past. I am sure there are some weird energy-efficient CPUs still, so I could look at the most common ones, on a rainy day. They tend to have online manuals.

I grew up on 8 bit computers (machine language on C64), so I find this stuff kinda fun and retro. :)

bambinella commented 7 years ago

I just want to point out that one of course should be able to query for byte-size and alignment of a specific integer subtype when stored in memory.

[...]

But for this to be useful I think you have to acknowledge that this is a programmer-knows-best type.

Then define exactly what it means. Does it mean that the compiler isn't allowed to store u32 in 2 bytes if it can prove that 16 bits is sufficient? Does it mean that alignment for u32 is always 4, or will it vary with hardware? Is it a storage constraint, or does it have an impact on registers? For instance, is 4xu32 allowed to be reduced to 4xu16, or is it a strong hint that it should be put in a 4xu32 register for performance reasons that are only known to the programmer?

[...]

To be pedantic: if the only difference between a constrained int and uint32 is that I can assume with 90% probability (rather than 25%) that it is stored in 4 bytes, then it doesn't really help... :-)

So I guess my conclusion after this rant is that it SHOULD be a hard constraint. Because if it isn't, it isn't very helpful and might encourage bad programming habits that lead to buggy code, e.g. because I use a library that made too many assumptions based on a different target, or no FFI, or whatever.

TL;DR: To be useful it should fail hard on a mismatch on: number of bytes, alignment, byte order, signedness... or misery will ensue when used for system level programming.

bambinella commented 7 years ago

Another way to view this is to look at it from a generic programming point of view.

Maybe one should have some means to query what the hardware sizes are, and how they can be used (e.g. SIMD).

Then this could be a hard failure type if not supported efficiently, since that indicates a bug in generic (template style) code.

bambinella commented 7 years ago

C/C++ have types that state that they are fast or have at least so many bits: http://en.cppreference.com/w/cpp/types/integer

16 bit integer types in C:

int16_t // exactly 16 bits (not available if not supported)
int_fast16_t
int_least16_t
uint16_t
uint_fast16_t
uint_least16_t

Something like "int_at_least<16>" would be better, as it is much more specific. But it is still missing alignment and byte order.

bambinella commented 7 years ago

int16_t // exactly 16 bits (not available if not supported)

I don't mean that the program should fail compilation if the type does not exist in the hardware. I think it should fail compilation if expressions accessing it are reachable.

bambinella commented 7 years ago

Hm, this simple issue is more tricky than I thought... I think basically one should list all the possible use scenarios and how various definitions break or enable what the programmer is looking for.

Digression: on some 32 bit CPUs scalar multiplies will give 64 bit results, but SIMD gives 32... IIRC. So what is fast...? Well, if you only do a 32 bit scalar multiply as your last operation then 64 bit is fast even if you only have a 32 bit CPU.

DavePearce commented 7 years ago

Does it mean that the compiler isn't allowed to store u32 in 2 bytes if it can prove that 16 bits is sufficient?

Yes, this is exactly what it means. One reason for this is to ensure interop with foreign code. But, also so that the programmer knows exactly what will happen.

I've got to be honest here, this discussion is a bit low level. Issues of alignment and these very subtle issues of representation are really for the (e.g. LLVM) backend. It's far from clear that I want to get down to the level of int_fast16_t.

For the moment, my main focus is on some kind of Minimal Viable Product, and this is well beyond that.

bambinella commented 7 years ago

For an MVP the most obvious solution would be to just use a type annotation, e.g. @bytes(4).

Then just create an alias for a bounded int with that annotation and put it in the standard library.
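
That is, something like this sketch (the @bytes annotation is hypothetical and does not exist in Whiley today; the bound is just the unsigned 32-bit maximum):

// Hypothetical annotation telling the backend to use exactly 4 bytes.
@bytes(4)
public type u32 is (int n) where 0 <= n && n <= 4294967295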

DavePearce commented 7 years ago

You mean instead of int:32? Well, you are maybe right in that worrying about representation should be beyond what is needed for an MVP. But, I believe it's important for good interoperation with e.g. Java or even JavaScript.

bambinella commented 7 years ago

You mean instead of int:32? Well, you are maybe right in that worrying about representation should be beyond what is needed for an MVP. But, I believe it's important for good interoperation with e.g. Java or even JavaScript.

For the MVP I don't think you need an annotation at all, actually. I think you just need a type alias like the following (how do I create a non-nominal symbol?)

symbol u16 = (int n) where 0 <= n && n <= 65535

Then detect it in the backend and make it 2 bytes.

(by symbol I mean the same as C++ using u16 = type)

bambinella commented 7 years ago

Well, in C++ template syntax it ought to be something like this, if C++ had allowed it:

using u<16> = uint16_t
using u<32> = uint32_t

etc

Then u<_> could resolve to uint16_t | uint32_t

Hm... Never thought of that one before.

DavePearce commented 7 years ago

Right, so in the stdlib file whiley.lang.Int there are a bunch of such definitions already:

/**
 * Represents all signed integers representable in 8bits
 * of space in the two's complement representation.
 */
public type i8 is (int x)
    where x >=-128 && x <= 127

/**
 * Represents all signed integers representable in 16bits
 * of space in the two's complement representation.
 */
public type i16 is (int x)
    where x >=-32768 && x <= 32767

/**
 * Represents all signed integers representable in 32bits
 * of space in the two's complement representation.
 */
public type i32 is (int x)
    where x >=-2147483648 && x <= 2147483647

(and more)

Detecting these in the backend is what my experimental Java source backend is currently doing. It works for sure. But one of the problems I have with this is that detecting it in the backend couples the compiler and the standard library.

bambinella commented 7 years ago

I think you will regret it if it means that you have to extend the IR. LLVM IR is quite large and specialized...

I take it that Whiley doesn't have nominal types at this stage. What if you added "tags" that taint types in a dataflow-like fashion?

So if you tag something, all derived values from then on will carry that tag as long as the constraints hold, using flow analysis.

This could also be used for tagging an array as sorted. Any operation that modifies element order would then remove that tag.

If that would be a welcome feature then maybe you can special-case i32 etc. for now, with a plan to replace it with a more general mechanism later?

This capability is something I have wanted several times in C++. I guess this is related to "type state".

bambinella commented 7 years ago

Such tagging could be very useful later on for people wanting to implement an optimizing backend for linear algebra calling BLAS routines and the like provided by CPU/OS vendor.

bambinella commented 7 years ago

How about this:

  1. A property can represent the name and constraints of a tag.

  2. There is a standardized ontology of properties with names which the backend may know.

  3. The IR is extended so that "taggable properties" can be represented.

  4. The backend can easily work with these tags (just numeric ids) and doesn't have to detect anything.

DavePearce commented 7 years ago

Such tagging could be very useful later on for people wanting to implement an optimizing backend for linear algebra calling BLAS routines and the like provided by CPU/OS vendor.

Note that this is not a major use case from my perspective at this stage.

bambinella commented 7 years ago

Well, it would also help when adding floating point without the IR-feature affecting the verifier... I think. (edit: changed wording to make it more clear)

bambinella commented 7 years ago

Btw, if the primary need for this is FFI, wouldn't it be sufficient to do this as an annotation of the parameters in the function signature? Then the compiler can deduce that other types should have an FFI-compatible representation for efficiency reasons, if type conversion is identified as a bottleneck.
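
A sketch of that idea, with the hypothetical annotation applied only where data crosses the FFI boundary (the signature, the @bytes annotation and the use of native are all purely illustrative):

// Hypothetical FFI declaration: representation is pinned down only on the
// parameters of the foreign call, not on the types themselves.
public native method write(@bytes(4) int fd, @bytes(1) int[] buffer) -> (int r)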

DavePearce commented 7 years ago

I'm gonna write this up as an RFC shortly and then we can look at it. Ultimately, I'm pretty confident this is what I want to do.

bambinella commented 7 years ago

Looking forward to the RFC. If you focus on FFI then you should also focus on vectorized register parameters, which is happening now because 64 bit x86 now has all registers vectorized. But you can probably model this as tuples.

https://software.intel.com/en-us/articles/vectorcall-and-regcall-demystified https://msdn.microsoft.com/en-us/library/dn375768.aspx http://lists.llvm.org/pipermail/llvm-dev/2016-November/107679.html

DavePearce commented 7 years ago

This is now an RFC under review here: https://github.com/Whiley/RFCs/pull/3