Qix- / arua-meta

Standards, RFCs and discussion of the Arua language

RFC: Support for Bit Fields #6

Closed corbin-r closed 8 years ago

corbin-r commented 8 years ago

I just had an idea that I know is in C/C++ that isn't mentioned in the spec. Will Arua have support for bit fields? Or will that be controlled via the type specifically, as mentioned in the README and https://github.com/arua-lang/proposal/blob/master/arua.grammar#L9? Or will Arua have support for syntax like:

i32 varName : 1; // Fill 4 bytes
f16 floatName : 4; // Fill to 8 bytes

Or something to this effect? I'm asking because I use bit fields quite a bit in C and find them helpful for controlling bit spacing in memory.
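
For readers who have not used them, here is a minimal sketch of the C feature being asked about. The struct and function names (`Flags`, `make_flags`) are made up for illustration and do not come from the Arua spec; how the compiler lays these fields out in memory is implementation-defined in C.

```c
#include <assert.h>

/* Classic C bit fields: each member declares its width in bits. */
struct Flags {
    unsigned int ready : 1;  /* holds 0..1  */
    unsigned int mode  : 3;  /* holds 0..7  */
    unsigned int count : 4;  /* holds 0..15 */
};

/* Build a Flags value; callers must pass values that fit the declared widths. */
static struct Flags make_flags(unsigned ready, unsigned mode, unsigned count)
{
    struct Flags f;
    assert(ready <= 1u && mode <= 7u && count <= 15u);
    f.ready = ready;
    f.mode = mode;
    f.count = count;
    return f;
}
```

The fields read and write like ordinary members (`f.mode = 5`), and the compiler emits the masking and shifting.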

Qix- commented 8 years ago

Bit fields themselves, no - at least, there isn't any special syntax. Primitive integer types have arbitrary widths, meaning i6 and u4, for instance, are both valid types.

I haven't thought about how they act in structs, though I'd expect them to behave much like bit fields.

Qix- commented 8 years ago

Also keep in mind the "spec" as it's written is pretty barren right now. I want to get a bootstrapper out (it's in development right now) in order to play with the syntax and semantics and form the spec around that initially.

corbin-r commented 8 years ago

Right, yeah I know the spec is just getting started. And I understand, writing spec around actual development!

And okay cool, I figured that the arbitrary widths would be defined in the types themselves (e.g., i6 and u4). As for structs, maybe you could only control the width of a struct, as seen with #pragma pack(n)? Something to consider in the future (far future). I know you don't want pre-processor statements, so some form of syntax would have to be thought up if you decide to include some form of struct packing. Thanks for the speedy response!

Qix- commented 8 years ago

Actually there is a similar system to pragmas, called details. Details modify the underlying AST node configuration so as to modify the behavior of the node itself.

An example that springs to mind is the VGA buffer in real mode for kernel development. It's at a fixed physical address (0xB8000), but Arua itself doesn't work with addresses - only references.

Therefore, you'll have to use details to define its parameters. What the details are exactly is still to be determined, but will look similar to the following:

#[pack: 1]
struct VgaPixel
        u4 flags
        u1 bright
        u1 blue
        u1 green
        u1 red
        u8 char

fn main(argv [str]) i32
        #[address: 0xB8000]
        #[size: 80, 25]
        #[alloc: extern]
        &[[VgaPixel]] vgaBuffer

        # can now use it as a normal array.
        vgaBuffer[0][0].red = true # we can use true because `true == (1 as u1)`
        vgaBuffer[0][0].bright = true
        vgaBuffer[0][0].char = '!'

        ret 0

Obviously that is completely prototypical but Details are definitely something that will be there from the first version of the bootstrapper. Syntax isn't 100% decided on (needs more bikeshedding) but you get the idea.
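
As a rough C analogue of the packed `VgaPixel` above, the sketch below does the bit manipulation by hand. The bit positions are an assumption (fields packed from the least significant bit in declaration order); the thread does not define the order Arua would use, and the helper names `vga_set` and `vga_get` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bit positions: fields packed from bit 0 upward in declaration order. */
enum {
    VGA_FLAGS_SHIFT  = 0,  /* u4 flags  */
    VGA_BRIGHT_SHIFT = 4,  /* u1 bright */
    VGA_BLUE_SHIFT   = 5,  /* u1 blue   */
    VGA_GREEN_SHIFT  = 6,  /* u1 green  */
    VGA_RED_SHIFT    = 7,  /* u1 red    */
    VGA_CHAR_SHIFT   = 8   /* u8 char   */
};

typedef uint16_t VgaPixel;  /* 4 + 1 + 1 + 1 + 1 + 8 = 16 bits */

/* Write `value` into the `width`-bit field that starts `shift` bits above bit 0. */
static VgaPixel vga_set(VgaPixel px, unsigned shift, unsigned width, unsigned value)
{
    assert(width >= 1 && shift + width <= 16);
    uint16_t mask = (uint16_t)(((1u << width) - 1u) << shift);
    return (VgaPixel)((px & ~mask) | ((value << shift) & mask));
}

/* Read the `width`-bit field that starts `shift` bits above bit 0. */
static unsigned vga_get(VgaPixel px, unsigned shift, unsigned width)
{
    assert(width >= 1 && shift + width <= 16);
    return ((unsigned)px >> shift) & ((1u << width) - 1u);
}
```

With this layout, `vgaBuffer[0][0].red = true` would correspond to `px = vga_set(px, VGA_RED_SHIFT, 1, 1)`; this is the kind of code a compiler could generate behind the field access.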

Details work because each AST node has a registry that is propagated down to children that don't have things explicitly set.

For instance, foo in the below example inherits the hypothetical no-optimize detail from main:

#[no-optimize]
fn main(args [str]) i32
        foo i32 = 0
        ret 0
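
One way to picture this inheritance is a lookup that walks from a node up through its ancestors. This is only an illustrative sketch, not the bootstrapper's implementation: the `Node` struct, its single-detail-per-node simplification, and `lookup_detail` are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical AST node; for brevity it carries at most one detail of its own. */
struct Node {
    const struct Node *parent;  /* NULL for the root */
    const char *detail_key;     /* NULL if this node sets no detail */
    const char *detail_value;
};

/* Return the value set on the node itself, else on the nearest ancestor
   that sets `key`, else NULL. */
static const char *lookup_detail(const struct Node *node, const char *key)
{
    assert(key != NULL);
    for (; node != NULL; node = node->parent) {
        if (node->detail_key != NULL && strcmp(node->detail_key, key) == 0)
            return node->detail_value;
    }
    return NULL;
}
```

Here a `foo` node with no details of its own would resolve `no-optimize` through its parent `main`, while a node that sets the detail explicitly would shadow the inherited value.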

This system has already been implemented in the bootstrapper with the help of the yaml-cpp package. Of course it'll be replaced in the self-hosted compiler with something a little more robust but for now it's sufficient.

corbin-r commented 8 years ago

Oh that's actually very clean looking. I like the syntax. Of course this might change over time but it looks very usable!

Qix- commented 8 years ago

I think so too :) I'll keep this open for discussion until they're semi-stable in the bootstrapper. I'd like more feedback on this kind of thing; bitfields are indeed useful.

corbin-r commented 8 years ago

Thank you. :+1:

Being a student that is getting into Kernel/OS/Programming language development this is a very interesting repo for me!

Qix- commented 8 years ago

I'm designing Arua with OS dev in mind, considering at every step the kinds of flexibility it'll need. To reiterate, the focus of the language (Details aside) is to show intent, but obviously sometimes you have to get down deeper into the control end of things - that's what Details are for.

Good question by the way, I hadn't considered how bitfields would be transcribed.

corbin-r commented 8 years ago

Oh really? Well for that I thank you and your details haha.

Qix- commented 8 years ago

Of course! Keep them coming.

Qix- commented 8 years ago

Another thought on bitfields, thinking about optimizations: using the primitive types (i1, u6, etc.) we can really improve optimizations on structs that don't require exact representation.

For instance, if you aren't persisting a struct (it's just being used ephemerally, e.g. in memory) and you have the following properties:

struct Foo
    a u1 = 0
    b u1 = 1
    c i32 = 1337
    d u2 = 3

With no optimizations, it'd be grouped into a 7-byte structure:

0: 0000 000a
1: 0000 000b
2: c
6: 0000 00dd

but with optimization, we could even coerce these types given a few strategies:

Minimal Pack (aka C-style)

0: 0000 00ba
1: c
5: 0000 00dd

Pack

0: 0000 ddba
1: c

Pack + Align (best space)

0: c (aligned to the base, for improved speed if allocated along page boundary)
4: 0000 ddba

Align (potentially best performance - aligned and does not require bitwise operations)

0: c (aligned to the base, for improved speed if allocated along page boundary)
4: 0000 000a
5: 0000 000b
6: 0000 00dd 
---
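
The byte counts for the layouts above can be reproduced with a small sketch. It assumes a simplified model that is not from the thread: any field whose width is a multiple of 8 is stored in whole bytes, everything else counts as a bit field, and field order only matters for the minimal pack. The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes needed to hold `bits` bits. */
static size_t bytes_for(size_t bits) { return (bits + 7) / 8; }

/* No optimization (and plain Align): every field is rounded up to whole bytes. */
static size_t size_unpacked(const unsigned *widths, size_t n)
{
    size_t total = 0;
    assert(widths != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        total += bytes_for(widths[i]);
    return total;
}

/* Minimal pack: only runs of adjacent bit fields share bytes;
   a whole-byte field ends the current run. */
static size_t size_minimal_pack(const unsigned *widths, size_t n)
{
    size_t total = 0, run_bits = 0;
    assert(widths != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (widths[i] % 8 == 0) {
            total += bytes_for(run_bits) + widths[i] / 8;
            run_bits = 0;
        } else {
            run_bits += widths[i];
        }
    }
    return total + bytes_for(run_bits);
}

/* Pack (and Pack + Align): all bit fields are gathered together,
   wherever they were declared. */
static size_t size_pack(const unsigned *widths, size_t n)
{
    size_t whole_bytes = 0, bits = 0;
    assert(widths != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (widths[i] % 8 == 0)
            whole_bytes += widths[i] / 8;
        else
            bits += widths[i];
    }
    return whole_bytes + bytes_for(bits);
}
```

For `Foo` (widths 1, 1, 32, 2) this gives 7 bytes unpacked, 6 for the minimal pack, and 5 for the full pack. Pack + Align reorders fields without changing the size, and Align keeps the 7-byte size with `c` moved to the front.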

As for interop, we could even track how these changes are being made and generate adjacent C struct code for use in consuming C applications, so that the optimized packed type can be used with the correct labels, etc.

Another thing to note: by not having explicit bitfields here, we can optimize differently for different use-cases, systems, configurations etc. and absolutely no code will have to change. Bitwise operations would be generated if necessary and would be completely transparent to the developer, yet still completely analyzable, given that the strategies are well defined.
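
To make the "generated bitwise operations" concrete, here is a sketch of what hand-written C accessors for the Pack layout (`0: 0000 ddba`, `1: c`) might look like. Everything here is an assumption for illustration: the `FooPacked` type, the accessor names, and storing `c` in host byte order are not defined anywhere in the thread.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed "Pack" layout of Foo: byte 0 = 0000 ddba, bytes 1..4 = c. */
typedef struct { uint8_t bytes[5]; } FooPacked;

static unsigned foo_get_a(const FooPacked *f) { return f->bytes[0] & 0x1u; }
static unsigned foo_get_b(const FooPacked *f) { return (f->bytes[0] >> 1) & 0x1u; }
static unsigned foo_get_d(const FooPacked *f) { return (f->bytes[0] >> 2) & 0x3u; }

/* Replace the two `d` bits and leave `a` and `b` untouched. */
static void foo_set_d(FooPacked *f, unsigned d)
{
    assert(d <= 3u);
    f->bytes[0] = (uint8_t)((f->bytes[0] & ~0x0Cu) | (d << 2));
}

/* `c` sits at an unaligned offset, so copy it byte-wise (host byte order). */
static int32_t foo_get_c(const FooPacked *f)
{
    int32_t c;
    memcpy(&c, &f->bytes[1], sizeof c);
    return c;
}

static void foo_set_c(FooPacked *f, int32_t c)
{
    memcpy(&f->bytes[1], &c, sizeof c);
}

/* Defaults from the struct above: a = 0, b = 1, c = 1337, d = 3. */
static FooPacked foo_default(void)
{
    FooPacked f = { { 0 } };
    f.bytes[0] = (uint8_t)((0u << 0) | (1u << 1) | (3u << 2));
    foo_set_c(&f, 1337);
    return f;
}
```

Under a different strategy only these accessors would change; code that reads or writes `d` through them would stay the same, which is the transparency described above.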