Specify behavior of intrinsic expressions, such as `sizeof`

jopperm commented 2 years ago

Idea: Define sizeof(T) as the number of bytes required to represent T (as in C). An additional bitsizeof operator may make sense in the future.

The expression's type is currently hard-wired as a 32-bit integer (think: size_t in C).

Alternatively, since the value is a compile-time constant, @AtomCrafty proposes making it subject to the integer literal type rules, i.e. choosing the minimum bit width needed to represent the size.
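
To make the second option concrete, here is a minimal Python sketch of "choose the minimum bit width". It is only a model of the idea, not CoreDSL's actual literal typing rules: the helper name `min_unsigned_width` is hypothetical, and treating the value 0 as needing one bit is an assumption.

```python
def min_unsigned_width(value: int) -> int:
    """Smallest number of bits that can hold a non-negative integer."""
    if value < 0:
        raise ValueError("sizes are never negative")
    # int.bit_length() returns 0 for the value 0; assume (hypothetically)
    # that a literal always occupies at least one bit.
    return max(1, value.bit_length())


# Under this rule, a size of 9 bytes would be typed with 4 bits
# rather than with a fixed 32-bit integer.
width_for_nine = min_unsigned_width(9)
```

With a hard-wired 32-bit result the width would be the same regardless of the value; under the literal rule it follows the value.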

jopperm commented 2 years ago

The parser already understands (bit)sizeof and friends, so this issue is mainly a documentation task.

@AtomCrafty points out that the actual values returned by these operators have to be discussed and specified. In particular, do we pad structs, and how do we handle alignment in CoreDSL?

eyck commented 2 years ago

Padding is an optimization performed by the compiler, and alignment implements constraints of the processor. Therefore, I strongly suggest doing neither alignment nor padding.

AtomCrafty commented 2 years ago

I agree that, from a CoreDSL point of view, it makes sense to remove all unnecessary padding, as that would reduce the amount of generated circuitry. However, from a C standpoint we should keep in mind that structures might exist not only in dedicated hardware, but also in regular memory. Suppose some instructions have to read data structures from memory, such as an interrupt vector, paging tables or structured exception information. Those structures might include padding bytes. I believe a good approach would be to make structure layouts configurable via attributes. For example, `[[struct_layout("pack")]]` in front of the type declaration to pack it as tightly as possible, and `[[struct_layout("align")]]` or something similar to align primitives to multiples of their own size. Alternatively, we could introduce attributes to explicitly specify the offset of a field: `[[field_offset(32)]]`.

eyck commented 2 years ago

Actually we describe hardware within a processor. Layout in memory (which is basically an extern array) has to be described explicitly and will not depend on the processor-internal representation. Therefore I still opt for no padding and no alignment.

AtomCrafty commented 2 years ago

That still means we need to provide the facilities to explicitly describe the layout. And I would rather see that done with attributes than padding fields. Then again, that's just personal preference, so I won't fight you over it ^^

neithernut commented 2 years ago

Well, you'd first need to define a mapping between structs/unions and a bitvector, e.g. a uint representation of a struct. I'm not aware of such a mapping in current CoreDSL. I suppose casting a struct to anything else isn't defined either. Assuming a struct is represented by a ("packed") concatenation of its members' uint or bitvec representations, you can easily add arbitrary padding via dummy members (which some also do in software; you often see such dummy members in the Linux uapi). If you want to avoid members such as "dummy1" and "dummy2", we could introduce unnamed members.
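
As an illustration of such a mapping, here is a small Python sketch that concatenates member values into one bitvector. CoreDSL does not define this mapping as far as this thread is concerned, so everything here is an assumption: the function name `pack_struct` is hypothetical, and placing the first member in the least significant bits is an arbitrary choice.

```python
def pack_struct(fields):
    """Concatenate (value, bit_width) pairs into one integer.

    Assumption: the first member occupies the lowest bits and members
    follow each other without gaps ("packed").
    Returns (bitvector_value, total_bit_width).
    """
    word, offset = 0, 0
    for value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"value {value} does not fit in {width} bits")
        word |= value << offset
        offset += width
    return word, offset


# An 8-bit member followed directly by a 32-bit member: 40 bits in total.
packed = pack_struct([(0xAB, 8), (0x1, 32)])

# The same struct with a 24-bit dummy member in between: 64 bits in total.
padded = pack_struct([(0xAB, 8), (0, 24), (0x1, 32)])
```

The dummy member only shifts the later members to a higher offset; it carries no information of its own.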

AtomCrafty commented 2 years ago

Those dummy members are exactly what I meant by "padding fields". The issue I see with those is that you always have to manually calculate how large the padding has to be, and update it whenever the bit width of another field changes. It's error-prone and doesn't clearly communicate the intent. That's why I would instead suggest having the frontend automatically generate these padding fields in a way controlled by attributes.

struct T {
  unsigned char field1;
  unsigned<24> dummy;
  unsigned int field2;
  unsigned int field3;
}

struct T {
  unsigned char field1;
  unsigned int field2 [[align(32)]];
  unsigned int field3;
}

struct T {
  unsigned char field1;
  unsigned int field2 [[field_offset(32)]];
  unsigned int field3 [[field_offset(64)]];
}
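
For reference, a rough Python sketch of the offset calculation a frontend would have to do for the three variants above. It assumes `unsigned char` is 8 bits and `unsigned int` is 32 bits, and that "align" means aligning each field to a multiple of its own width; the function names are hypothetical and none of this is specified CoreDSL behavior.

```python
def packed_offsets(widths):
    """Bit offsets when fields follow each other without padding."""
    offsets, pos = [], 0
    for width in widths:
        offsets.append(pos)
        pos += width
    return offsets, pos


def aligned_offsets(widths):
    """Bit offsets when each field starts at a multiple of its own width."""
    offsets, pos = [], 0
    for width in widths:
        pos = -(-pos // width) * width  # round up to the next multiple
        offsets.append(pos)
        pos += width
    return offsets, pos


# field1: 8 bits, field2: 32 bits, field3: 32 bits (assumed widths)
T = [8, 32, 32]
packed = packed_offsets(T)    # ([0, 8, 40], 72)
aligned = aligned_offsets(T)  # ([0, 32, 64], 96)

# The padding the frontend would insert before field2 in the aligned case,
# i.e. the width of the hand-written dummy member in the first variant.
padding_before_field2 = aligned[0][1] - (packed[0][0] + T[0])
```

The aligned offsets come out as 32 and 64, matching the explicit `field_offset` values in the third variant, and the generated padding is 24 bits, matching the manual `unsigned<24> dummy`.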

eyck commented 2 years ago

Let's set the ifs, maybes and potentiallys aside: is there any concrete example of, or need for, specifying padding and alignment when describing the inner workings of a processor? If not, we can stop the discussion and define the bitsizeof operator, and the sizeof operator as sizeof(T) = (bitsizeof(T)+7)/8
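
A quick Python check of what that formula yields, assuming the division is integer division (so the result is the bit size rounded up to whole bytes). The function name `sizeof_bytes` is only for illustration.

```python
def sizeof_bytes(bitsize: int) -> int:
    """sizeof(T) = (bitsizeof(T) + 7) / 8, using integer division."""
    return (bitsize + 7) // 8


# A packed struct of 8 + 32 + 32 = 72 bits occupies 9 bytes.
size_72 = sizeof_bytes(72)

# Anything that is not a whole number of bytes is rounded up.
size_9 = sizeof_bytes(9)
```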

jopperm commented 2 years ago

> Is there any concrete example or need to specify padding and alignment when describing the inner workings of a processor?

Among the things we currently dabble with, I don't think so. To that end, I'd clarify in the spec that all structs are packed as an immediate course of action. OK?

In the (far?) future, I can see use cases for "adding structure" to address spaces, maybe to alias a specific memory range known to contain records, or to resemble the status registers of a memory-mapped peripheral device. For those, I think the attributes proposed by Mario are way nicer than dummy fields.

jopperm commented 1 year ago

Diff

Minres / CoreDSL

Specify behavior of intrinsic expressions, such as `sizeof` #15