Open · AtomCrafty opened this issue 1 year ago
> In principle a good idea. But intrinsics need to be handled by each and every backend so it puts additional burden...

> [...] `const_expr` can't itself be an unparenthesized addition or anything with a lower precedence, because it would be parsed as `(ID + expr1) + expr2`, which does not follow the very narrow definition of an acceptable expression.
Yes, we'll watch out for common patterns that require parentheses but shouldn't. My gut feeling is that we'll mostly see "+ literal" or "+ parameter" expressions.
> In addition to that, the order of concatenation (endianness) depends on which of the two operands is larger. I 100% guarantee that this will cause a lot of confusion. Even though I have read the specification on this several times, I still couldn't tell you from memory which way is which.
Hmm, I just look at the `to` part: If it has the positive offset, it's the big end, otherwise the little one. Is it just that the wording is bad, or do you find something confusing with the definition itself?
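To make that rule concrete, here's a minimal Python sketch of my reading of it. Byte-sized elements are assumed, and `range_read` is a made-up helper, so treat this as an illustration rather than the spec:

```python
# Toy model of the range read: the element at the `to` index is treated as
# the big end (most significant), per the rule described above.

def range_read(mem, frm, to):
    step = 1 if to >= frm else -1
    value = 0
    # Walk from `to` toward `frm`, so mem[to] ends up most significant.
    for i in range(to, frm - step, -step):
        value = (value << 8) | mem[i]
    return value

mem = [0x34, 0x12]
assert range_read(mem, frm=0, to=1) == 0x1234  # `to` above `frm`: little-endian read
assert range_read(mem, frm=1, to=0) == 0x3412  # `to` below `frm`: big-endian read
```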
> My proposal: Introduce four new intrinsic functions: `load_le`, `load_be`, `store_le` and `store_be`, with the following signatures:
>
> ```
> R load_xx(TypeSpecifier R, const T[*] mem, unsigned<∞> index);
> void store_xx(TypeSpecifier R, T[*] mem, unsigned<∞> index, R value);
> ```
>
> `T` is the element type of `mem` (usually `char`, but can be anything), `R` is the type specified by the first argument, and `index` is the index of the first array element to be read.
>
> Concrete example (from your rewritten ISA file):
>
> ```
> load_le(signed<16>, MEM, load_address)
> ```
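For illustration, the load variants can be modeled in Python. This is a hedged sketch, not the real semantics: byte-sized elements and unsigned values are assumed, and the width is passed in bits instead of a real `TypeSpecifier`:

```python
def load_le(width_bits, mem, index):
    # Read width_bits/8 consecutive elements; lowest address = least significant.
    n = width_bits // 8
    return int.from_bytes(bytes(mem[index:index + n]), "little")

def load_be(width_bits, mem, index):
    # Same elements, but lowest address = most significant.
    n = width_bits // 8
    return int.from_bytes(bytes(mem[index:index + n]), "big")

MEM = [0x00, 0xCD, 0xAB, 0xFF]
assert load_le(16, MEM, 1) == 0xABCD
assert load_be(16, MEM, 1) == 0xCDAB
```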
Yeah, I like that.
> (I'm not even going to pretend that I'm certain the `le` and `be` variants aren't swapped around, you figure that out :P)
Yup, these are correct 🤓
> Advantages over manual range expressions:
>
> - Much easier to read
> - Endianness is immediately apparent
> - No unnecessary restrictions on the types of index expressions used
> - The returned value is already of the correct type (no cast required)
> - The frontend would validate that `bitsizeof(R)` is actually an exact multiple of `bitsizeof(T)`.
Yeah, makes sense. I'd coin this as purely additional syntactic sugar for the range expression on address spaces. This way, backends wouldn't even need to support it explicitly.
The "syntactic sugar" bit is an interesting point. I feel like a lot of Eyck's criticism boils down to "I don't want any additional backend work", so I was contemplating ways around that over the past couple of days.

I'm currently considering a "lowering" approach where the frontend writes an entirely new AST with limited node types, which can then be consumed by all backends. During lowering, constants would be evaluated, and operations like bit range selection, casts and the proposed extend/truncate intrinsics would all be replaced by a single "bit swizzle" operator that the backend can work with without needing to know about the concrete syntax element used. Ideally, it would also pull out side effects of expressions into their own statements, so the backend can treat all expressions as side effect-free.

The new AST could be completely separate from the existing XText model, which means backends wouldn't even need to import XText, and the format could be serialized into an IR.
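Roughly what I have in mind, as a Python sketch (all node and helper names here are made up for illustration): several surface operations collapse into one generic swizzle node that maps each result bit to a source bit.

```python
from dataclasses import dataclass

@dataclass
class Swizzle:
    source: str   # name of the value being rearranged
    bit_map: list # bit_map[i] = source bit feeding result bit i (None = constant 0)

def lower_range_select(source, hi, lo):
    # x[hi:lo] -> result bit i comes from source bit lo + i
    return Swizzle(source, list(range(lo, hi + 1)))

def lower_zero_extend(source, from_width, to_width):
    # Zero extension -> upper result bits are constant zero
    return Swizzle(source, list(range(from_width)) + [None] * (to_width - from_width))

assert lower_range_select("x", 7, 4).bit_map == [4, 5, 6, 7]
assert lower_zero_extend("x", 4, 8).bit_map == [0, 1, 2, 3, None, None, None, None]
```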
> Is it just that the wording is bad, or do you find something confusing with the definition itself?
I've gotten more comfortable with it after a couple of hours of exposure, but I just think it's difficult to keep in mind which way around it goes. I could mentally create plausible mnemonics that go either way. I think it's similar to how the larger bit index is the left one. It makes sense, but it still feels strange because our reading order is left-to-right, while digits are ordered right-to-left.
> In principle a good idea. But intrinsics need to be handled by each and every backend so it puts additional burden...
If we employ a lowering strategy, none of the backends would need to deal with this. They wouldn't even need to support the range index operator, because the frontend can lower it to concatenation of multiple reads.
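A sketch of what that lowering could look like (everything here is made up for illustration; the emitted "IR" is just strings): one wide range read becomes a sequence of single-element reads plus shift/or statements.

```python
def lower_wide_read(dst, mem, index, count, little_endian=True):
    # Replace one `count`-element range read with `count` single-element reads;
    # the element read last ends up least significant.
    order = reversed(range(count)) if little_endian else range(count)
    stmts = [f"{dst} = 0"]
    for i in order:
        stmts.append(f"{dst} = ({dst} << 8) | {mem}[{index} + {i}]")
    return stmts

assert lower_wide_read("tmp", "MEM", "addr", 2) == [
    "tmp = 0",
    "tmp = (tmp << 8) | MEM[addr + 1]",
    "tmp = (tmp << 8) | MEM[addr + 0]",
]
```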
Now that we have settled on a syntax for the range index operator, I would like to propose another pair (or actually two pairs) of intrinsic functions that would improve the user experience when working with external memory.
To recap: It was decided that the range index operator will support only a very restricted set of expressions for the index operands, which currently are only `ID + const_expr` and `ID - const_expr`. Note however that currently `const_expr` can't itself be an unparenthesized addition or anything with a lower precedence, because it would be parsed as `(ID + expr1) + expr2`, which does not follow the very narrow definition of an acceptable expression.

In addition to that, the order of concatenation (endianness) depends on which of the two operands is larger. I 100% guarantee that this will cause a lot of confusion. Even though I have read the specification on this several times, I still couldn't tell you from memory which way is which.
My proposal: Introduce four new intrinsic functions: `load_le`, `load_be`, `store_le` and `store_be`, with the following signatures:

```
R load_xx(TypeSpecifier R, const T[*] mem, unsigned<∞> index);
void store_xx(TypeSpecifier R, T[*] mem, unsigned<∞> index, R value);
```

`T` is the element type of `mem` (usually `char`, but can be anything), `R` is the type specified by the first argument, and `index` is the index of the first array element to be read.

Semantically, the functions can be approximated by these macros:
```
#define load_le(R, mem, index)         ((R)(mem)[(index)+bitsizeof(R)/bitsizeof(T):(index)])
#define load_be(R, mem, index)         ((R)(mem)[(index):(index)+bitsizeof(R)/bitsizeof(T)])
#define store_le(R, mem, index, value) ((mem)[(index)+bitsizeof(R)/bitsizeof(T):(index)] = (value))
#define store_be(R, mem, index, value) ((mem)[(index):(index)+bitsizeof(R)/bitsizeof(T)] = (value))
```
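Not normative, but here's a quick Python round-trip check of the intended store/load pairing. Elements are modeled as bytes, and the inclusive-range subtleties of the real operator are deliberately abstracted away:

```python
def store(mem, index, count, value, byteorder):
    # Write `count` elements; byteorder decides which end goes to the low address.
    mem[index:index + count] = value.to_bytes(count, byteorder)

def load(mem, index, count, byteorder):
    return int.from_bytes(bytes(mem[index:index + count]), byteorder)

mem = [0] * 4
store(mem, 1, 2, 0xBEEF, "little")
assert load(mem, 1, 2, "little") == 0xBEEF  # store_le/load_le round-trips
assert mem[1] == 0xEF                        # low byte lands at the low address
```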
(I'm not even going to pretend that I'm certain the `le` and `be` variants aren't swapped around, you figure that out :P)

Advantages over manual range expressions:

- Much easier to read
- Endianness is immediately apparent
- No unnecessary restrictions on the types of index expressions used
- The returned value is already of the correct type (no cast required)
- The frontend would validate that `bitsizeof(R)` is actually an exact multiple of `bitsizeof(T)`.