Closed: AtomCrafty closed this issue 1 year ago
Julian and I were talking about how type validation of instruction sets should handle incomplete types like unsigned<XLEN>.
One idea was to introduce an attribute that can be applied to ISA level constants to specify a default value used during analysis.
Any suggestions what we should name this attribute?
Another issue that came up: sizeof can complicate this further, since expressions could depend on the size of other constants, which in turn can depend on the value of other expressions, possibly creating circular dependencies:
unsigned<XLEN> XLEN = bitsizeof(XLEN);
I would propose to simply disallow sizeof and bitsizeof of constant definitions during constant evaluation. sizeof(int) and other types should be valid though.
The example is illegal anyway, as you reference a variable before declaring it. I think disallowing sizeof and bitsizeof is problematic, since the following would not be possible:
InstructionSet A {
    architectural_state {
        unsigned XLEN;
        register<XLEN> R;
    }
}

InstructionSet B extends A {
    architectural_state {
        unsigned ACC_LEN = 2 * sizeof(R);
        register<ACC_LEN> ACC;
    }
}

Core C provides B {
    architectural_state {
        XLEN = 20;
    }
}
Wrt. 4: To my understanding you cannot do type validation on InstructionSet because of the parameterization (as in the example above). The validation can only be done at the Core definition level. Therefore, in our generator class I use a GenerationContext which holds the Core definition to derive the values of constants and the sizes of types. This context also allows the values to be cached, so that the AST only needs to be traversed once for a particular value within a particular context.
Oh, okay. The specification doesn't mention that expressions can only contain constants declared earlier. Adding that as a rule would unfortunately not prevent cyclic dependencies, as you can see here:
int XLEN;
int<XLEN> reg;
XLEN = bitsizeof(reg);
My current implementation is basically an iterative solver. It starts off knowing none of the values, and in each iteration it calculates the new constant values that are determinable from the ones known so far. If no new values can be determined anymore, the remaining ones must have a circular dependency.
If we need sizeof, I can probably treat each constant's size as another implicit constant that can both depend on other constants and be depended on by them.
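The solver loop described above can be sketched roughly like this (the `deps`/`exprs` representation is an illustrative assumption, not the actual implementation):

```python
# Sketch of the iterative constant solver described above: in each pass,
# evaluate every constant whose dependencies are already known; constants
# that never become evaluable must lie on a dependency cycle.
# `deps` maps each constant to the names its expression reads, and `exprs`
# maps it to a callable computing its value -- both are illustrative
# stand-ins for the real AST-based implementation.
def solve_constants(deps, exprs):
    values = {}
    progress = True
    while progress:
        progress = False
        for name, needed in deps.items():
            if name not in values and needed.issubset(values):
                values[name] = exprs[name](values)
                progress = True
    unresolved = set(deps) - set(values)  # remaining constants are cyclic
    return values, unresolved
```

On example B above, neither constant ever becomes evaluable, so both end up in the unresolved set and can be reported as a cycle.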
1. Duplicate type declarations

> Are type declarations within a super type visible in the derived type?

Yes.

> If yes, how do we handle duplicate type declarations? Are they always forbidden? Do we perform a memberwise comparison to test if they are identical?

Let's keep it simple and simply forbid redeclaring user types.
2. Cyclic type dependencies

> What do you think, should the marker be on the name before or after the extends keyword?

After the extends makes more sense to me, because the extends clause is what constitutes the error.
3. Constant evaluation procedure
The procedure you propose sounds right to me. Yes, you definitely will have to iterate the evaluation until a fix point is reached (or detect cycles and error out). My mental model for the implementation parameters is that they behave like variables, syntactically. So they must be declared before use (either in the current architectural_state section, or in one of the parent ISAs); multiple declarations per ISA are prohibited; multiple assignments are ok. Assignments are evaluated in "program" order.
+1 on allowing sizeof and friends; Eyck's example unsigned ACC_LEN = 2 * sizeof(R); is a good one, I think.
Should we restrict the types that can be used on implementation parameters to unsigned<...literal...>/signed<...literal...>, to mitigate the back-and-forth between constant evaluator and type checker? We could also add a new alias param for a "big enough" integer type, as in param XLEN = 32;.
4. Attribute for default values used during analysis

Eyck:
> To my understanding you cannot do type validation on InstructionSet because of the parameterization (as in the example above). The validation can only be done at the Core definition level.
@eyck: Agreed, writing an InstructionSet means flying blind, as in C++ templates. Doesn't matter for non-interactive use. The question is, can/should we improve the editor experience by assuming sane default values, specified via an attribute? Mario means something like
unsigned<32> XLEN [[assume=32]];
As a language user I would expect an attribute to have some influence on the implementation. Therefore, to me this would be misleading. Just for type checking in the editor, I think it is safe to implicitly assume a value without writing it explicitly. This value could be set in the editor preferences but should not be part of the code.
> Agreed, writing an InstructionSet means flying blind, as in C++ templates. Doesn't matter for non-interactive use. The question is, can/should we improve the editor experience by assuming sane default values, specified via an attribute? Mario means something like [...]
I would not spend too much effort on the editor experience. Even the Eclipse C/C++ Development Tools (CDT) cannot flag problems if the template parameters are not known. We should follow the same approach: if we cannot check, we cannot flag... And eventually, once the CoreDef is given, the value can be evaluated.
BTW,
unsigned<32> XLEN [[assume=32]];
means basically the same to me as
unsigned<32> XLEN = 32;
since I can assign the value later on...
> Let's keep it simple and simply forbid redeclaring user types.
What about core definitions? Multiple inheritance means a core can see multiple distinct types of the same name.
> Yes, you definitely will have to iterate the evaluation until a fix point is reached

> Assignments are evaluated in "program" order.
These two statements directly contradict each other. Having a fixed evaluation order is impossible.
> Should we restrict the types that can be used on implementation parameters [...], to mitigate the back-and-forth between constant evaluator and type checker?
There is no point in doing that now since I already implemented the more powerful variant that allows arbitrary (compile-time evaluable) expressions (#49).
> [W]riting an InstructionSet means flying blind, as in C++ templates.
I'm not sure I understand you right... are you saying we don't want any validation for instruction sets? C++ templates don't let you "fly blind", any decent IDE will report most errors you make in a template, even without knowing the exact instantiation parameters.
When we last spoke about this, you said we should validate as much as possible without the full picture. The "default value" attributes would work like IntelliSense annotations in C++, providing example values with which the analysis would check program validity. Without default values, every expression that accesses unassigned state variables would be impossible to validate.
> As a language user I would expect an attribute to have some influence on the implementation. Therefore, to me this would be misleading.
Using attributes and annotations for analysis only is common practice across pretty much all languages I have worked with, for example [[nodiscard]] in C++, [DebuggerStepThrough] in C#, @SuppressWarnings() in Java, etc.
> unsigned<32> XLEN [[assume=32]];
> means basically the same to me as
> unsigned<32> XLEN = 32;
> since I can assign the value later on...
That is not how constant assignments work. If you assign 32 when declaring the constant, it can never be re-assigned to a different value (only re-assignments to 32 are valid), effectively making the parametrization useless.
In the statement
unsigned<32> XLEN = 32;
XLEN is not a constant, it is just an unsigned variable. But the value of XLEN is fixed in the instructions section. At least this is how we (@jopperm and I) discussed and defined it in the language. Constants in the architectural_state section are declared using const.
For each parameter/reset value in the combined architectural state of a Core, a CoreDSL 2 frontend must be able to evaluate all assignments to a unique value at compile time. In other words, there must either be a single assignment in the combined architectural state, or in case multiple assignments exist, the expressions on their right-hand sides must evaluate to the same constant value after a fix point is reached.
Example:

InstructionSet B extends A { architectural_state { XLEN = 32; } }
InstructionSet C extends A { architectural_state { XLEN = 16; } }

Core R provides A { }                                      // ERROR: missing definition
Core S provides A { architectural_state { XLEN = 17; } }   // OK
Core T provides B { }                                      // OK
Core U provides B { architectural_state { XLEN = 32; } }   // OK
Core V provides C { architectural_state { XLEN = 32; } }   // ERROR: conflicting definitions
Core W provides B, C { }                                   // ERROR: conflicting definitions
Why do I even bother implementing what the specification says if the specification is apparently completely wrong? Those are 10 hours of my life I'm not going to get back.
I'll set up an offline meeting to discuss this.
We decided to change the parameter elaboration to make the last assignment effective.
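A minimal sketch of the new rule, assuming parameter assignments are collected in elaboration order (base instruction sets first, the Core definition last); the flat list-of-pairs input is purely illustrative:

```python
# Sketch of "last assignment wins" parameter elaboration: assignments are
# visited in elaboration order (base ISAs first, the Core definition last),
# so a later assignment simply overwrites an earlier one.
# The flat (name, value) list is an illustrative stand-in for the real AST.
def elaborate_parameters(assignments):
    effective = {}
    for name, value in assignments:
        effective[name] = value  # later assignments shadow earlier ones
    return effective
```

With this rule, a Core providing an instruction set that already assigns XLEN can override that value without producing a conflict error.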
@eyck, @AtomCrafty: Please review my changes to the spec. I also added a paragraph to the rationale page to document our motivation for this change.
I would change the wording "Cyclic dependencies between parameters must be detected and reported as errors." (elaboration.md) to simply "Cyclic dependencies are not allowed." The first statement refers to implementation details...
It seems that the changes reflect what we discussed; however, the ArchStateItem grammar on this page doesn't make any sense. It allows several nonsensical statements:
a;
const x = 4;
x = 3 [[attribute]];
The first is legal, although it results in a NOP. The last one might make sense if the assignment shall be qualified.
The second one is definitely garbage. So I guess it should read
ArchStateItem ::= ('const' TypeSpecifier)? ID '=' ConstantExpression Attribute* ';'
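As a sanity check, the proposed production can be approximated with a toy recognizer (TypeSpecifier and ConstantExpression are collapsed to single word tokens here, a gross simplification for illustration only):

```python
import re

# Toy recognizer for the proposed production
#   ArchStateItem ::= ('const' TypeSpecifier)? ID '=' ConstantExpression Attribute* ';'
# TypeSpecifier and ConstantExpression are reduced to single word tokens,
# which is a deliberate oversimplification for illustration.
ARCH_STATE_ITEM = re.compile(
    r"^\s*(?:const\s+\w+\s+)?"   # optional 'const' TypeSpecifier
    r"\w+\s*=\s*\w+\s*"          # ID '=' ConstantExpression
    r"(?:\[\[\w+\]\]\s*)*;\s*$"  # Attribute* ';'
)

def is_arch_state_item(stmt):
    return ARCH_STATE_ITEM.match(stmt) is not None
```

With this production, `x = 3 [[attribute]];` stays valid, while the bare expression statement `a;` and the type-less `const x = 4;` are rejected.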
> [...] the ArchStateItem grammar on this page doesn't make any sense [...]
Ah thanks for spotting this, this has to be a relic from an early version. At some point, I introduced separate subsections for ASI declarations and assignments, but have forgotten to update the pseudo-grammar.
Fix: I made the TypeSpecifier parts in the declarations subsection non-optional.
@eyck The first is not legal, though admittedly it is syntactically valid. A program containing this statement should be rejected by the validator, because a; is not a valid expression statement. Only assignment expressions, pre/post-increment and decrement, as well as function call expressions are permitted inside an expression statement.
I don't know where you take this statement from. But as a user, and with the premise that 'CoreDSL is C-like', I would expect a; to be a valid statement.
I suppose you are both right: a; should be a legal statement in an instruction's behavior, but this form is not on the list of sanctioned statements for the architectural_state section.
Well, I stand corrected. I'm surprised C allows this, as most languages disallow expression statements without side effects. Though I'm certain 100% of occurrences will be unintentional, so we should at least emit a warning if it ever comes up.
I agree, this should yield a warning as most C/C++ compilers do.
Issue: because of the way the architectural_state section is parsed, we can't know whether a declaration or an assignment comes first in the code. So the following two snippets result in the same AST, even though one is invalid and the other isn't. I do not want to manually retrieve the corresponding CST nodes to compare line numbers, so I guess we need to change the grammar again.
architectural_state {   // valid: x is declared before the assignment
    int x;
    x = 5;
}

architectural_state {   // invalid: x is assigned before its declaration
    x = 5;
    int x;
}
I do not understand the problem. Both examples are a sequence of statements. From the point of the assignment, only statements before the assignment are considered. Even if I collect all architectural states according to the traversal order, I get a sequence where the same approach applies. So what's the issue here?
Actually, this was the way the scoping was implemented (at least in the initial version). So in the latter case the scoping would not return any candidate (since x has not been declared before the assignment), and the framework would flag the assignment as having an error.
The issue is that with the current implementation it is impossible for the scope provider to know which declarations come before a given assignment, because they are parsed into two separate lists. It is not known to the scope provider how these two lists were interleaved in the original code. You are right that this used to be solved differently, but I changed how the section was parsed in order to simplify the grammar a while back, which wouldn't have been a problem had we kept the old specification. I now have to revert said grammar change because the order of statements is relevant with the new semantics.
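For reference, once the grammar yields a single source-ordered statement list again, the check itself is straightforward (the tuple encoding below is an illustrative stand-in for real AST nodes):

```python
# Sketch of the declaration-before-assignment check on a single,
# source-ordered list of architectural_state statements. Each statement is
# modeled as a ("decl" | "assign", name) tuple -- an illustrative stand-in
# for the real AST nodes.
def check_order(statements):
    declared = set()
    errors = []
    for kind, name in statements:
        if kind == "decl":
            declared.add(name)
        elif kind == "assign" and name not in declared:
            errors.append(f"'{name}' is assigned before its declaration")
    return errors
```

The point of the grammar rollback is exactly to recover this single interleaved list, which the two separate declaration/assignment lists cannot provide.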
Just for reference, you mean the separate lists here, correct?
I agree, it seems that a rollback is necessary here. Maybe you can offer getDeclarations() and getAssignments() helpers somewhere?
Closing this, as it has been addressed recently in #59.
I'll open this as a collection thread for questions and discussions related to the elaboration phase. I have started work on the elaboration implementation and will update this as I encounter new issues.
1. Duplicate type declarations
Are type declarations within a super type visible in the derived type? If yes, how do we handle duplicate type declarations? Are they always forbidden? Do we perform a memberwise comparison to test if they are identical?
2. Cyclic type dependencies
My current implementation reports cycles in the inheritance hierarchy like this. What do you think, should the marker be on the name before or after the extends keyword?

3. Constant evaluation procedure
My current idea for the evaluation of constants is the following five step algorithm:
The basic idea is to calculate the values first and care about types later, because the constant back and forth between determining types and values is what's currently causing us so many headaches.
My main question is how step 3a should deal with dependencies between constants.
Here is a simple example (A). y needs to be evaluated first, because x depends on y:

Here is an example (B) of an impossible constellation. Neither constant can be evaluated because there is a cyclic dependency. I would propose to generate a similar error message to the one for cyclic type dependencies.
The most interesting example (C) is this: We again have a cyclic dependency, but a third assignment can break the cycle. This would be resolvable with a fixpoint solver. Do we want that?
Example C also has another issue: suppose the first two assignments were in an instruction set and the third in a core definition providing said instruction set. That would mean the instruction set produces an error, but the core compiles without issues. I'm curious about your input here.
Also, are 4c and 4d allowed? The spec isn't explicit on this.