This is an ongoing PR (it's not ready for review yet).
This PR contains the formalization of expression typing rules based on P4 spec and a syntactic categorization of types.
Notation
We should agree on the judgment forms:
[ ] type well-formed
[ ] type equivalence
[ ] expression typing. they return type $\tau$ since by simply having a type identifier we need to return declared types.
[ ] casting
Syntactic categorization of types
I considered multiple syntax categories for major types:
surface type ( $\rho$ ): types that are written by a programmer in a P4 program without using declarations
declared type ( $\nu$ ): types declared in a program
synthesized type ( $\sigma$ ): types generated by the type system
Keep the following points in mind:
declared type and synthesized type syntax may change as we discuss the issues of expression typing and formalize the typing of statements and declarations.
the types produced by the type system could also just be surface types or declared types.
well-formedness is only defined for types that a programmer writes, hence, there's a chance such a type wouldn't be well-formed. However, type equivalence is defined on all categories.
Furthermore, there are two other syntactic categories for argument types and function return types.
Well-formed types
[ ] P4 spec states that for type bit<w>, w must be an expression that evaluates to a compile-time known value (see Section 18.1) that is a non-negative integer. So w must be of type Int, correct? (rule bitString-T)
[ ] Furthermore, When using an expression for the size, the expression must be parenthesized. We’re not checking the parenthesization. How should we go about adding it? We can either add the case(exp) to exp syntax definition and then check it in the well-formed rules, or add it directly to the syntax definition of the surface type. Personally, I prefer the former.
Type equality/unification
Expression typing rules
The issues regarding typing of expressions are in order of which P4 spec discusses expressions.
Enums
[x] can serializable enums have any underlying types? P4 spec only talks about them with bit<w> and int<w> as underlying types? Currently, the doc doesn't discuss this, when it does, it'd be in the declaration of enums. Answer: No. According to P4 spec: "It is also possible to specify an enum with an underlying representation. These are sometimes called serializable enums, because headers are allowed to have fields with such enum types. This requires the programmer provide both the fixed-width unsigned (or signed) integer type and an associated integer value for each symbolic entry in the enumeration. The symbol typeRef in the grammar above must be one of the following types:
an unsigned integer, i.e. bit for some constant W.
a signed integer, i.e. int for some constant W.
a type name declared via typedef, where the base type of that type is either one of the types listed above, or another typedef name that meets these conditions."
[ ] under what assumptions can an enum with underlying type be casted to another? I'd assume that they're underlying types must be equivalent at least, but, what about their fields? what if the fields have been instantiated?
Ternary/conditional
[ ] t1 = t2 or t1 \equiv t2?
Unary operation
[ ] I didn't encode unary plus since it's noop.
[ ] Symbol suggestion for bitwise complement that can be presented with \texttt.
[ ] any case that we want to insert implicit cast for?
Binary operations
Bitstring access/slice:
[ ] why does Petr4 forces the context to be constant? Details can be found in issue
[ ] what are acceptable types for L and R? If enum is acceptable then why does P4 insert implicit cast in the example?
[ ] Also, P4 spec states that an enum with underlying type of bit<w> or int<w> can be accepted as L and R, what if the underlying type is a type synonym? (the case of dummmy in the following example)
Questions to answer:
Do they have to have the same type? No. It's specifically mentioned in P4 spec.
If not, how do we insert implicit cast? Do we even need to insert an implicit cast? For example, why do we have to insert cast in the example given P4 spec implicit cast section.
Reminder: a slice expression looks like exp[L:R]. E.g., is the following slice expression well-typed?
P4 spec (v1.2.3) states: Extraction of a set of contiguous bits, also known as a slice, denoted by [L:R], where L and R must be expressions that evaluate to non-negative compile-time known values, and L >= R must be true. The types of L and R (which do not need to be identical) must be one of the following:
int - an infinite-precision integer (section 7.1.6.5)
bit - a W-bit unsigned integer where W >= 0 (section 7.1.6.2)
int - a W-bit signed integer where W >= 1 (section 7.1.6.3)
a serializable enum with an underlying type that is bit or int (section 7.2.1).
Types that have an equality instance (fyi for parisa)
Right now, the formalization only has equality for expressions of types (this list might get longer as I go through P4 spec):
error
bool
bit
int
varbit
int
enum
serializable enum
header
However, Petr4 has equality for expressions of types in addition to the ones above:
string
matchkind
array
set
list
tuple
header union
struct
newtype
Implicit cast insertion
[x] The operators that don't have any implicit cast insertion do not have any example or explanation stating that they could have implicit cast insertion in P4 spec. Some of them (such as shift or bitwise operators) have illegal expressions here. Is there any other place (other than the ones already included in the rules) that must allow implicit cast insertion? yes, assignment, recursive insertion of cast, etc. I opened an issue and pr on P4 spec for this. It's still ongoing as I'm going through the cycle of type declaration, type well-formed, and operations on the expression of a specific type.
Explicit cast
[ ] should we include compiler warnings in our formulization?
For example, "int → int: converts the integer value into a sufficiently-large two's complement bit string to avoid information loss, and then truncates the result to W bits. The compiler should emit a warning on overflow."
Declaration typing
In parallel to formalizing expression typing, I'm writing declaration typing since declared types are used by expressions.
Enums
[x] is implicit cast of initializer allowed? The example in P4 spec allows it and so does Petr4. Yes.
[ ] add the check of whether an initializer fits into the bounds of its type.
This is an ongoing PR (it's not ready for review yet). This PR contains the formalization of expression typing rules based on P4 spec and a syntactic categorization of types.
Notation
We should agree on the judgment forms:
Syntactic categorization of types
I considered multiple syntax categories for major types:
Keep the following points in mind:
Furthermore, there are two other syntactic categories for argument types and function return types.
Well-formed types
bit<w>
,w
must be an expression that evaluates to a compile-time known value (see Section 18.1) that is a non-negative integer. Sow
must be of typeInt
, correct? (rule bitString-T)(exp)
toexp
syntax definition and then check it in the well-formed rules, or add it directly to the syntax definition of the surface type. Personally, I prefer the former.Type equality/unification
Expression typing rules
The issues regarding typing of expressions are in order of which P4 spec discusses expressions.
Enums
[x] can serializable enums have any underlying types? P4 spec only talks about them with
bit<w>
andint<w>
as underlying types? Currently, the doc doesn't discuss this, when it does, it'd be in the declaration of enums. Answer: No. According to P4 spec: "It is also possible to specify an enum with an underlying representation. These are sometimes called serializable enums, because headers are allowed to have fields with such enum types. This requires the programmer provide both the fixed-width unsigned (or signed) integer type and an associated integer value for each symbolic entry in the enumeration. The symbol typeRef in the grammar above must be one of the following types:an unsigned integer, i.e. bit for some constant W.
a signed integer, i.e. int for some constant W.
a type name declared via typedef, where the base type of that type is either one of the types listed above, or another typedef name that meets these conditions."
[ ] under what assumptions can an enum with underlying type be casted to another? I'd assume that they're underlying types must be equivalent at least, but, what about their fields? what if the fields have been instantiated?
Ternary/conditional
t1 = t2
ort1 \equiv t2
?Unary operation
\texttt
.Binary operations
Bitstring access/slice:
bit<w>
orint<w>
can be accepted asL
andR
, what if the underlying type is a type synonym? (the case ofdummmy
in the following example)Questions to answer:
Reminder: a slice expression looks like
exp[L:R]
. E.g., is the following slice expression well-typed?P4 spec (v1.2.3) states: Extraction of a set of contiguous bits, also known as a slice, denoted by [L:R], where L and R must be expressions that evaluate to non-negative compile-time known values, and L >= R must be true. The types of L and R (which do not need to be identical) must be one of the following:
Types that have an equality instance (fyi for parisa)
Right now, the formalization only has equality for expressions of types (this list might get longer as I go through P4 spec):
However, Petr4 has equality for expressions of types in addition to the ones above:
Implicit cast insertion
Explicit cast
Declaration typing
In parallel to formalizing expression typing, I'm writing declaration typing since declared types are used by expressions.
Enums
Both of these are explained in issue #383.