Open secure-sw-dev-bot opened 2 years ago
Comment from @abeln:
@dtarditi what would the definition of a type variable look like with de Bruijn indices?
e.g. right now
_For_any(T) T *id(T *x) { return x; }
becomes
typedef TypeVariablType(0, 0) T
T *id(T* x) { return x; }
and all uses of T
can refer to it as TypeVariableType(0, 0)
.
However, if we switch to de Bruijn indices, then we can't say typedef TypeVariableType(0, 0) T
, because the index of T
can potentially change across multiple references:
e.g.
_For_any(T) T *id(T *x) {
_For_any(U) U *id2(T *y) { return y; } // the type of `y` should be `TypeVariableType(1, 0)`
id2<T>(x); // the type argument to id2 should be `TypeVariableType(0, 0)`
}
So how would we desugar _For_any(T)
here with a typedef?
Any ideas?
Comment from @abeln:
It also seems like de Bruijn indices violate the unicity of pointers that clang asumes for types. This is because the same type variable 'T' potentially needs to be referred to with different de Bruijn indices (depending on the number of bindings between the definition and use of T). This means that we'll end up with two pointers to the same underlying type, which violates unicity.
Comment from @abeln:
For existential types, one solution is to continue using levels for type variables as we do right now, but also introduce a new kind of type variable ExistentialTypeVariableType (abbreviated below as ETVT) that is used only for existentials.
And ETVT has only one parameter: depth (an int like a normal type variable), because existential variables only appear inside existential types, which can only bind one variable at a time.
e.g.
_Exists(T, IntSet<T>) => Exists(IntSet<ETVT(0)>)
this solves our problem because we won't confuse an existential variable with a regular variable:
_For_any(U) void foo() {
struct Pair<U, U> p = ...
pack struct _Exists(T, struct Pair<T, U>) absPair = p;
}
The type of 'absPair' above is Exists(Pair<ETVT<0>, TV<0, 0>>)
, so both kinds of variables are kept separate.
Whenever we wrap an existential with a new existential, we have to rename references to the existential variables:
(within foo)
pack struct _Exists(T, _Exists(U, struct Pair<T, U>)) absPair2 = absPair;
we'd represent the type of absPair2
as Exists(Exists(struct Pair<ETVT<1>, ETVT(0)>))
. Notice how the first argument went from ETVT<0>
to ETVT<1>
.
What about unpack
?
unpack (T, IntSet) ePair = absPair;
The type of absPair
was Exists(Pair<ETVT<0>, TV<0>>)
. The type of ePair
is Pair<TV<0, 0>, TV<1, 0>>
. So an unpack operation replaces all instances of ETVT<0>
by the user-introduced regular TV T
.
Comment from @abeln:
Spoke with @dtarditi about this today and we agreed that moving to de Brown indices might introduce more problems than it helps resolve, so we'll stay with the current approach of tracking depth.
Further, I'll focus on the existential types and we'll revisit this bug later if it becomes hi-pri.
This issue was copied from https://github.com/microsoft/checkedc-clang/issues/584
When comparing type variables, we use their lexical depths within _For_any quantifiers and their argument position for comparisons. For example, given
T has a lexical depth of 1 and S has a lexical depth of 2. This means T and S should compare differently.
In some cases, we aren't currently maintaining lexical depths properly. First, lexical depth counts from 0, not from 1. More importantly, if a generic function itself uses generic functions, we'll get the lexical depth wrong. This could lead to typechecking errors. Consider:
(we would need a comparison function argument too, which I'm omitting too keep the example simple).
In this case, U has a lexical depth of 2. However, we're not adjusting lexical depth for the type of sort wen it is used as the right-hand of the assignment. The use of sort will have a type where S has a lexical depth of 1, causing a type mismatch.
In general, when we
we may need to adjust the generic type so that the quantified type variable has the proper lexical depth. The depth calculation needs to take into account depth within types and expressions.
@prabhuk, this issue covers the problem I mentioned earlier today.
Note that clang's function scope depth is different from the definition of depth that we are proposing for type variables. It's not true lexical depth and there's no adjustment to properly track lexical depth. It actually only tracks depth within function parameter lists.
By the way, for people interested in deBruijn indices, with deBruijn indices we wouldn't have to renumber the quantified type variables at variable uses. However, we'd have multiple representations of a type variable, which could make the IR harder to understand.
With deBruijn indices, we count from the use of a type variable up to the scope that binds it. In this modified version of the first example the use of S would have an index of 0 and the use of _Ptr would have an index of 1: