Open tlively opened 1 month ago
Improving the validator this way would let us re-enable the unreached-invalid.wast spec test. @kripken, were you interested in taking a look at this?
I can look into it, sure.
A problem that happens here is that ChildTyper seems to assume valid IR already. For example:
getHeapType()
will assert if the type is unreachable, and getSignature()
will assert if it is a non-signature reference type.
I'm not sure offhand what kind of API change would make the most sense here, but doing that as a first step is necessary I think.
Since ChildTyper
only checks types based on information that would be included in the binary, my expectation is that other properties, including whether we can even produce such information, would be checked by the validator before it calls into ChildTyper
. Basically wherever we have a type annotation in the binary format, the validator will have to check that the type is valid before checking the child types more generally.
It seems hard to split things out that way, I'm afraid. CallRef
would need to see that the reference child is unreachable and therefore ignore it in ChildTyper. That is, it is valid for that child to be unreachable, so we can't just put a check before calling ChildTyper here. ChildTyper itself would need to do the check or at least I can't see a better way.
I would expect that when validating a CallRef
, the validator would only call into ChildTyper
if the ref is a typed function reference. Why is it hard to split out that way?
I guess nothing after it needs to be validated if the ref is not a signature, but I'm not sure that's the case elsewhere - in general we may have one child be unreachable and need to ignore it, but not another.
Also, that would mean that the decision to call the ChildTyper must be done in visitCallRef
or such, that is, it is specific to each expression class. That means we can't do it in a more generic location which would have been a lot simpler/shorter.
Ah, as an example for the first paragraph in the last comment, see StringEncode:
If the array is unreachable we still need to reach the last line that validates the i32 type of another child.
Also, that would mean that the decision to call the ChildTyper must be done in
visitCallRef
or such, that is, it is specific to each expression class. That means we can't do it in a more generic location which would have been a lot simpler/shorter.
Yes, it would be an annoyingly large PR. An alternative might be to have ChildTyper
detect when there's not enough information to type some children and fall back to some more general constraint only for those particular children. So in the example above, we could leave array
unconstrained by still ensure that str
and start
have the correct types.
The validator uses a large amount of code to check that child expressions have the correct types. Writing this code is a very manual process, and bugs are hard to catch because they manifest as the validator being overly permissive. We can delete a large amount of validator code and make it easier to get full validator coverage by using the new
ChildTyper
utility from the validator. We should also add more comprehensive validation tests, ideally as upstream spec tests (see #6612)