Wasm-DSL / spectec

Wasm SpecTec specification tools
https://wasm-dsl.github.io/spectec/
Other
25 stars 9 forks source link

Using IL subtyping in IL to AL translation #108

Closed ShinWonho closed 1 month ago

ShinWonho commented 1 month ago

Hello, nowadays I'm trying to reduce the hardcoded cases in IL to AL translation. There are some cases that translate VarE by looking at its name.

let rec translate_rhs exp =
  match exp.it with
  ...
  | Il.VarE id | Il.SubE ({ it = VarE id; _ }, _, _)
    when String.starts_with ~prefix:"instr" id.it ->
      [ executeI (translate_exp exp) ~at:exp.at ]

I'd like to use Il.Eval.sub_typ function like below code,

let rec translate_rhs exp =
  match exp.it with
  ...
  | _ when Il.Eval.sub_typ env exp (VarT ("instr" $ no_region, []) $ no_region) ->
      [ executeI (translate_exp exp) ~at:exp.at ]

but it seems that env information is lost after the validation. Are there any ways to retrieve the environment?

rossberg commented 1 month ago

The environment depends on the current scope, so cannot be reconstructed without context.

Storing the local environment in AST nodes would not only be very costly but also highly problematic: then all middlend phases would have to modify all these environments when inserting/removing variables or moving expressions around.

The normal way of dealing with such context information is that a type driven translator maintains a suitable environment itself.

Can you give an example of a rule where this information is needed? Looking at translate_rhs, I see no cases for when these side conditions do not hold, other than causing the translation to fail. What would go wrong if they were simply removed?

ShinWonho commented 1 month ago

It works well now, but many translation cases are not general enough. I'll give a more concrete example.

There is a translation rule that results in execute instruction when it is just an iterator with option.

let rec translate_rhs exp =
  match exp.it with
  ...
  | Il.IterE (_, (Opt, _)) ->
    ...
    executeI (...)

This case was for the rhs s'; f; instr_E* instr_D* (CALL x)? in instantiation, but it is not general enough. If there is an rhs s; f; ref?, resulting AL would be an execute instruction rather than a push instruction.

An IL expression whose type is ref, vec, or other value type should be translated into push instructions, while others should be execute instructions. So, what I want to do is to generalize the translation logic by using subtype. If an exp is subtype of val, translate it to push. Else if it is subtype of instr, translate it to execute.

let rec translate_rhs exp =
  match exp.it with
  ...
  | _ when Il.Eval.sub_typ env exp (VarT ("val" $ no_region, []) $ no_region) ->
    ...
    pushI expr
  | _ when Il.Eval.sub_typ env exp (VarT ("instr" $ no_region, []) $ no_region) ->
    ...
    executeI expr

If storing environment is not a good idea, maybe we can hardcode subtype rules in translation.

rossberg commented 1 month ago

I see. But note that val is itself a subtype of instr, so I think the subtyping test on instr as in the second condition is not meaningful — in fact it should always be true, since all RHSs are instruction sequences.

But I understand that you need to recognise values specially. And I assume this cannot be done syntactically, since there may be cases where the value is the result of a function call, not a syntactic constructor you can recognise locally.

I'm afraid the only clean way to implement this is by biting the bullet and maintaining a typing environment in the translator. This is the Right Thing To Do, since it is a type-driven translation. To be honest, it is quite surprising that we did get as far without that. I expect that you will need that for other things as well.

FWIW, to be robust, the translator should also avoid hardcoding type names like "val" and "instr". Rather, there should be hints for telling the backend what types define values and instructions, and possibly other relevant types like store, frames that it needs to recognise.

ShinWonho commented 1 month ago

Here, RHS could be config (with mutating the state), context such as frame/label, or other things which might require more than just push or execute instruction, so the second condition is needed.

Anyway, I understand the situation. I'll work on building environment in the translator and using hints for values and instructions.

Thanks for the comments!