tlc-pack / relax

Apache License 2.0
195 stars 58 forks source link

[Design] Use of `derive_func` for `PackedFunc` type checking and tracking purity #344

Open slyubomirsky opened 1 year ago

slyubomirsky commented 1 year ago

The recent StructInfo changes have been very exciting because of the possibility of doing further analysis. Upon revisiting the draft specification in light of these changes, a couple of questions have come up:

  1. In the previous draft of the spec, I proposed tracking function purity in FuncType, since we need to know what is and is not pure to check the correctness of DataflowBlocks. Should we track purity in FuncStructInfo instead? Should it be only in FuncStructInfo or also in the type system? I think it might be reasonable to do that check only in StructInfo, but am curious for thoughts.
  2. derive_func was added to FuncStructInfo to allow for judging the StructInfo for a PackedFunc call. Not having derive_func in PackedFuncType made it difficult to write a type-checking rule for PackedFunc calls. Should we propagate derive_func to PackedFuncType as well? Perhaps the derive_func should be made a field on the ExternFunc node (we can decide what mechanism we will use to specify it in the parser).
  3. It might make sense to separate FuncStructInfo and PackedFuncStructInfo just as there is a FuncType and a PackedFuncType, as I noted in #273.
tqchen commented 1 year ago

Great thoughts for discussions.

The main reason why we had a PackedFuncType is that the original FuncType does not support opaque function. Ideally we should refactor FuncType to be able to do so, and we kept it as it is since it is also used by relay. We could bring a relax specific function type that unifies the two.

PackedFunc is the default way of cross function calls and in our VM, the invoke builtin also directly handles the case where the function object can either be a PackedFunc or a Closure(native relax Func), so from that pov having a single StructInfo and type is convenient.

Because we have rich information that can be checked already through struct info's derive_func, having another derive_func could be a source of duplication. One possibility is relying on checking in the func_info.derive_func for most of the cases. So we can catch most of the error like input type mismatch there. That does mean that from the static type pov we treat opaque function as opaque.

Agree purity could be an interesting part of info to check, the main thing to think about here is how to preserve that information across transformations(perhaps some form of annotation would be necessary).

slyubomirsky commented 1 year ago

The spec draft contains a proposal for how purity checking could work and I don't think it's that hard to do (just a field on the FuncType). The question is whether that's something we want to do. I think it's worth doing because it would allow us to check that DataflowBlocks do not contain side effects.

Yes, for the derive_func, I think we should just propagate the one from the StructInfo. There is no need for it to be different. I think we do need to propagate it because it's necessary to figure out the type of the result of calling the PackedFunc.

As for FuncType and PackedFuncType and the analogous in StructInfo, the reason I think it would be good to separate out PackedFuncStructInfo is that ordinary Relax functions work very differently from PackedFuncs. It makes it quite a bit more cumbersome to state the rules for FuncStructInfo because we have to deal with the two separate cases wherever it appears, especially due to the use of derive_func.