tlc-pack / relax

Apache License 2.0
193 stars 58 forks source link

[Discuss][VM] Wrapping int, float, and string arguments in runtime containers #214

Open slyubomirsky opened 2 years ago

slyubomirsky commented 2 years ago

The VM module directly exposes PackedFuncs that can be executed with arbitrary arguments. This means users can pass ordinary Python ints, floats, and strings to them, which can lead to strange effects sometimes. For example, we might expect ints and floats to be treated like scalars, but doing something like this will result in an error:

from __future__ import annotations
import tvm
import tvm.script
from tvm import relax
from tvm.script import relax as R

@tvm.script.ir_module
class TestTuple:
    @R.function
    def main(x: Tensor((), "int32")):
        t = (x, x)
        return x

mod = TestTuple
target = tvm.target.Target("llvm", host="llvm")
ex = relax.vm.build(mod, target)
vm = relax.VirtualMachine(ex, tvm.cpu())

ret = vm["main"](1) # fails! You cannot put ordinary integers in a tuple
ret = vm["main"](tvm.nd.array(1)) # works

Another example, even stranger, deals with strings:

from __future__ import annotations
import tvm
import tvm.script
from tvm import relax
from tvm.script import relax as R

@tvm.script.ir_module
class UseString:
    @R.function
    def main(o: Object):
        return relax.print(o, format="My string: {}")

mod = UseString
target = tvm.target.Target("llvm", host="llvm")
ex = relax.vm.build(mod, target)
vm = relax.VirtualMachine(ex, tvm.cpu())

ret = vm["main"]("abc") # prints only "My string: "
ret = vm["main"](tvm.runtime.String("abc")) # works as expected

I am not entirely sure why the string contents are disregarded in this example, but it is certainly counterintuitive and unexpected. At least one reason for this is that for PackedFuncs, TVM strings (typecode kTVMStr) and runtime string containers (typecode kTVMObjectHandle) are treated differently; the first call has the string typecode while the second has the object typecode.

In the VM, we could, in principle, detect cases like ints, floats, and strings and convert them into runtime objects. (Another thing we can do is check for unsupported types and give an immediate error/warning.) What should be Relax's policy on this? Should anything passed in be a TVM Object? What about passing in a None value? This seems like a tricky area of the spec.

slyubomirsky commented 2 years ago

Another weird inconsistency (possibly a bug?): Branch conditions create an error if given an NDArray and work only with an int constant. Note that bool scalar constants created within a program also fail for some reason.

@tvm.script.ir_module
class TestBranch:
    @R.function
    def branch(cond: Tensor((), "bool"), x: Tensor((1,), "int32")) -> Tensor((1,), "int32"):
        if cond:
            ret = x
    else:
            ret = x
        return ret

    @R.function
    def branch_const(x: Tensor((1,), "int32")) -> Tensor((1,), "int32"):
        if relax.const(True, dtype="bool"):
            ret = x
        else:
            ret = x
        return x

mod =  TestBranch
target = tvm.target.Target("llvm", host="llvm")
ex = relax.vm.build(mod, target)
vm = relax.VirtualMachine(ex, tvm.cpu())

ret = vm["branch"](1, tvm.nd.array([1])) # succeeds
ret = vm["branch"](tvm.nd.array(True), tvm.nd.array([1])) # crashes, complaining about expecting an int but receiving an NDArrayContainer
ret = vm["branch_const"](tvm.nd.array([1])) # crashes, complaining even more strangely about receiving NULL

Edit: This has since been fixed, thanks to Yuchen's update.

sunggg commented 2 years ago

Great catch! Does it only happen in the Relax world, especially cases in the first thread? This might be the issue in the Relay VM side as well and might worth checking how they handled it. Personally, I think users should pass arguments that are TVM Objects or can be implicitly converted (we can also offer nice sugars around it). Regarding None, I'm not sure if there would be a real case that users might want to pass None - this seems an error to me. Wonder how others think.

YuchenJin commented 2 years ago

Great findings!

  1. TestTuple: ret = vm["main"](1) fails to execute because TVM containers do not support POD types (such as int and string) as its elements, here the ADT::runtime::Tuple expect TVM Objects as its elements but meet integer.
  2. UseString: The VM register is able to store any TVMRetValue, which include TVMPODValue and all TVM Objects. std::string is not supported in TVMRetValue, while runtime.String is supported since it's a TVM Object, and pod types like int and float are also supported. That's why ints and floats can be passed and printed with no problem, while string has to be converted to runtime.String.
slyubomirsky commented 2 years ago

Are there any docs on TVM's object system outside of the code base? This would be a good thing to clearly specify for users of Relax, especially since we are giving such direct access to PackedFuncs, etc

slyubomirsky commented 2 years ago

Regarding None, I'm not sure if there would be a real case that users might want to pass None - this seems an error to me. Wonder how others think.

None is a bit of a weird case, but it is essentially a value in our language--it's what is returned by a function that does side effects and has no return value. We treat it as the same thing as unit (empty tuple).

YuchenJin commented 2 years ago

Are there any docs on TVM's object system outside of the code base? This would be a good thing to clearly specify for users of Relax, especially since we are giving such direct access to PackedFuncs, etc

No such docs afaik. Agree that we should clearly specify. 👍 Let's write them to the language spec.