jonatanklosko closed 2 months ago
Our torchx is also old. Those may be fixed if we update it.
One VM crash was related to a specific dot product with u32:
u = Nx.tensor([[[1]], [[2]]])
v = Nx.tensor([[[3]], [[4]]])
Nx.dot(u, [2], [0], v, [2], [0])
I changed the default libtorch version from 2.0.0 to 2.1.0, and it's fixed.
The other crashes I fixed by casting in the appropriate places.
I noticed that Nx.tensor(0xFFFFFFFF) now crashes the VM with:
libc++abi: terminating due to uncaught exception of type std::runtime_error: value cannot be converted to type int without overflow
It was already the case before with Nx.s32(0xFFFFFFFF), it's just that now it's more likely to happen by default.
Perhaps there's a way to catch it and raise an Elixir error instead, but that's not related to this PR.
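For illustration only, one possible shape for such a check on the Elixir side is a range guard that runs before the value reaches the native code. This is a minimal sketch, not what Torchx does: the module S32Guard and its functions are hypothetical names, and it assumes the overflow comes from the integer not fitting in a signed 32-bit value.

```elixir
defmodule S32Guard do
  # Hypothetical helper, not part of Nx or Torchx.

  # Returns true when the integer fits in a signed 32-bit value.
  def fits_s32?(value) when is_integer(value) do
    value in -2_147_483_648..2_147_483_647
  end

  # Returns the value unchanged, or raises an ArgumentError
  # instead of letting an out-of-range integer reach native code.
  def check_s32!(value) when is_integer(value) do
    if fits_s32?(value) do
      value
    else
      raise ArgumentError, "integer #{value} does not fit in s32"
    end
  end
end
```

With this sketch, S32Guard.check_s32!(0xFFFFFFFF) raises an ArgumentError, because 0xFFFFFFFF is 4294967295 and the largest s32 value is 2147483647.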
@polvalente feel free to merge, if you are ok with the torchx changes :)
Nx and EXLA pass, but there are Torchx segfaults that I need to debug.