elixir-nx / nx

Multi-dimensional arrays (tensors) and numerical definitions for Elixir

Default integers to 32-bit precision #1524

Closed by jonatanklosko 2 months ago

jonatanklosko commented 2 months ago

Nx and EXLA pass, but there are Torchx segfaults that I need to debug.

josevalim commented 2 months ago

Our torchx is also old. Those may be fixed if we update it.

jonatanklosko commented 2 months ago

One VM crash was related to a specific dot product with u32:

u = Nx.tensor([[[1]], [[2]]])
v = Nx.tensor([[[3]], [[4]]])
Nx.dot(u, [2], [0], v, [2], [0])

I changed the default libtorch version from 2.0.0 to 2.1.0, and it's fixed.

The other crashes I fixed by casting in the appropriate places.

jonatanklosko commented 2 months ago

I noticed that Nx.tensor(0xFFFFFFFF) now crashes the VM with:

libc++abi: terminating due to uncaught exception of type std::runtime_error: value cannot be converted to type int without overflow

This was already the case with Nx.s32(0xFFFFFFFF); it is just more likely to happen now that 32-bit is the default. The value 0xFFFFFFFF is 4294967295, which exceeds the signed 32-bit maximum of 2147483647.

Perhaps there's a way to catch it and raise an Elixir error instead, but that's not related to this PR.
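
A minimal sketch of that idea, assuming the range check runs on the Elixir side before the value reaches native code. `S32Guard` and `check!/1` are hypothetical names used only for illustration; they are not part of Nx or Torchx, and this is not how either library handles the case today.

```elixir
defmodule S32Guard do
  # Hypothetical helper, not part of Nx or Torchx.
  # Bounds of a signed 32-bit integer.
  @min -2_147_483_648
  @max 2_147_483_647

  # Returns the value unchanged when it fits in a signed 32-bit integer.
  def check!(value) when is_integer(value) and value >= @min and value <= @max do
    value
  end

  # Raises a regular Elixir exception for any other integer,
  # so the caller gets an error instead of a VM crash.
  def check!(value) when is_integer(value) do
    raise ArgumentError,
          "integer #{value} does not fit in a signed 32-bit integer (#{@min}..#{@max})"
  end
end
```

With this helper, `S32Guard.check!(42)` returns `42`, while `S32Guard.check!(0xFFFFFFFF)` raises an `ArgumentError` that can be rescued like any other Elixir error.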

jonatanklosko commented 2 months ago

@polvalente feel free to merge, if you are ok with the torchx changes :)