cuEquivariance is a math library providing low-level primitives and tensor operations that accelerate widely used models based on equivariant neural networks, such as DiffDock, MACE, Allegro, and NEQUIP.
Hi, I'm exploring the possibility of using cuEquivariance-Torch in a C++ environment, similar to how e3nn models can be exported via TorchScript. I have a few questions:
Can cuEquivariance modules be exported using TorchScript?
If it is technically feasible but not yet officially supported, are there any plans to support it?
I attempted to use both `torch.jit.script` and `torch.jit.trace`. The former raises errors, while the latter only produces warnings. If this is unexpected, I can attach minimal code to reproduce it. Before diving deeper into debugging, I wanted to confirm whether there are any related development plans or known limitations.
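For reference, this is the shape of the minimal repro I would attach. I'm using a plain `torch.nn` module as a stand-in here, since the exact cuEquivariance module and its constructor arguments aren't the point; in my real code the wrapper holds a `cuequivariance_torch` layer instead:

```python
import torch
import torch.nn as nn

class Wrapper(nn.Module):
    """Stand-in; in my real code this wraps a cuequivariance_torch layer."""

    def __init__(self) -> None:
        super().__init__()
        self.linear = nn.Linear(8, 8)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.linear(x)

m = Wrapper().eval()
x = torch.randn(4, 8)

# torch.jit.trace records the ops executed for this specific input;
# the warnings I see typically point at shape- or data-dependent code paths.
traced = torch.jit.trace(m, x)

# torch.jit.script compiles the Python source itself and is stricter;
# this is the path that raises errors with cuEquivariance modules for me.
scripted = torch.jit.script(m)

assert torch.allclose(traced(x), m(x))
assert torch.allclose(scripted(x), m(x))
```

With the stand-in module both paths succeed, so the failures appear specific to the cuEquivariance layers themselves.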
I'm aware that TorchScript may be deprecated in the future. However, its replacement, torch.export (https://pytorch.org/docs/stable/export.html), is still marked as unstable, so I currently have no stable alternative.
Lastly, I've noticed that using `torch.compile` with static tensor shapes can nearly double the performance of e3nn. Could cuEquivariance achieve similar speedups with `torch.compile`, or is this approach less relevant given its use of optimized custom kernels?
Thanks in advance for your guidance and support!