Closed: hobofan closed this issue 3 years ago
Hi! Thanks for reporting. There is some weirdness going on with the TorchScript models: they seem very prone to breaking between PyTorch versions (related), and they have always produced (to me) very cryptic errors like the one you reported.
Tract has evolved quite a bit recently. It should now be possible to replace tch-rs with tract on the Rust side and PyTorch with onnxruntime on the Python side. That way we can completely get rid of the TorchScript exports. It should also come with some speedup.
If you don't mind I'll hijack this issue to track this. I hope to get it done this weekend.
Nice! Saw that you also contributed to Tract, but thought suggesting to swap out the backend would be a bit much of an ask :sweat_smile:
There's one more thing I hadn't considered earlier today: switching to tract as backend would mean committing to not supporting GPU inference in Rust. That's not ideal but for me it would not be a problem. If there is demand we could still switch to onnxruntime Rust bindings once there are reasonably good ones (still very basic at the moment from what I've seen).
Just thought it would be good to note here - let me know if you have any concerns, otherwise I'll go ahead with changing the backend to tract / onnxruntime for Rust / Python respectively. Note that Python will still support GPU inference through onnxruntime-gpu.
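For reference, here is a minimal sketch of how CPU / GPU selection could look on the Python side with onnxruntime. This is not NNSplit's actual code: `choose_providers` and the `model.onnx` path are hypothetical names used for illustration, and only the provider strings and the onnxruntime calls mentioned in the comments come from onnxruntime itself.

```python
from typing import List, Sequence

# Execution provider names as defined by onnxruntime.
CUDA = "CUDAExecutionProvider"
CPU = "CPUExecutionProvider"


def choose_providers(available: Sequence[str], use_cuda: bool = True) -> List[str]:
    """Return providers in priority order (hypothetical helper).

    The GPU provider comes first if it was requested and is available;
    the CPU provider is always appended as a fallback.
    """
    providers: List[str] = []
    if use_cuda and CUDA in available:
        providers.append(CUDA)
    providers.append(CPU)
    return providers


# Assumed usage, requires onnxruntime (or onnxruntime-gpu) to be installed;
# "model.onnx" is a placeholder path:
#   import onnxruntime as ort
#   session = ort.InferenceSession(
#       "model.onnx",
#       providers=choose_providers(ort.get_available_providers()),
#   )
```

With only the CPU package installed, the helper falls back to the CPU provider, so the same code path would work with and without onnxruntime-gpu.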
Ok, as of release 0.4.1 ONNX models are used everywhere with tract as backend for Rust and onnxruntime as backend for Python! I'm really happy about this change.
When I started working on NNSplit it was simply not possible to use the same model for Rust, Python and JavaScript, but now that tract has gotten better and there's tractjs it works quite seamlessly :)
I've done a quick, approximate benchmark too: I see around 100% speed increase on CPU and 20% on GPU from using onnxruntime in the Python bindings.
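To put those numbers in terms of runtime, here is a small sketch. It assumes "speed increase" means throughput (items processed per second), which the benchmark above does not state explicitly; `time_ratio` is a hypothetical helper, not part of NNSplit.

```python
def time_ratio(speed_increase_percent: float) -> float:
    """New runtime as a fraction of the old runtime.

    Assumes the percentage is a throughput increase, so a 100% increase
    means twice the throughput and therefore half the runtime.
    """
    return 1.0 / (1.0 + speed_increase_percent / 100.0)


print(time_ratio(100))           # CPU figure: prints 0.5
print(round(time_ratio(20), 3))  # GPU figure: prints 0.833
```

Under that assumption, the CPU path takes about half as long as before and the GPU path about 83% as long.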
The Rust library works great now!
Thanks so much for the quick turnaround! :)
Good to hear! To be honest I hadn't considered it was possible to completely switch to ONNX before this issue, but I was aware of the TorchScript problems. So this issue was a nice little push to make the switch :)
Hey there! I was trying to run the Rust example from the README, but got the following error on a cargo run:
Let me know if there is any more info you need for debugging!