novikov-alexander opened 1 month ago
Are there any plans to support inference on accelerators? For example, using ONNX Runtime or TensorRT to free up CPU resources.
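For context, here is a minimal sketch of the kind of offloading I have in mind, using ONNX Runtime's execution providers (the model path and input shape are placeholders, not anything from this project):

```python
import numpy as np
import onnxruntime as ort

# Placeholder model; any ONNX model would do for illustration.
# Providers are tried in order, falling back to CPU if TensorRT/CUDA
# are unavailable in the onnxruntime build.
session = ort.InferenceSession(
    "model.onnx",
    providers=[
        "TensorrtExecutionProvider",
        "CUDAExecutionProvider",
        "CPUExecutionProvider",
    ],
)

# Dummy input with a placeholder shape; a real call would pass actual data.
input_name = session.get_inputs()[0].name
dummy_input = {input_name: np.zeros((1, 3, 224, 224), dtype=np.float32)}

outputs = session.run(None, dummy_input)
```

With something like this, the heavy inference work runs on the GPU while the CPU stays free for the rest of the pipeline.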