girishnadiger-gep opened this issue 2 years ago
For GPU inference, you would have to use onnxruntime-gpu. I haven't used the ONNX BART models on GPU myself, but in their current shape and form, it wouldn't directly work.
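For reference, a minimal sketch of what switching to the GPU build looks like. This assumes `onnxruntime-gpu` is installed in place of the CPU-only `onnxruntime` package, and `model.onnx` is a placeholder path to one of the exported BART parts:

```python
# pip uninstall onnxruntime && pip install onnxruntime-gpu
import onnxruntime as ort

# Request CUDA first, with CPU as an explicit fallback.
sess = ort.InferenceSession(
    "model.onnx",
    providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
)
```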
Hi @sidsharma72, I've tried that with onnxruntime-gpu and am facing the same issue. I agree that it wouldn't directly work. I'm working on this aspect and will contribute to this repo if I get something meaningful working, in an adapted version of fast-Bart.
Hi @girishnadiger-gep, do you have any updates on the GPU inference? Thanks!
Hi @siddharth-sharma7, your package is great and very easy to use, but I'm unable to figure out how to actually use CUDAExecutionProvider and run inference on the GPU. Whenever I pass providers=['CUDAExecutionProvider'], the model is still not loaded onto the GPU and inference still happens on the CPU.
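One way to diagnose this (a general ONNX Runtime sketch, not specific to fast-Bart): ORT silently falls back to CPU when the CUDA provider can't be initialized, so checking what the session actually selected usually reveals the problem. `model.onnx` is again a placeholder path:

```python
import onnxruntime as ort

# If "CUDAExecutionProvider" is missing here, the CPU-only `onnxruntime`
# package is likely installed alongside (and shadowing) onnxruntime-gpu.
print(ort.get_available_providers())

sess = ort.InferenceSession(
    "model.onnx",  # placeholder path to the exported BART part
    providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
)

# The providers the session actually initialized; a CPU-only list here
# usually means the CUDA provider failed to load (CUDA/cuDNN version
# mismatch with the installed onnxruntime-gpu build).
print(sess.get_providers())
```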