Closed Klazkin closed 3 months ago
Implement GPU-accelerated model inference using the CUDA build of ORT.
The goal
Implement the CUDA execution provider and validate its impact on inference performance.
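A minimal sketch of what this could look like with the ONNX Runtime C++ API: enable the CUDA execution provider on the session options, then time a single run to check the performance impact. The model path, input shape, and tensor names below are placeholders, not taken from this issue.

```cpp
// Sketch: enable the CUDA execution provider in ONNX Runtime (C++ API)
// and time one inference run. Requires the CUDA build of ORT.
#include <onnxruntime_cxx_api.h>
#include <chrono>
#include <iostream>
#include <vector>

int main() {
    Ort::Env env(ORT_LOGGING_LEVEL_WARNING, "cuda-inference");
    Ort::SessionOptions options;

    // Request the CUDA execution provider; ORT falls back to CPU for any
    // nodes the provider cannot handle.
    OrtCUDAProviderOptions cuda_options{};
    cuda_options.device_id = 0;
    options.AppendExecutionProvider_CUDA(cuda_options);

    // Placeholder model path (on Windows the path is a wide string).
    Ort::Session session(env, "model.onnx", options);

    // Placeholder input: a 1x3x224x224 float tensor of zeros.
    std::vector<int64_t> shape{1, 3, 224, 224};
    std::vector<float> data(1 * 3 * 224 * 224, 0.0f);
    Ort::MemoryInfo mem =
        Ort::MemoryInfo::CreateCpu(OrtArenaAllocator, OrtMemTypeDefault);
    Ort::Value input = Ort::Value::CreateTensor<float>(
        mem, data.data(), data.size(), shape.data(), shape.size());

    // Placeholder tensor names; query the session for the real ones.
    const char* input_names[]  = {"input"};
    const char* output_names[] = {"output"};

    auto start = std::chrono::steady_clock::now();
    auto outputs = session.Run(Ort::RunOptions{nullptr},
                               input_names, &input, 1, output_names, 1);
    auto elapsed = std::chrono::duration_cast<std::chrono::milliseconds>(
        std::chrono::steady_clock::now() - start);
    std::cout << "Inference took " << elapsed.count() << " ms\n";
    return 0;
}
```

Running the same program without the `AppendExecutionProvider_CUDA` call gives a CPU baseline to compare against; note the first GPU run includes CUDA initialization overhead, so timing should be taken over warmed-up runs.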
Time tracking
Time Estimate: 2 hours 0 minutes
Time spent: 1 hour 30 minutes
Resources
DirectML Example: https://shalvamist.github.io/onnxruntime/docs/execution-providers/DirectML-ExecutionProvider.html
CUDA Example: https://github.com/leimao/ONNX-Runtime-Inference/blob/main/src/inference.cpp
ORT Execution Providers: https://shalvamist.github.io/onnxruntime/docs/execution-providers/