Open Artoriuz opened 4 months ago
I'll look into how easy it would be to add this.
Motivation: After installing ChaiNNer and ONNX Runtime, I don't see an option to run it on AMD/Intel GPUs, which I assume is because the DirectML Execution Provider isn't available.
Description: It would be nice to have the option of using AMD/Intel GPUs with ONNX Runtime on ChaiNNer.
Alternatives: Currently, the best alternative is NCNN, but its support isn't 1:1 and some operations are missing. You can also run ONNX Runtime on the CPU, but that's much slower depending on the model.
Would this help with older AMD GPUs (mine is an RX 580) that don't have ROCm support? (Right now ROCm only supports a few newer GPUs.)
DirectML works without ROCm; it's Microsoft's solution for making ML better supported on Windows. It should work on any GPU with DX12 support.
Is this effort still ongoing? Is there any way to add it manually?
There is a PyTorch DirectML backend as well (torch-directml) that could provide PyTorch hardware acceleration on any GPU on Windows.