AMD GPUs might require fine-tuning.
The DaCe-orchestrated backend touts good performance, but it has not been tested extensively outside of ML.
GT4Py has a bare-bones CUDA cross-compile, whose integration might be too light for best performance.
We need to test this once we can get the required gear.