Open jeffry1829 opened 1 month ago
What did you compare them to, the CPU version? How large is the input tensor? To find the cause, maybe the NVIDIA profiler can help.
GPU Det and Directsum are ridiculously slow
Det uses cuSOLVER ?getrf (LU factorization)
Currently not sure whether this only happens to these two methods
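For reference, here is a minimal CPU sketch of how a determinant can be obtained from a getrf-style LU factorization with partial pivoting: the product of the diagonal of U, with the sign flipped once per row swap. This is only an illustration in plain Python. The name `lu_det` is hypothetical, and it is an assumption that Det follows this scheme after calling ?getrf. It could serve as a small correctness baseline when comparing against the GPU result.

```python
def lu_det(a):
    """Determinant via LU factorization with partial pivoting (getrf-style).

    `a` is a square matrix given as a list of rows. Hypothetical helper,
    not part of any library.
    """
    n = len(a)
    m = [[float(x) for x in row] for row in a]  # work on a copy
    sign = 1.0
    for k in range(n):
        # Choose the row with the largest absolute value in column k as pivot.
        p = max(range(k, n), key=lambda i: abs(m[i][k]))
        if m[p][k] == 0.0:
            return 0.0  # whole column is zero: matrix is singular
        if p != k:
            m[k], m[p] = m[p], m[k]
            sign = -sign  # each row swap flips the sign of the determinant
        # Eliminate column k below the pivot (only the trailing block is updated).
        for i in range(k + 1, n):
            f = m[i][k] / m[k][k]
            for j in range(k + 1, n):
                m[i][j] -= f * m[k][j]
    det = sign
    for k in range(n):
        det *= m[k][k]  # product of the diagonal of U
    return det
```

The factorization costs O(n^3) while the final product is only O(n), so for small inputs the GPU time is likely to be dominated by launch and transfer overhead rather than arithmetic.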
Are you benchmarking against the CPU version, or the old MAGMA version?
I believe this issue was caused by our DGX II having been hacked. We should rerun the benchmark on some other machines.
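When rerunning, a small timing harness along these lines might help keep the CPU and GPU numbers comparable. It is a sketch only: `bench` and its `sync` parameter are hypothetical names, and `sync` stands for whatever device-synchronization call the GPU backend provides. CUDA kernel launches are asynchronous, so without a synchronization before stopping the timer the measured time may not include the actual GPU work.

```python
import statistics
import time


def bench(fn, repeats=5, sync=None):
    """Return the median wall-clock time of `fn()` over `repeats` runs.

    `sync` is an optional callable (hypothetical hook) that blocks until
    all pending device work has finished; pass None for CPU-only code.
    """
    times = []
    for _ in range(repeats):
        if sync is not None:
            sync()  # make sure earlier device work is not counted
        t0 = time.perf_counter()
        fn()
        if sync is not None:
            sync()  # wait for the work launched by fn() to finish
        times.append(time.perf_counter() - t0)
    return statistics.median(times)
```

Reporting the median over several runs, together with the input tensor size, should make it easier to tell a one-off warm-up cost from a consistently slow kernel.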