ROCm / MIOpen

AMD's Machine Intelligence Library
https://rocm.docs.amd.com/projects/MIOpen/en/latest/
Other
1.06k stars 221 forks source link

Re-enable or remove naive ASM convolution kernel #2406

Open atamazov opened 1 year ago

atamazov commented 1 year ago

The problem is stated in comments here: https://github.com/ROCmSoftwarePlatform/MIOpen/blob/71f159cbcf300ed88b3c28566c3d0b76d54a1eda/src/solver/conv_direct_naive_conv.cpp#L179-L194

The root or the problem is that the kernel was silently disabled (see https://github.com/ROCmSoftwarePlatform/MIOpen/pull/1544/files?diff=split&w=1#r1332182722).

After the kernel was turned off, its breakage went unnoticed.


[Attribution] @junliume @JehandadKhan

atamazov commented 1 year ago

Note that the logic around ConvDirectNaiveConvIsApplicableByKernelType() is wrong. If ASM kernels are disabled, then HIP kernel should be used.

aska-0096 commented 1 year ago

Hi @atamazov How could I reproduce the problem? If remove the .s file can solve the issue, I'll do it.

atamazov commented 1 year ago

@aska-0096 at first a decision must be made: do we want to remove the kernel or re-enable it.

Note that removal of .s file is not the same as removing the assembly kernel from the library. Removal of kernel also includes deletion of some code that is intended to handling the ASM kernel (and now dead). For example, some code from ConvDirectNaiveConvCompileOption() should be removed etc.

/cc @junliume @JehandadKhan