We found that the backward weight (wrw) convolution kernels produce incorrect results when profiling is enabled for the CK invoker run() functions, which causes the CK-based solvers to fail in MIOpen.
We observed the following: when CK profiling is enabled, the CK wrw kernel introduces errors that show up as a precision issue. When CK profiling is disabled, the result is correct, but the reported time is 0.
This leads to a situation where we can get either correct profiling information or a correct result from CK, but not both.
I have proposed a workaround, PR2770 in MIOpen, to address this issue. It would be nice if CK could provide a fix so that we can get both the profiling information and the correct result at the same time.
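To make the trade-off concrete, here is a minimal, self-contained sketch of one way a caller could get both a time and a correct result: run once with timing on a scratch buffer, then run once without timing on the real output. This is not necessarily what PR2770 does, and nothing here is the real CK or MIOpen API. `MockWrwInvoker`, `RunWithProfiling`, the repeat count, and the placeholder time are all hypothetical. The mock also assumes that timed mode launches the kernel several times and that the wrw kernel accumulates into its output, which is a guess at the cause and not something confirmed above.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a CK invoker; NOT the real CK API.
// Assumption: with timing on, the kernel is launched several times, and the
// wrw kernel accumulates into the output instead of overwriting it.
struct MockWrwInvoker {
    int timed_repeats = 10; // assumed number of launches in timed mode

    // Returns a time in ms; 0 when timing is off, as described in the report.
    float Run(std::vector<float>& dw, bool time_kernel) const {
        const int launches = time_kernel ? timed_repeats : 1;
        for (int i = 0; i < launches; ++i)
            for (float& v : dw) v += 1.0f;   // each launch adds the gradient again
        return time_kernel ? 0.5f : 0.0f;    // placeholder, not a real measurement
    }
};

// Two-pass approach: measure on a scratch buffer, then compute the result once.
float RunWithProfiling(const MockWrwInvoker& inv, std::vector<float>& dw) {
    std::vector<float> scratch(dw.size(), 0.0f);
    const float ms = inv.Run(scratch, /*time_kernel=*/true); // output discarded
    for (float& v : dw) v = 0.0f;                            // start from a clean output
    inv.Run(dw, /*time_kernel=*/false);                      // single launch for the result
    return ms;
}
```

The cost of this approach is an extra launch and a scratch allocation the size of the weight gradient, which is why a fix inside CK would still be preferable.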