Open jerryyin opened 3 years ago
Right now there is a shortcut to all three fwd, bwd, wrw directions,
There is no shortcut for WrW.
if the config ... is supported by Winograd, we prioritize using this algorithm.
"Winograd 3x3 is Fast" means that GPU utilization is about 200%. Our assumption is that for these cases Winograd will be faster than any other Solver. That is why we skip all other Solvers. Will it remain valid in the future? Or we can expect outstanding performance from MLIR kernels?
Most solvers support NCHW only
bool ConvolutionDescriptor::IsWinograd3x3SupportedAndFast(miopen::ConvolutionContext& ctx) const
{
...
return solver::ConvBinWinograd3x3U{}.IsApplicable(ctx);
}
IsApplicable() returns false for non-NCHW, so the shortcut does NOT affect NHWC etc.
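To make the gating described above concrete, here is a minimal sketch of how such a shortcut could narrow the solver list. It is not MIOpen's code: the `Problem` struct, the enums, the function names, and the solver name strings are all hypothetical stand-ins, and the only behavior taken from this thread is that the applicability check rejects non-NCHW layouts and that there is no shortcut for WrW.

```cpp
#include <string>
#include <vector>

// Hypothetical, simplified stand-ins; these are NOT MIOpen's real types.
enum class Direction { Fwd, Bwd, WrW };
enum class Layout { NCHW, NHWC };

struct Problem {
    Direction dir;
    Layout layout;
    int filter_h;
    int filter_w;
};

// Plays the role of the applicability check: anything that is not
// an NCHW 3x3 problem is rejected (assumed, simplified criteria).
bool WinogradIsApplicable(const Problem& p) {
    return p.layout == Layout::NCHW && p.filter_h == 3 && p.filter_w == 3;
}

// The shortcut covers fwd and bwd only; there is no shortcut for WrW.
bool ShortcutApplies(const Problem& p) {
    return p.dir != Direction::WrW && WinogradIsApplicable(p);
}

// Returns the solvers that would be evaluated. Solver names are illustrative.
std::vector<std::string> CandidateSolvers(const Problem& p) {
    if (ShortcutApplies(p)) {
        // Shortcut taken: every other solver, MLIR included, is skipped.
        return {"winograd3x3"};
    }
    std::vector<std::string> out;
    if (WinogradIsApplicable(p)) {
        out.push_back("winograd3x3");
    }
    out.push_back("direct");
    out.push_back("gemm");
    out.push_back("mlir");
    return out;
}
```

In this sketch, `CandidateSolvers(Problem{Direction::Fwd, Layout::NCHW, 3, 3})` yields only the Winograd entry, while the same problem in NHWC or in the WrW direction still sees the full list. Re-evaluating the shortcut would amount to changing what `ShortcutApplies` returns for fwd/bwd NCHW problems.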
There is no shortcut for WrW.
Thanks for clarifying.
Or we can expect outstanding performance from MLIR kernels?
Thanks for providing the detailed performance metric. I can't answer this yet, with the MLIR solvers still in an immature state. I'd be happy to give it a try. I'd say it's worth a shot for the MLIR xdlops solvers.
IsApplicable() returns false for non-NCHW, so the shortcut does NOT affect NHWC etc.
That's right. I think I misremembered; I can confirm NHWC is not affected. Now it is about the fwd/bwd path in NCHW format.
@jerryyin Please check whether this is still an issue with ROCm 6.0.2. If not, please close the ticket. Thanks!
From the comment in https://github.com/ROCmSoftwarePlatform/MIOpen/pull/905.
Right now there is a shortcut to all three fwd, bwd, wrw directions, such that if the config is 3x3 and is supported by Winograd, we prioritize using this algorithm. This is fine as long as:
The above condition won't hold once the MLIR algorithm becomes a native solution. Therefore, I am filing this ticket to track this and re-evaluate by then.