Composable Kernel: Performance Portable Programming Model for Machine Learning Tensor Operators
Add ckProfiler support for forward 3D convolutions with OUT element-wise operations. #1354
Open
andriy-ca opened 1 week ago
Added the grouped_conv_fwd_outelementop operation to ckProfiler. The operation enables performance profiling of 3D forward convolutions on tensors with non-standard floating-point data types, followed by a scaling operation.
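For readers unfamiliar with the pattern being profiled, here is a minimal CPU reference sketch of a 3D forward convolution whose accumulated result is passed through an output element-wise operation. It is not CK code: `ScaleOp`, `Dims3`, and `naive_conv3d_fwd` are hypothetical names chosen for illustration, and the sketch assumes a single batch, a single channel, stride 1, no padding, and plain `float` rather than the non-standard floating-point types this PR targets.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical output element-wise op: multiplies each accumulated value by a scale.
struct ScaleOp
{
    float scale;
    float operator()(float acc) const { return acc * scale; }
};

// Depth, height, and width of a dense 3D tensor stored in row-major order.
struct Dims3
{
    std::size_t d, h, w;
};

// Naive 3D forward convolution (single batch, single channel, stride 1, no padding).
// The output element-wise op is applied once per output element, after accumulation.
template <typename OutElementOp>
std::vector<float> naive_conv3d_fwd(const std::vector<float>& in,
                                    Dims3 in_dims,
                                    const std::vector<float>& wei,
                                    Dims3 wei_dims,
                                    OutElementOp out_op)
{
    const Dims3 out_dims{in_dims.d - wei_dims.d + 1,
                         in_dims.h - wei_dims.h + 1,
                         in_dims.w - wei_dims.w + 1};
    std::vector<float> out(out_dims.d * out_dims.h * out_dims.w);

    for(std::size_t z = 0; z < out_dims.d; ++z)
        for(std::size_t y = 0; y < out_dims.h; ++y)
            for(std::size_t x = 0; x < out_dims.w; ++x)
            {
                float acc = 0.0f;
                for(std::size_t kz = 0; kz < wei_dims.d; ++kz)
                    for(std::size_t ky = 0; ky < wei_dims.h; ++ky)
                        for(std::size_t kx = 0; kx < wei_dims.w; ++kx)
                        {
                            const std::size_t in_idx =
                                ((z + kz) * in_dims.h + (y + ky)) * in_dims.w + (x + kx);
                            const std::size_t wei_idx =
                                (kz * wei_dims.h + ky) * wei_dims.w + kx;
                            acc += in[in_idx] * wei[wei_idx];
                        }
                out[(z * out_dims.h + y) * out_dims.w + x] = out_op(acc);
            }
    return out;
}
```

Calling `naive_conv3d_fwd(in, {3, 3, 3}, wei, {2, 2, 2}, ScaleOp{0.5f})` would produce a 2x2x2 output with every element scaled by 0.5. Passing the element-wise op as a template parameter keeps the convolution loop unchanged when a different output operation is swapped in.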
At this time, the following combinations of data types and operations can be profiled:
Support for profiling the following combinations is implemented, but CK currently does not instantiate corresponding instances:
Refer to grouped_convolution_forward_convscale.hpp and grouped_convolution_forward_convinvscale.hpp for all implementations that were instantiated.
Note: This PR includes changes proposed in #1326.