Open Devjiu opened 1 day ago
Hey @Devjiu thanks for the report. Indeed, when calling the extended execute function, without post-op nor conversion, the C matrix is not read (it seems that internally, it reads D and accumulate in it in that situation).
You can workaround the issue by using the minimal version of execute (without D matrix) when no post-op nor data conversion is needed.
On oneDNN side, we will clarify this behavior in documentation. We can likely add internal checks to avoid this issue for future users.
Summary
During modification of initial example for brgemm ukernel usage I removed post-ops from configuration but kept
execution
call with default value of post-ops parameters. In this configuration accumulation buffer is ignored and don't passed into output buffer. In general, issue is behavior of this execute with default attributes don't match with behavior of raw brgemm kernel's execute.Version
onednn_verbose,v1,info,oneDNN v3.7.0 (commit 2f8619b0b9bb64e75a8da8f49f70ae226f841bbc) onednn_verbose,v1,info,cpu,runtime:OpenMP,nthr:128 onednn_verbose,v1,info,cpu,isa:Intel AVX-512 with Intel DL Boost onednn_verbose,v1,info,gpu,runtime:none onednn_verbose,v1,info,graph,backend,0:dnnl_backend
Environment
oneDNN includes hardware-specific optimizations and may behave differently on depending on the compiler and build environment. Include the following information to help reproduce the issue:
CMake output log
-- CMAKE_BUILD_TYPE is unset, defaulting to Release -- The C compiler identification is GNU 11.4.0 -- The CXX compiler identification is GNU 11.4.0 -- Detecting C compiler ABI info -- Detecting C compiler ABI info - done -- Check for working C compiler: /usr/bin/cc - skipped -- Detecting C compile features -- Detecting C compile features - done -- Detecting CXX compiler ABI info -- Detecting CXX compiler ABI info - done -- Check for working CXX compiler: /usr/bin/c++ - skipped -- Detecting CXX compile features -- Detecting CXX compile features - done -- DNNL_TARGET_ARCH: X64 -- DNNL_LIBRARY_NAME: dnnl -- Performing Test CMAKE_HAVE_LIBC_PTHREAD -- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success -- Found Threads: TRUE -- Found OpenMP_C: -fopenmp (found version "4.5") -- Found OpenMP_CXX: -fopenmp (found version "4.5") -- Found OpenMP: TRUE (found version "4.5") -- Could NOT find Doxygen (missing: DOXYGEN_EXECUTABLE) -- Could NOT find Doxyrest (missing: DOXYREST_EXECUTABLE) CMake Warning (dev) at cmake/Sphinx.cmake:25 (find_package): Policy CMP0148 is not set: The FindPythonInterp and FindPythonLibs modules are removed. Run "cmake --help-policy CMP0148" for policy details. Use the cmake_policy command to set the policy and suppress this warning.
Call Stack (most recent call first): cmake/doc.cmake:28 (include) CMakeLists.txt:135 (include) This warning is for project developers. Use -Wno-dev to suppress it.
-- Found PythonInterp: .venv/bin/python (found suitable version "3.10.12", minimum required is "2.7") -- Could NOT find Sphinx (missing: SPHINX_EXECUTABLE) -- Found Git: /usr/bin/git (found version "2.34.1") -- Enabled testing coverage: CI -- Enabled workload: TRAINING -- Enabled primitives: ALL -- Enabled primitive CPU ISA: ALL -- Enabled primitive GPU ISA: ALL -- Enabled GeMM kernels ISA: ALL -- Primitive cache is enabled -- Experimental functionality for ukernels is enabled -- The ASM compiler identification is GNU -- Found assembler: /usr/bin/cc -- Graph component is enabled -- Graph compiler backend is disabled. -- Configuring done (0.8s) -- Generating done (0.4s) -- Build files have been written to: oneDNN/build
based on 2f8619b0b9bb64e75a8da8f49f70ae226f841bbc
brg_po.execute(A_ptr, B_base_ptr, A_B_po_offsets, C_ptr, scratchpad.data());
brg_po.execute(A_ptr, B_base_ptr, A_B_po_offsets, C_ptr, D_mem.get_data_handle(), scratchpad.data());