Closed Sandglass-vic closed 1 year ago
Hi @Sandglass-vic
Please take a look at our reference implementation for gemm
in https://github.com/ARM-software/ComputeLibrary/blob/main/tests/validation/reference/GEMM.cpp#L80
Hope this helps
I think it has nothing to do with the implementation of reference gemm. I set M=N=K=4, alpha=0 and beta=1 using the following code:
// Configure function
sgemm.configure(&src0, &src1, &src2, &dst, 0, 1);
// Allocate all the images
src0.allocator()->allocate();
src1.allocator()->allocate();
src2.allocator()->allocate();
dst.allocator()->allocate();
for (int i = 0; i < M; ++i)
for (int j = 0; j < K; ++j)
*reinterpret_cast<float*>(
src0.buffer() +
src0.info()->offset_element_in_bytes(Coordinates(j, i))) = 0;
for (int i = 0; i < K; ++i)
for (int j = 0; j < N; ++j)
*reinterpret_cast<float*>(
src1.buffer() +
src1.info()->offset_element_in_bytes(Coordinates(j, i))) = 0;
for (int i = 0; i < M; ++i)
for (int j = 0; j < N; ++j)
*reinterpret_cast<float*>(
src2.buffer() +
src2.info()->offset_element_in_bytes(Coordinates(j, i))) = 1;
// Dummy run for CLTuner
sgemm.run();
printf("dst:\n");
dst.print(std::cout);
The output was
dst:
0 0 0 0
0 0 0 0
0 0 0 0
0 0 0 0
which was expected to be all ones. And I changed alpha to 1 and beta to zero,
// Configure function
sgemm.configure(&src0, &src1, &src2, &dst, 1, 0);
The output was
dst:
1 1 1 1
1 1 1 1
1 1 1 1
1 1 1 1
Hi @Sandglass-vic
Could you please try setting GEMMInfo
like this when you call configure
?
+ GEMMInfo gemm_info(false,false,false);
+ sgemm.configure(&src0, &src1, &src2, &dst, alpha, beta, gemm_info);
Hope this helps.
Yes, it helps. Thanks for your help! :)
Output of 'strings libarm_compute.so | grep arm_compute_version': arm_compute_version=v22.11 Build options: {'embed_kernels': '1', 'os': 'linux', 'opencl': '0', 'neon': '1', 'build_dir': 'arm64-v8.2-a-neon', 'asserts': '0', 'arch': 'arm64-v8.2-a', 'Werror': '1'} Git hash=unknown
Platform: armv8.2-a + neon + fp16
Operating System: CentOS Linux 7 (AltArch)
Problem description: I modified examples/neon_sgemm.cpp to check the correctness of alpha A B + beta C given any alpha and any beta with a naive implementation of sgemm as my reference: ``for (int j = 0; j < N; ++j) for (int i = 0; i < M; ++i) { float temp = 0.; for (int l = 0; l < K; ++l) temp += a[i + l M] b[l + j K]; c[i + j M] = alpha (c[i + j * M] + temp); }``
The configure function is
sgemm.configure(&src0, &src1, &src2, &dst, alpha, beta);
However, the result turns out to be alpha (A B + C) and changing beta doesn't influence the result. Is it a bug?