plasma-umass / Mesh

A memory allocator that automatically reduces the memory footprint of C/C++ applications.
Apache License 2.0

Tests fail on Intel(R) Pentium(R) CPU G4600 #86

Open threadedstream opened 3 years ago

threadedstream commented 3 years ago

I presume the issue is processor-specific: I just ran the tests on my laptop, which has an AMD Ryzen 3 3200U, and, quite surprisingly (or maybe not), they passed. Here are the logs from the failing run:

INFO: Analyzed target //src:unit-tests (0 packages loaded, 0 targets configured).
INFO: Found 1 test target...
FAIL: //src:unit-tests (see /home/glasser/.cache/bazel/_bazel_glasser/90ad5b34d6bf5ba4db55b6ca0ce1ca6a/execroot/org_libmesh/bazel-out/k8-fastbuild/testlogs/src/unit-tests/test.log)
INFO: From Testing //src:unit-tests:
==================== Test output for //src:unit-tests:
Running main() from gmock_main.cc
[==========] Running 23 tests from 7 test suites.
[----------] Global test environment set-up.
[----------] 2 tests from Alignment
[ RUN      ] Alignment.NaturalAlignment
================================================================================
Target //src:unit-tests up-to-date:
  bazel-bin/src/unit-tests
INFO: Elapsed time: 0.313s, Critical Path: 0.16s
INFO: 2 processes: 2 linux-sandbox.
INFO: Build completed, 1 test FAILED, 2 total actions
//src:unit-tests                                                         FAILED in 0.1s
  /home/glasser/.cache/bazel/_bazel_glasser/90ad5b34d6bf5ba4db55b6ca0ce1ca6a/execroot/org_libmesh/bazel-out/k8-fastbuild/testlogs/src/unit-tests/test.log

INFO: Build completed, 1 test FAILED, 2 total actions
make: *** [Makefile:44: test] Error 3

The logs advised me to look into /home/glasser/.cache/bazel/_bazel_glasser/90ad5b34d6bf5ba4db55b6ca0ce1ca6a/execroot/org_libmesh/bazel-out/k8-fastbuild/testlogs/src/unit-tests/test.log.

Here is the content of that file:

exec ${PAGER:-/usr/bin/less} "$0" || exit 1
Executing tests from //src:unit-tests
-----------------------------------------------------------------------------
Running main() from gmock_main.cc
[==========] Running 23 tests from 7 test suites.
[----------] Global test environment set-up.
[----------] 2 tests from Alignment
[ RUN      ] Alignment.NaturalAlignment

Thanks in advance!

bpowers commented 3 years ago

Hi @ThreadedStream! It sounds like it might be processor-specific, but that's surprising given it's a Kaby Lake chip.

I would suggest trying to run the unit test binary directly under gdb, like:

$ gdb ./bazel-bin/src/unit-tests

and then, once gdb opens, running run --gtest_filter='Alignment.*'.

Can you also share what operating system/kernel you are using? e.g.

$ uname -a
Linux bobbylaptop 5.12.9 #222 SMP PREEMPT Fri Jun 4 16:34:28 PDT 2021 x86_64 x86_64 x86_64 GNU/Linux

thanks!

threadedstream commented 3 years ago

Thank you very much for the feedback, @bpowers. I hopped right into gdb and ran the command from your comment. It turned out that the issue was hiding in ShuffleVector's constructor, which initializes _prng. The exception pointed to address 0x000055555558fdf5.

I then dug deeper: I disassembled _ZN4mesh13ShuffleVectorC2Ev (the mangled name of the constructor, as shown in gdb) and found that the problematic instruction was vxorps %xmm0,%xmm0,%xmm0. Per the Intel docs, the instruction computes the bitwise XOR of packed single-precision (32-bit) floating-point elements in a and b and stores the results in dst. It also has a corresponding intrinsic, __m256 _mm256_xor_ps (__m256 a, __m256 b), although that is irrelevant here.

If I'm not mistaken, the instruction requires the AVX-512 extension, which, I presume, isn't supported by my processor. Could that be the issue?

P.S. Here's the output of uname -a: Linux threadedstream 5.8.0-55-generic #62~20.04.1-Ubuntu SMP Wed Jun 2 08:55:04 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux.
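A quick way to test a guess like this is to ask the CPU which vector extensions it advertises. The sketch below is not part of Mesh; it assumes an x86 target and a GCC or Clang compiler, and the function name simd_report is made up for illustration. It relies on the compiler builtin __builtin_cpu_supports. For reference, the VEX-encoded vxorps on xmm registers is an AVX instruction, so the "avx" line is the relevant one here.

```cpp
#include <string>

// Hypothetical helper (not from Mesh): reports which SIMD extensions the
// CPU running this code advertises. Assumes x86 and GCC or Clang, which
// provide the __builtin_cpu_init / __builtin_cpu_supports builtins.
std::string simd_report() {
    __builtin_cpu_init();  // populate the CPU feature data used below

    std::string out;
    out += "avx=";
    out += __builtin_cpu_supports("avx") ? "yes" : "no";
    out += " avx2=";
    out += __builtin_cpu_supports("avx2") ? "yes" : "no";
    out += " avx512f=";
    out += __builtin_cpu_supports("avx512f") ? "yes" : "no";
    return out;
}
```

Printing the result of simd_report() from a small main() built without any -mavx or -march flags shows what the machine supports; on Linux, the flags line of /proc/cpuinfo carries the same information.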

pettyalex commented 2 years ago

I realize this is a year-old issue, but I just watched https://www.youtube.com/watch?v=XRAP3lBivYM, had my mind blown, and was checking on the health of this project.

@threadedstream your Pentium G4600 does not have AVX or AVX2, and the default compile flags for Mesh look to build with AVX instructions: https://github.com/plasma-umass/Mesh/blob/master/.bazelrc#L19

You could tell Bazel to build Mesh without AVX instructions to make it more portable, and this is actually something I'd expect anyone deploying an application to the general public would need to do. Intel's cheaper SKUs, including both big-core-based chips like your Kaby Lake Pentium and Atom-based architectures like the Pentium Silver N6000, lack AVX/AVX2 support entirely, so with the current default build settings there's a whole host of recent Intel CPUs that this won't run on.
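For a binary that might be built with AVX enabled, one option is a startup check that compares the build configuration with the running CPU and fails with a readable message instead of an illegal-instruction crash. This is a minimal sketch under assumptions, not something Mesh does: the names built_with_avx, avx_mismatch, and check_this_machine are hypothetical, and it assumes x86 with GCC or Clang, which define the __AVX__ macro when AVX code generation is on.

```cpp
#include <string>

// True when this translation unit was compiled with AVX code generation
// enabled. GCC and Clang define __AVX__ under -mavx or any -march value
// that implies it.
constexpr bool built_with_avx() {
#if defined(__AVX__)
    return true;
#else
    return false;
#endif
}

// Hypothetical helper: returns an empty string when the combination is safe,
// otherwise a message explaining the mismatch.
std::string avx_mismatch(bool built, bool cpu_has) {
    if (built && !cpu_has) {
        return "binary was built with AVX but this CPU does not support it";
    }
    return "";
}

// Runs the check against the current build and the current machine.
// Assumes GCC or Clang on x86 for the __builtin_cpu_* builtins.
std::string check_this_machine() {
    __builtin_cpu_init();
    return avx_mismatch(built_with_avx(), __builtin_cpu_supports("avx") != 0);
}
```

One caveat: if every file is compiled with AVX enabled, the compiler is free to emit AVX instructions before the check runs, so a guard like this is more reliable when the file that performs it is compiled without AVX flags and only reads the build setting from elsewhere.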