pika-org / pika

pika is a C++ tasking library built on std::execution with fibers, CUDA, HIP, and MPI support.
https://pikacpp.org
Boost Software License 1.0

ctest or build failed on different architecture of Fedora #918

Open topazus opened 11 months ago

topazus commented 11 months ago

Expected Behavior

build and test successfully

Actual Behavior

Steps to Reproduce the Problem

Specifications


msimberg commented 11 months ago

Thanks for these reports as well! I will say up front that supporting ppc and s390x is not a particularly high priority for us, so if you're very interested in getting those to work, you may have to help us out a bit. That said, these look manageable.

topazus commented 11 months ago
  • Are you able to get the output of lstopo on the test systems?

The builds above were done with the Fedora Koji build system. I can run the lstopo command while the package is being built.

  • For the ppc failure, we may be misdetecting 128-bit atomics support. It looks like we detect that linking with -latomic isn't required, but it may be needed after all. Are you able to test if adding -latomic to CMAKE_CXX_FLAGS/CXXFLAGS works around the problem? If it does, we need to figure out why the feature detection doesn't fail without -latomic.

I think you mean s390x, which is where the -latomic failures appeared? I will test adding -latomic to CMAKE_CXX_FLAGS.

msimberg commented 11 months ago

I think you mean s390x, which is where the -latomic failures appeared? I will test adding -latomic to CMAKE_CXX_FLAGS.

Yep, indeed. Thanks!

topazus commented 11 months ago

Here are the results of lstopo on x86_64, but the results may vary from one build to the next.

+ lstopo
Machine (126GB total)
  Package L#0
    NUMANode L#0 (P#0 63GB)
    L3 L#0 (30MB)
      L2 L#0 (256KB) + L1d L#0 (32KB) + L1i L#0 (32KB) + Core L#0
        PU L#0 (P#0)
        PU L#1 (P#24)
      L2 L#1 (256KB) + L1d L#1 (32KB) + L1i L#1 (32KB) + Core L#1
        PU L#2 (P#2)
        PU L#3 (P#26)
      L2 L#2 (256KB) + L1d L#2 (32KB) + L1i L#2 (32KB) + Core L#2
        PU L#4 (P#4)
        PU L#5 (P#28)
      L2 L#3 (256KB) + L1d L#3 (32KB) + L1i L#3 (32KB) + Core L#3
        PU L#6 (P#6)
        PU L#7 (P#30)
      L2 L#4 (256KB) + L1d L#4 (32KB) + L1i L#4 (32KB) + Core L#4
        PU L#8 (P#8)
        PU L#9 (P#32)
      L2 L#5 (256KB) + L1d L#5 (32KB) + L1i L#5 (32KB) + Core L#5
        PU L#10 (P#10)
        PU L#11 (P#34)
      L2 L#6 (256KB) + L1d L#6 (32KB) + L1i L#6 (32KB) + Core L#6
        PU L#12 (P#12)
        PU L#13 (P#36)
      L2 L#7 (256KB) + L1d L#7 (32KB) + L1i L#7 (32KB) + Core L#7
        PU L#14 (P#14)
        PU L#15 (P#38)
      L2 L#8 (256KB) + L1d L#8 (32KB) + L1i L#8 (32KB) + Core L#8
        PU L#16 (P#16)
        PU L#17 (P#40)
      L2 L#9 (256KB) + L1d L#9 (32KB) + L1i L#9 (32KB) + Core L#9
        PU L#18 (P#18)
        PU L#19 (P#42)
      L2 L#10 (256KB) + L1d L#10 (32KB) + L1i L#10 (32KB) + Core L#10
        PU L#20 (P#20)
        PU L#21 (P#44)
      L2 L#11 (256KB) + L1d L#11 (32KB) + L1i L#11 (32KB) + Core L#11
        PU L#22 (P#22)
        PU L#23 (P#46)
    HostBridge
      PCIBridge
        2 x { PCI 01:00.0-1 (Ethernet) }
      PCI 00:11.4 (SATA)
        Block "sdb"
        Block "sda"
      PCIBridge
        PCIBridge
          PCIBridge
            PCIBridge
              PCI 0a:00.0 (VGA)
      PCI 00:1f.2 (SATA)
  Package L#1
    NUMANode L#1 (P#1 63GB)
    L3 L#1 (30MB)
      L2 L#12 (256KB) + L1d L#12 (32KB) + L1i L#12 (32KB) + Core L#12
        PU L#24 (P#1)
        PU L#25 (P#25)
      L2 L#13 (256KB) + L1d L#13 (32KB) + L1i L#13 (32KB) + Core L#13
        PU L#26 (P#3)
        PU L#27 (P#27)
      L2 L#14 (256KB) + L1d L#14 (32KB) + L1i L#14 (32KB) + Core L#14
        PU L#28 (P#5)
        PU L#29 (P#29)
      L2 L#15 (256KB) + L1d L#15 (32KB) + L1i L#15 (32KB) + Core L#15
        PU L#30 (P#7)
        PU L#31 (P#31)
      L2 L#16 (256KB) + L1d L#16 (32KB) + L1i L#16 (32KB) + Core L#16
        PU L#32 (P#9)
        PU L#33 (P#33)
      L2 L#17 (256KB) + L1d L#17 (32KB) + L1i L#17 (32KB) + Core L#17
        PU L#34 (P#11)
        PU L#35 (P#35)
      L2 L#18 (256KB) + L1d L#18 (32KB) + L1i L#18 (32KB) + Core L#18
        PU L#36 (P#13)
        PU L#37 (P#37)
      L2 L#19 (256KB) + L1d L#19 (32KB) + L1i L#19 (32KB) + Core L#19
        PU L#38 (P#15)
        PU L#39 (P#39)
      L2 L#20 (256KB) + L1d L#20 (32KB) + L1i L#20 (32KB) + Core L#20
        PU L#40 (P#17)
        PU L#41 (P#41)
      L2 L#21 (256KB) + L1d L#21 (32KB) + L1i L#21 (32KB) + Core L#21
        PU L#42 (P#19)
        PU L#43 (P#43)
      L2 L#22 (256KB) + L1d L#22 (32KB) + L1i L#22 (32KB) + Core L#22
        PU L#44 (P#21)
        PU L#45 (P#45)
      L2 L#23 (256KB) + L1d L#23 (32KB) + L1i L#23 (32KB) + Core L#23
        PU L#46 (P#23)
        PU L#47 (P#47)

Build log: https://kojipkgs.fedoraproject.org//work/tasks/3892/111293892/build.log

With -latomic added to CMAKE_CXX_FLAGS on s390x, see the build log: https://kojipkgs.fedoraproject.org//work/tasks/3895/111293895/build.log

All builds on the different architectures: https://koji.fedoraproject.org/koji/taskinfo?taskID=111293831

msimberg commented 11 months ago

Thanks for the logs.

Regarding the process_mask_flag test failure, I've attempted to relax the regex used for checking the test output in https://github.com/pika-org/pika/pull/922. The output of the test already looks correct; the check is just overly strict.
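
To illustrate what "overly strict" can mean for an output check, here is a minimal sketch in C++. The actual pattern and output format of pika's test are not shown in this thread, so the `mask: 0x...` line format and both function names are invented for illustration. The strict pattern only accepts a fixed number of hex digits, while the relaxed one accepts any width, as a machine with more processing units would need.

```cpp
#include <regex>
#include <string>

// Strict check: accepts exactly 8 hex digits after the prefix.
// The "mask: 0x..." format is hypothetical, not pika's real test output.
bool matches_strict(std::string const& line)
{
    static std::regex const pattern{R"(^mask: 0x[0-9a-f]{8}$)"};
    return std::regex_match(line, pattern);
}

// Relaxed check: accepts any non-empty run of hex digits after the prefix,
// so masks wider (or narrower) than 8 digits still pass.
bool matches_relaxed(std::string const& line)
{
    static std::regex const pattern{R"(^mask: 0x[0-9a-f]+$)"};
    return std::regex_match(line, pattern);
}
```

With these hypothetical patterns, a 12-digit mask such as `mask: 0xffffffffffff` (48 bits, one per PU on a 48-PU machine) is rejected by the strict check but accepted by the relaxed one, even though the output itself is well-formed.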

Regarding the -latomic flag on s390x, it's good to hear that adding it fixes the compilation. If it's not a problem to leave the -latomic flag in, you can do so. However, to help us understand why our detection fails, could you try compiling the file cmake/tests/cxx11_std_atomic_128bit.cpp (in the pika repo) with e.g. $CXX -std=c++17 -Werror cmake/tests/cxx11_std_atomic_128bit.cpp? Without -latomic, this should fail to link into an executable (but it should still compile into an object file with -c). If our check only compiled to an object file, it would wrongly pass; that shouldn't be the case, since the try_compile should be using CMAKE_TRY_COMPILE_TARGET_TYPE set to EXECUTABLE. You could also try setting that explicitly as a CMake option to see if it makes a difference (though this should already be the default behaviour).
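
For readers who want to poke at this outside of pika's build, here is a small sketch of a 128-bit atomic probe. It is not the content of cmake/tests/cxx11_std_atomic_128bit.cpp; the type `u128_pair` and both function names are hypothetical. It assumes C++17 (for `is_always_lock_free`). The idea is that operations on a 16-byte `std::atomic` may be lowered to runtime library calls (libatomic with GCC), which only shows up at link time.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical 16-byte payload; the name is illustrative, not taken from pika.
struct alignas(16) u128_pair
{
    std::uint64_t lo;
    std::uint64_t hi;
};
static_assert(sizeof(u128_pair) == 16, "payload must be 128 bits wide");

// Compile-time hint: if this is false, the compiler may route operations on
// std::atomic<T> through runtime library calls instead of inline instructions.
template <typename T>
constexpr bool always_lock_free()
{
    return std::atomic<T>::is_always_lock_free;
}

// Performs a load followed by a compare-exchange on std::atomic<T> and
// returns true if the exchange succeeded. As a template, it emits no code
// until instantiated, so any 16-byte atomic calls reach the linker only once
// atomic_roundtrip<u128_pair> is actually used.
template <typename T>
bool atomic_roundtrip(T initial, T desired)
{
    std::atomic<T> value{initial};
    T expected = value.load();
    return value.compare_exchange_strong(expected, desired);
}
```

Calling `atomic_roundtrip<u128_pair>({1, 2}, {3, 4})` from a `main` function and building once with and once without -latomic should show whether the link step depends on the library on a given platform. Building the same file with -c only produces an object file and would not reveal a missing library.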