RenderKit / ospray

An Open, Scalable, Portable, Ray Tracing Based Rendering Engine for High-Fidelity Visualization
http://ospray.org
Apache License 2.0

Is this CPU supported? #521

Closed nyue closed 2 years ago

nyue commented 2 years ago

While debugging #518 I was also building ospray on a laptop.

On that laptop (Dell Latitude E6410), running ospray fails with an "Illegal instruction (core dumped)" error. This is plain ospray, without any MPI involvement.

The CPU information for the laptop (via Ubuntu's lscpu command) is

"Illegal instructions error"

Architecture:                    x86_64
CPU op-mode(s):                  32-bit, 64-bit
Byte Order:                      Little Endian
Address sizes:                   36 bits physical, 48 bits virtual
CPU(s):                          4
On-line CPU(s) list:             0-3
Thread(s) per core:              2
Core(s) per socket:              2
Socket(s):                       1
NUMA node(s):                    1
Vendor ID:                       GenuineIntel
CPU family:                      6
Model:                           37
Model name:                      Intel(R) Core(TM) i5 CPU       M 520  @ 2.40GHz
Stepping:                        2
Frequency boost:                 enabled
CPU MHz:                         1197.018
CPU max MHz:                     2400.0000
CPU min MHz:                     1199.0000
BogoMIPS:                        4787.84
Virtualization:                  VT-x
L1d cache:                       64 KiB
L1i cache:                       64 KiB
L2 cache:                        512 KiB
L3 cache:                        3 MiB
NUMA node0 CPU(s):               0-3
Vulnerability Itlb multihit:     KVM: Mitigation: VMX disabled
Vulnerability L1tf:              Mitigation; PTE Inversion; VMX conditional cache flushes, SMT vulnerable
Vulnerability Mds:               Vulnerable: Clear CPU buffers attempted, no microcode; SMT vulnerable
Vulnerability Meltdown:          Mitigation; PTI
Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl and seccomp
Vulnerability Spectre v1:        Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2:        Mitigation; Retpolines, IBPB conditional, IBRS_FW, STIBP conditional, RSB filling
Vulnerability Srbds:             Not affected
Vulnerability Tsx async abort:   Not affected
Flags:                           fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm sse4_1 sse4_2 popcnt aes lahf_lm pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid dtherm ida arat flush_l1d

For reference, running ospray/ospray_studio/openmpi was successful on an HP box (small form factor).

"Working fine"

Architecture:                    x86_64
CPU op-mode(s):                  32-bit, 64-bit
Byte Order:                      Little Endian
Address sizes:                   36 bits physical, 48 bits virtual
CPU(s):                          4
On-line CPU(s) list:             0-3
Thread(s) per core:              1
Core(s) per socket:              4
Socket(s):                       1
NUMA node(s):                    1
Vendor ID:                       GenuineIntel
CPU family:                      6
Model:                           42
Model name:                      Intel(R) Core(TM) i5-2400S CPU @ 2.50GHz
Stepping:                        7
CPU MHz:                         2593.674
CPU max MHz:                     3300.0000
CPU min MHz:                     1600.0000
BogoMIPS:                        4987.83
L1d cache:                       128 KiB
L1i cache:                       128 KiB
L2 cache:                        1 MiB
L3 cache:                        6 MiB
NUMA node0 CPU(s):               0-3
Vulnerability Itlb multihit:     KVM: Mitigation: VMX unsupported
Vulnerability L1tf:              Mitigation; PTE Inversion
Vulnerability Mds:               Mitigation; Clear CPU buffers; SMT disabled
Vulnerability Meltdown:          Mitigation; PTI
Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl and seccomp
Vulnerability Spectre v1:        Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2:        Mitigation; Retpolines, IBPB conditional, IBRS_FW, STIBP disabled, RSB filling
Vulnerability Srbds:             Not affected
Vulnerability Tsx async abort:   Not affected
Flags:                           fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl smx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx lahf_lm epb pti ssbd ibrs ibpb stibp xsaveopt dtherm ida arat pln pts md_clear flush_l1d

johguenther commented 2 years ago

On x86 CPUs, OSPRay requires SSE4.1 as the minimum ISA, which the Dell laptop supports (the lscpu Flags include sse4_1). Did you enable SSE4 (or ALL) in the CMake variable OSPRAY_BUILD_ISA? Do the binary release packages run?
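
The flag check described above can be automated. The following is a minimal Python sketch, not part of OSPRay; the helper names parse_flags and missing_isa are hypothetical, and the flags string is an abbreviated excerpt of the Dell laptop's lscpu output from this report.

```python
def parse_flags(flags_line):
    """Split an lscpu-style "Flags: ..." line into a set of flag names."""
    _, sep, rest = flags_line.partition(":")
    # If there is no "Flags:" prefix, treat the whole line as the flag list.
    return set((rest if sep else flags_line).split())


def missing_isa(flags_line, required):
    """Return the required flags that the CPU does not report, sorted."""
    return sorted(set(required) - parse_flags(flags_line))


# Abbreviated excerpt of the Dell laptop's Flags line (assumption: only the
# SIMD-related entries are kept here for brevity).
dell = "Flags: fpu sse sse2 pni ssse3 sse4_1 sse4_2 popcnt aes"

print(missing_isa(dell, ["sse4_1"]))         # []
print(missing_isa(dell, ["sse4_1", "avx"]))  # ['avx']
```

An empty list means every required flag is present, so by this check the laptop meets the SSE4.1 minimum.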

nyue commented 2 years ago

The binary release packages run fine.

I am trying to build the latest to validate (on said Dell laptop) the various MPI fixes.

OSPRAY_BUILD_ISA was already set to ALL

I also rebuilt ospray with OSPRAY_BUILD_ISA set to SSE4, but I still get the illegal instruction error.

I am building on Ubuntu 20.04, gcc 9.4.0

$ env LD_LIBRARY_PATH=~/systems/openvkl/1.2.0/lib:~/systems/embree/3.13.3/lib:~/systems/tbb/2021.5.0/lib:~/systems/rkcommon/1.9.0/lib ~/systems/ospray/devel/bin/ospExamples --osp:debug
Embree Ray Tracing Kernels 3.13.3 ()
  Compiler  : GCC 9.4.0
  Build     : Release 
  Platform  : Linux (64bit)
  CPU       : Nehalem (GenuineIntel)
   Threads  : 4
   ISA      : XMM SSE SSE2 SSE3 SSSE3 SSE4.1 SSE4.2 POPCNT 
   Targets  : SSE SSE2 SSE3 SSSE3 SSE4.1 SSE4.2 
   MXCSR    : FTZ=1, DAZ=1
  Config
    Threads : default
    ISA     : XMM SSE SSE2 SSE3 SSSE3 SSE4.1 SSE4.2 POPCNT 
    Targets : SSE SSE2 SSE3 SSSE3 SSE4.1 SSE4.2  (supported)
              SSE2 SSE4.2 AVX AVX2 AVX512  (compile time enabled)
    Features: intersection_filter 
    Tasking : TBB2021.5 TBB_header_interface_12050 TBB_lib_interface_12050 

general:
  build threads      = 0
  build user threads = 0
  start_threads      = 0
  affinity           = 0
  frequency_level    = simd128
  hugepages          = enabled
  verbosity          = 2
  cache_size         = 134.218 MB
  max_spatial_split_replications = 1.2
triangles:
  accel              = default
  builder            = default
  traverser          = default
motion blur triangles:
  accel              = default
  builder            = default
  traverser          = default
quads:
  accel              = default
  builder            = default
  traverser          = default
motion blur quads:
  accel              = default
  builder            = default
  traverser          = default
line segments:
  accel              = default
  builder            = default
  traverser          = default
motion blur line segments:
  accel              = default
  builder            = default
  traverser          = default
hair:
  accel              = default
  builder            = default
  traverser          = default
motion blur hair:
  accel              = default
  builder            = default
  traverser          = default
subdivision surfaces:
  accel              = default
grids:
  accel              = default
  builder            = default
motion blur grids:
  accel              = default
  builder            = default
object_accel:
  min_leaf_size      = 1
  max_leaf_size      = 1
object_accel_mb:
  min_leaf_size      = 1
  max_leaf_size      = 1

nyue commented 2 years ago

I noticed this in the --osp:debug output

SSE2 SSE4.2 AVX AVX2 AVX512  (compile time enabled)

On the Dell laptop, the lscpu flags do not show AVX. Could that be a likely source of the problem?
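
To make the difference concrete, here is a small Python sketch that diffs the two Flags lines posted above. The set SIMD_FLAGS is my own hand-picked selection of SIMD-related flag names (pni is how Linux reports SSE3), not an official list, and a missing flag only matters if some library actually executes those instructions without runtime dispatch.

```python
# Flags lines copied from the two lscpu listings in this report.
DELL = (
    "fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 "
    "clflush dts acpi mmx fxsr sse sse2 ht tm pbe syscall nx rdtscp lm "
    "constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc "
    "cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 "
    "ssse3 cx16 xtpr pdcm sse4_1 sse4_2 popcnt aes lahf_lm pti ssbd ibrs ibpb "
    "stibp tpr_shadow vnmi flexpriority ept vpid dtherm ida arat flush_l1d"
)
HP = (
    "fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 "
    "clflush dts acpi mmx fxsr sse sse2 ht tm pbe syscall nx rdtscp lm "
    "constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc "
    "cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl smx est tm2 ssse3 "
    "cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes "
    "xsave avx lahf_lm epb pti ssbd ibrs ibpb stibp xsaveopt dtherm ida arat "
    "pln pts md_clear flush_l1d"
)

# Assumption: a hand-picked set of SIMD-related flag names to filter on.
SIMD_FLAGS = {"sse", "sse2", "pni", "ssse3", "sse4_1", "sse4_2",
              "avx", "avx2", "avx512f"}

dell, hp = set(DELL.split()), set(HP.split())

# SIMD extensions the working HP box reports but the Dell laptop does not.
simd_only_on_hp = sorted((hp - dell) & SIMD_FLAGS)
print(simd_only_on_hp)  # ['avx']
```

Among the SIMD flags, AVX is the only one the working machine has and the failing one lacks.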

johguenther commented 2 years ago

The important line is

Targets : SSE SSE2 SSE3 SSSE3 SSE4.1 SSE4.2 (supported)

which looks good. Also thank you for confirming that the binary packages work. Now I can think of two possibilities

Can you find out in which part or function the error happens (e.g. a stack trace with a debugger)?

nyue commented 2 years ago

Thanks @johguenther

Thanks for the tip about gdb+core-dump

I have found the cause: openvkl 1.2.0 enables AVX and friends by default.

I have disabled them all (for my hardware) and rebuilt everything.

ospray is running now.

Cheers