Closed alexdhill closed 1 year ago
Hi,
thanks for reporting. It seems the problem is related to the -mavx flag; at least on my Ubuntu VM, removing this flag solves the issue.
Could you try this on your machine and let me know?
I think we will need to remove this flag in the next release.
Could you also please give me the output of lscpu on your machine?
Removing -mavx worked: all the executables built and ran on our system.
lscpu returns:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 44 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 64
On-line CPU(s) list: 0-63
Vendor ID: GenuineIntel
Model name: Intel(R) Xeon(R) CPU X7560 @ 2.27GHz
CPU family: 6
Model: 46
Thread(s) per core: 2
Core(s) per socket: 8
Socket(s): 4
Stepping: 6
BogoMIPS: 4522.12
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid
aperfmperf pni dtes64 monitor ds_cpl est tm2 ssse3 cx16 xtpr pdcm dca sse4_1 sse4_2 x2apic popcnt lahf_lm pti ssbd ibrs ibpb stibp dtherm ida flush_l1d
Caches (sum of all):
L1d: 1 MiB (32 instances)
L1i: 1 MiB (32 instances)
L2: 8 MiB (32 instances)
L3: 96 MiB (4 instances)
NUMA:
NUMA node(s): 4
NUMA node0 CPU(s): 0,4,8,12,16,20,24,28,32,36,40,44,48,52,56,60
NUMA node1 CPU(s): 1,5,9,13,17,21,25,29,33,37,41,45,49,53,57,61
NUMA node2 CPU(s): 2,6,10,14,18,22,26,30,34,38,42,46,50,54,58,62
NUMA node3 CPU(s): 3,7,11,15,19,23,27,31,35,39,43,47,51,55,59,63
Vulnerabilities:
Itlb multihit: KVM: Mitigation: VMX unsupported
L1tf: Mitigation; PTE Inversion
Mds: Vulnerable: Clear CPU buffers attempted, no microcode; SMT vulnerable
Meltdown: Mitigation; PTI
Mmio stale data: Unknown: No mitigations
Retbleed: Not affected
Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl and seccomp
Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Spectre v2: Mitigation; Retpolines, IBPB conditional, IBRS_FW, STIBP conditional, RSB filling, PBRSB-eIBRS Not affected
Srbds: Not affected
Tsx async abort: Not affected
Great, thanks. We will need to test on our servers whether this flag matters for performance, and if not, we will just remove it in the next release. Thank you again.
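For anyone hitting the same error: the Flags line in the lscpu output lists sse4_2 but not avx, which is consistent with binaries compiled with -mavx failing on this CPU. Below is a minimal Python sketch for checking a saved lscpu dump for a given flag. The function name `cpu_flags` and the shortened `sample` text are illustrative assumptions, not part of NOMAD.

```python
def cpu_flags(lscpu_text: str) -> set:
    """Collect CPU feature flags from lscpu-style text.

    Also handles a Flags line that wraps onto following lines
    (continuation lines are assumed to contain no colon).
    """
    flags = set()
    in_flags = False
    for raw in lscpu_text.splitlines():
        line = raw.strip()
        if line.startswith("Flags:"):
            in_flags = True
            flags.update(line[len("Flags:"):].split())
        elif in_flags and ":" not in line:
            # Wrapped continuation of the Flags line
            flags.update(line.split())
        else:
            in_flags = False
    return flags


# Shortened, illustrative sample in the same shape as real lscpu output
sample = """Model name: Intel(R) Xeon(R) CPU X7560 @ 2.27GHz
Flags: fpu sse sse2 cpuid
aperfmperf ssse3 sse4_1 sse4_2 popcnt
Caches (sum of all):"""

flags = cpu_flags(sample)
print("avx supported:", "avx" in flags)
print("sse4_2 supported:", "sse4_2" in flags)
```

On a live machine the same function could be fed the text captured from running lscpu. If avx is missing, a build that passes -mavx to the compiler can be expected to fail with an illegal instruction.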
Context
Operating system: Ubuntu 22.04.2 LTS (GNU/Linux 5.15.0-67-generic x86_64)
Expected Behavior
Downloading the precompiled binaries or building from source should yield executables that print the usage when executed.
Current Behavior
Downloaded or built executables throw SIGILL [Illegal instruction: (core dumped)] when called. Nomad gives error 'cannot find version number for satc'.
I have downloaded the precompiled binaries and cloned the source to build nomad, and in both cases all the compiled executables (satc, satc_merge, satc_dump, etc.) throw SIGILL errors [Illegal instruction: (core dumped)].
I found that lowering the optimization level from -O3 lets most of the executables build and run, but satc_merge and sig_anch still throw errors.
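To tell a SIGILL crash apart from an ordinary non-zero exit when testing several executables, the exit status can be checked programmatically. The following is a rough sketch; the helper name `killed_by_sigill` is hypothetical, and the commented usage assumes the executables sit in the current directory.

```python
import signal


def killed_by_sigill(returncode: int) -> bool:
    """Return True if an exit status indicates the process died from SIGILL.

    Python's subprocess module reports death by signal N as -N,
    while POSIX shells report it as 128 + N.
    """
    sigill = int(signal.SIGILL)
    return returncode in (-sigill, 128 + sigill)


# Example with literal status codes (SIGILL is signal 4 on Linux)
print(killed_by_sigill(-int(signal.SIGILL)))  # status as seen via subprocess
print(killed_by_sigill(0))                    # clean exit

# Hypothetical usage against a built executable:
#   import subprocess
#   result = subprocess.run(["./satc"], capture_output=True)
#   if killed_by_sigill(result.returncode):
#       print("satc hit an illegal instruction")
```

This only classifies the exit status; it does not say which instruction was illegal. For that, a debugger or the core dump would be needed.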
Reproducing the issue
2a. Git clone NOMAD or R-NOMAD into the VM, or
2b. Download the binaries into the VM.
Enter the VM and verify that the kernel is Linux 5.15.0-67-generic x86_64 using uname -r.
Execute nomad or any of the executables.
Potential Problem/Solution
However, I recently tried running on another Ubuntu system (Ubuntu 22.04.2 LTS (GNU/Linux 5.15.0-60-generic x86_64)), and the files that cannot be run on my primary server run there with no issues. It appears that NOMAD runs correctly with kernel version 5.15.0-60, but not with version 5.15.0-67.