Open nortex opened 6 years ago
None of the output cited here seems to report any problems. Is there no information in the .log? The false positives on /error/ paths are to be expected.
This recipe builds fine on my cluster. Have you ensured that there's enough disk space and memory to build it this wide, and what kind of OS/hardware are you on?
@zao My OS is CentOS 6.7. CPU info:
processor : 19
vendor_id : GenuineIntel
cpu family : 6
model : 62
model name : Intel(R) Xeon(R) CPU E5-2660 v2 @ 2.20GHz
stepping : 4
microcode : 1064
cpu MHz : 2199.988
cache size : 25600 KB
physical id : 1
siblings : 10
core id : 12
cpu cores : 10
apicid : 56
initial apicid : 56
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm ida arat epb xsaveopt pln pts dts tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms
bogomips : 4399.36
clflush size : 64
cache_alignment : 64
address sizes : 46 bits physical, 48 bits virtual
I tried it with different EB versions (3.3, 3.6.1 and 3.6.2) and different --optarch flags, but the result is still the same...
I set up a CentOS 6.7 VM, painstakingly found a Python 2.7 good enough to install EasyBuild, and the recipe builds fine. Very odd.
Can you somehow upload the full log somewhere, and maybe look in dmesg or /var/log/messages for what process might've terminated surprisingly?
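If it helps, here is a minimal sketch of how one might filter those logs for the usual signs of a process being killed or crashing. The pattern list is my own assumption and certainly not exhaustive, and the helper names (`find_suspicious_lines`, `scan_file`) are made up for illustration.

```python
import io
import re

# Substrings that commonly show up in kernel messages when a process dies
# unexpectedly. This list is an assumption and is not exhaustive.
SUSPICIOUS = re.compile(
    r"out of memory|oom-killer|killed process|segfault", re.IGNORECASE
)


def find_suspicious_lines(lines):
    """Return every line that matches one of the patterns, without newline."""
    return [line.rstrip("\n") for line in lines if SUSPICIOUS.search(line)]


def scan_file(path):
    """Scan one log file; return [] if it cannot be read (e.g. no permission)."""
    try:
        # io.open accepts errors= on both Python 2.7 and Python 3.
        with io.open(path, errors="replace") as handle:
            return find_suspicious_lines(handle)
    except (IOError, OSError):
        return []
```

Something like `scan_file("/var/log/messages")` would cover the syslog file, and the output of dmesg can be saved to a file and scanned the same way. An empty result only means none of these particular patterns matched, not that nothing went wrong.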
processor : 5
vendor_id : GenuineIntel
cpu family : 6
model : 158
model name : Intel(R) Core(TM) i7-8700K CPU @ 3.70GHz
stepping : 10
cpu MHz : 3696.000
cache size : 12288 KB
physical id : 0
siblings : 6
core id : 5
cpu cores : 6
apicid : 5
initial apicid : 5
fpu : yes
fpu_exception : yes
cpuid level : 22
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx rdtscp lm constant_tsc rep_good xtopology nonstop_tsc unfair_spinlock pni pclmulqdq ssse3 cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx rdrand hypervisor lahf_lm abm 3dnowprefetch fsgsbase avx2 invpcid rdseed
bogomips : 7392.00
clflush size : 64
cache_alignment : 64
address sizes : 39 bits physical, 48 bits virtual
power management:
I didn't find anything in /var/log/messages, here is something from dmesg:
CPU0: Intel(R) Xeon(R) CPU E5-2670 v3 @ 2.30GHz stepping 02
Performance Events: PEBS fmt2+, 16-deep LBR, Haswell events, full-width counters, Broken BIOS detected, complain to your hardware vendor.
[Firmware Bug]: the BIOS has corrupted hw-PMU resources (MSR 38d is 330)
Intel PMU driver.
... version: 3
... bit width: 48
... generic registers: 8
... value mask: 0000ffffffffffff
... max period: 0000ffffffffffff
... fixed-purpose events: 3
... event mask: 00000007000000ff
NMI watchdog enabled, takes one hw-pmu counter.
Booting Node 0, Processors #1
WARNING: polling idle and HT enabled, performance may degrade.
#2
Attached is the full debug log with the crash. If you have any ideas why this could happen, please let me know, as it is really strange.
Seems like I don't know how to operate a CentOS machine, mine claims to be 6.10 after I got Python installed. 🙄
In any case, at the point where your log is cut off, it invokes a long build of the rest of the libraries.
The only things that come to mind right now would be to build a Boost outside of EasyBuild to see if the b2 terminates or crashes in some way, or to see if tuning down the parallelism of the build might affect things.
I assume that disk space is plenty in the build dir and temp dirs? Boost can be quite hungry.
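For the disk space question, a rough sketch like the following could confirm how much room is left on the relevant filesystems. It assumes a Unix system (it relies on `os.statvfs`); the function names and the idea of a minimum threshold are mine, and you would pass in whatever build and temp directories your EasyBuild configuration actually uses.

```python
import os


def free_gib(path):
    """Free space in GiB available to a non-root user on path's filesystem."""
    stats = os.statvfs(path)
    return stats.f_bavail * stats.f_frsize / float(1024 ** 3)


def report(paths, minimum_gib):
    """Map each path to (free GiB, True if it meets the chosen minimum)."""
    result = {}
    for path in paths:
        free = free_gib(path)
        result[path] = (free, free >= minimum_gib)
    return result
```

Calling `report([build_dir, tmp_dir], 20.0)` with your own directories would flag any location with less than 20 GiB free. That threshold is an arbitrary placeholder, not a known requirement for Boost.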
@nortex Did you ever figure out anything more about this?
Hi all,
Every Boost version and toolchain that I try to compile using EasyBuild fails during the "build" stage: I get a broken pipe message that terminates my session. Running with --debug gives the following error:
Any ideas what is causing the segfault?
Thanks.