LSSTDESC / NaMaster

A unified pseudo-Cl framework
BSD 3-Clause "New" or "Revised" License

Illegal hardware instruction #41

Closed DanielLenz closed 3 years ago

DanielLenz commented 5 years ago

I've run into an issue giving me an 'illegal hardware instruction' error.

processor       : 271
vendor_id       : GenuineIntel
cpu family      : 6
model           : 87
model name      : Intel(R) Xeon Phi(TM) CPU 7250 @ 1.40GHz
stepping        : 1
microcode       : 0x1b6
cpu MHz         : 1347.117
cache size      : 1024 KB
physical id     : 0
siblings        : 272
core id         : 73
cpu cores       : 68
apicid          : 295
initial apicid  : 295
fpu             : yes
fpu_exception   : yes
cpuid level     : 13
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl est tm2 ssse3 fma cx16 xtpr pdcm sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch ring3mwait epb spec_ctrl ibpb_support fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms avx512f rdseed adx avx512pf avx512er avx512cd xsaveopt dtherm ida arat pln pts
bogomips        : 2793.83
clflush size    : 64
cache_alignment : 64
address sizes   : 46 bits physical, 48 bits virtual
power management:

I compiled the C-libraries with

./configure \
  --enable-pic --enable-shared \
  CC=icc CXX=icpc \
  CFLAGS="-xCORE-AVX2 -axCORE-AVX512,MIC-AVX512 -std=c+11 -fPIC -I$LIBSHARP_INC -I$HEALPIX_INC -I$CFITSIO_INC -I$TACC_GSL_INC -I$TACC_FFTW3_INC" \
  LDFLAGS="-L$LIBSHARP_LIB -L$HEALPIX_LIB -L$TACC_GSL_LIB -L$CFITSIO_LIB -L$TACC_FFTW3_LIB -L$TACC_GSL_LIB"

Any ideas?

damonge commented 5 years ago

@DanielLenz no idea. Have you tried @beckermr 's conda-forge package? (conda install -c conda-forge namaster)

DanielLenz commented 5 years ago

I'm using the Python distribution that's provided on the cluster instead of anaconda, and it's not an issue right now because it works perfectly fine on the SKX nodes.

Feel free to close this, we can always re-open if there are any new insights.

I've been using the conda-forge package locally on my laptop (Py3.7), and it works like a charm!

damonge commented 5 years ago

OK, great. Will leave this open in case someone else encounters the same problem.

joezuntz commented 5 years ago

In case this is still useful - illegal instruction errors happen when you compile code for one CPU architecture but then run it on another. This can happen either if you copy something from one machine to another, or if you are using a heterogeneous cluster which has different types of CPU in it.

In the latter case, lowering optimization levels can help.
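One way to check for this kind of mismatch on Linux is to compare the instruction-set flags a binary was built for against what the CPU reports in /proc/cpuinfo. The following is only a minimal sketch: the helper names (`parse_cpu_flags`, `missing_flags`) and the abbreviated sample line are made up for illustration, and which flags a given build actually needs depends on the compiler options used.

```python
def parse_cpu_flags(cpuinfo_text):
    """Return the feature flags from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def missing_flags(cpuinfo_text, required):
    """Return, sorted, the required flags that the CPU does not report."""
    return sorted(set(required) - parse_cpu_flags(cpuinfo_text))


# Hypothetical, abbreviated flags line in the style of the Xeon Phi output above.
KNL_SAMPLE = "flags           : fpu sse sse2 avx avx2 fma avx512f avx512pf avx512er avx512cd"

if __name__ == "__main__":
    # The list of required flags is an assumption chosen for illustration.
    print(missing_flags(KNL_SAMPLE, ["avx2", "avx512f", "avx512bw"]))
```

On a real machine, pass the contents of /proc/cpuinfo instead of the sample string. Any flag that comes back as missing is a candidate for the instruction that triggers the error.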

JonahDW commented 3 years ago

I'll just revive this instead of opening a new issue.

I'm experiencing the same issue when trying to run nmt.mask_apodization() in Python. I'm using the conda-forge package (Py3.7). I'm using it on my desktop, which just has a single CPU with 8 cores in it, so I am really not sure what could be causing this.

damonge commented 3 years ago

OK, thanks. Can you post the exact code you're running?

damonge commented 3 years ago

(also, it'd be good to check if this also happens if you install the code from pip, in case you've tried that)

JonahDW commented 3 years ago

The only line is the one where the error occurs:

mask = nmt.mask_apodization(mask.astype(float), aposize=1., apotype='Smooth')

Where mask is a boolean array of length 12288 (for an NSIDE=32 map). I tried playing around but the pip installation was giving me trouble.
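For anyone who wants to reproduce this without crashing their main session, the call can be run in a child interpreter and the exit status inspected. This is a rough sketch, not the exact script used here: the helpers `describe_exit` and `run_isolated` are hypothetical, and the all-ones mask is an assumption standing in for the real mask. It relies on the standard-library faulthandler module, which can dump a Python traceback on SIGILL.

```python
import signal
import subprocess
import sys


def describe_exit(returncode):
    """Translate a subprocess return code into a readable description."""
    if returncode < 0:
        # On POSIX, a negative code means the child was killed by that signal.
        try:
            return "killed by " + signal.Signals(-returncode).name
        except ValueError:
            return f"killed by unknown signal {-returncode}"
    return f"exited with status {returncode}"


def run_isolated(snippet):
    """Run a Python snippet in a child interpreter with faulthandler enabled."""
    proc = subprocess.run(
        [sys.executable, "-X", "faulthandler", "-c", snippet],
        capture_output=True,
        text=True,
    )
    return describe_exit(proc.returncode), proc.stderr


# Assumed stand-in for the failing call: an all-ones mask with 12288 pixels (NSIDE=32).
SNIPPET = """
import numpy as np
import pymaster as nmt
mask = np.ones(12 * 32**2, dtype=bool)
nmt.mask_apodization(mask.astype(float), aposize=1., apotype='Smooth')
"""

if __name__ == "__main__":
    status, stderr = run_isolated(SNIPPET)
    print(status)
    print(stderr)
```

If the child is killed by SIGILL, the traceback in stderr should show which Python-level call was active at the time, which helps narrow down whether the fault is in NaMaster itself or in one of the libraries it links against.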

beckermr commented 3 years ago

I think this issue is now solved. Care to try again?

JonahDW commented 3 years ago

Had to fight with my conda environment for a bit, but it looks to be all working! Thanks @beckermr!

beckermr commented 3 years ago

yay!

@damonge we can close this one!

damonge commented 3 years ago

Awesome, thanks a lot!