fireice-uk / xmr-stak

Free Monero RandomX Miner and unified CryptoNight miner
GNU General Public License v3.0

Illegal instruction (core dumped) after migrating from xmr-stak-cpu & -nvidia to xmr-stak #1177

Closed GSI closed 6 years ago

GSI commented 6 years ago

Successfully used xmr-stak-(cpu|nvidia) for some months with 3 GTX 1050.

Now, with xmr-stak the binary fails upon start. Note that it is compiled on another system and then used on the one in question.

Find below a verbose compilation of system information and coredump.

(Read some other issues - like #1096, #504, #348 and #44 - but they seem to refer to other things. The CPU also appears as AES-capable.)

xmr-stak --config /root/xmr-stak.json

Illegal instruction (core dumped)

uname -a

Linux linbox 4.15.10-1-ARCH #1 SMP PREEMPT Thu Mar 15 12:24:34 UTC 2018 x86_64 GNU/Linux

nvidia-smi

Fri Mar  9 17:55:26 2018
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 390.42                 Driver Version: 390.42                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 1050    Off  | 00000000:01:00.0 Off |                  N/A |
| 40%   24C    P0    N/A /  75W |      0MiB /  2000MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GTX 1050    Off  | 00000000:04:00.0 Off |                  N/A |
| 40%   22C    P0    N/A /  75W |      0MiB /  2000MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   2  GeForce GTX 1050    Off  | 00000000:05:00.0 Off |                  N/A |
| 40%   21C    P0    N/A /  75W |      0MiB /  2000MiB |      2%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

cat /proc/cpuinfo

processor       : 0
vendor_id       : AuthenticAMD
cpu family      : 22
model           : 0
model name      : AMD A4-5050 APU with Radeon(TM) HD Graphics
stepping        : 1
microcode       : 0x700010f
cpu MHz         : 961.279
cache size      : 2048 KB
physical id     : 0
siblings        : 4
core id         : 0
cpu cores       : 4
apicid          : 0
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 13
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_$
sc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf pni pclmulqdq monitor ssse3 cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c lahf_lm cmp_legacy svm extapic cr8_l$
gacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt topoext perfctr_nb bpext perfctr_llc hw_pstate proc_feedback vmmcall bmi1 xsaveopt arat npt lbrv svm_lock nrip_s$
ve tsc_scale flushbyasid decodeassists pausefilter pfthreshold overflow_recov
bugs            : fxsave_leak sysret_ss_attrs null_seg spectre_v1 spectre_v2
bogomips        : 3094.20
TLB size        : 1024 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management: ts ttp tm 100mhzsteps hwpstate proc_feedback

xmr-stak --version-long

Version: xmr-stak/2.2.0/2ae7260/master/lin/nvidia-amd-cpu/aeon-monero/0

clinfo

Number of platforms                               1
  Platform Name                                   NVIDIA CUDA
  Platform Vendor                                 NVIDIA Corporation
  Platform Version                                OpenCL 1.2 CUDA 9.1.84
  Platform Profile                                FULL_PROFILE
  Platform Extensions                             cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_fp64 cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll cl_nv_copy_opts cl_nv_create_buffer
  Platform Extensions function suffix             NV

  Platform Name                                   NVIDIA CUDA
Number of devices                                 3
  Device Name                                     GeForce GTX 1050
  Device Vendor                                   NVIDIA Corporation
  Device Vendor ID                                0x10de
  Device Version                                  OpenCL 1.2 CUDA
  Driver Version                                  390.42
  Device OpenCL C Version                         OpenCL C 1.2
  Device Type                                     GPU
  Device Available                                Yes
  Device Profile                                  FULL_PROFILE
  Device Topology (NV)                            PCI-E, 01:00.0
  Max compute units                               5
  Max clock frequency                             1468MHz
  Compute Capability (NV)                         6.1
  Device Partition                                (core)
    Max number of sub-devices                     1
    Supported partition types                     None
  Max work item dimensions                        3
  Max work item sizes                             1024x1024x64
  Max work group size                             1024
  Compiler Available                              Yes
  Linker Available                                Yes
  Preferred work group size multiple              32
  Warp size (NV)                                  32

  Preferred / native vector sizes
  char                                                 1 / 1
  short                                                1 / 1
  int                                                  1 / 1
  long                                                 1 / 1
  half                                                 0 / 0        (n/a)
  float                                                1 / 1
  double                                               1 / 1        (cl_khr_fp64)
  Half-precision Floating-point support           (n/a)
  Single-precision Floating-point support         (core)
  Denormals                                     Yes
  Infinity and NANs                             Yes
  Round to nearest                              Yes
  Round to zero                                 Yes
  Round to infinity                             Yes
  IEEE754-2008 fused multiply-add               Yes
  Support is emulated in software               No
  Correctly-rounded divide and sqrt operations  Yes
  Double-precision Floating-point support         (cl_khr_fp64)
  Denormals                                     Yes
  Infinity and NANs                             Yes
  Round to nearest                              Yes
  Round to zero                                 Yes
  Round to infinity                             Yes
  IEEE754-2008 fused multiply-add               Yes
  Support is emulated in software               No
  Address bits                                    64, Little-Endian
  Global memory size                              2097479680 (1.953GiB)
  Error Correction support                        No
  Max memory allocation                           524369920 (500.1MiB)
  Unified memory for Host and Device              No
  Integrated memory (NV)                          No
  Minimum alignment for any data type             128 bytes
  Alignment of base address                       4096 bits (512 bytes)
  Global Memory cache type                        Read/Write
  Global Memory cache size                        81920 (80KiB)
  Global Memory cache line size                   128 bytes
  Image support                                   Yes
  Max number of samplers per kernel             32
  Max size for 1D images from buffer            134217728 pixels
  Max 1D or 2D image array size                 2048 images
  Max 2D image size                             16384x32768 pixels
  Max 3D image size                             16384x16384x16384 pixels
  Max number of read image args                 256
  Max number of write image args                16
  Local memory type                               Local
  Local memory size                               49152 (48KiB)
  Registers per block (NV)                        65536
  Max constant buffer size                        65536 (64KiB)
  Max number of constant args                     9
  Max size of kernel argument                     4352 (4.25KiB)
  Queue properties
  Out-of-order execution                        Yes
  Profiling                                     Yes
  Prefer user sync for interop                    No
  Profiling timer resolution                      1000ns
  Execution capabilities
  Run OpenCL kernels                            Yes
  Run native kernels                            No
  Kernel execution timeout (NV)                 No
  Concurrent copy and kernel execution (NV)       Yes
  Number of async copy engines                  2
  printf() buffer size                            1048576 (1024KiB)
  Built-in kernels
  Device Extensions                               cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_fp64 cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll cl_nv_copy_opts cl_nv_create_buffer

 ...

  NULL platform behavior
  clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...)  No platform
  clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, ...)   No platform
  clCreateContext(NULL, ...) [default]            No platform
  clCreateContext(NULL, ...) [other]              Success [NV]
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_DEFAULT)  No platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CPU)  No platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU)  No platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ACCELERATOR)  No platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CUSTOM)  No platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ALL)  No platform

coredumpctl gdb 1708

PID: 1708 (xmr-stak)
UID: 0 (root)
GID: 0 (root)
Signal: 4 (ILL)
Timestamp: Fri 2018-03-09 13:45:41 UTC (6s ago)
Command Line: xmr-stak --config /root/xmr-stak.json --cpu /root/autogenerated_xmr-stak-cpu.json --amd /root/autogenerated_xmr-stak-amd.json --nvidia /root/autogenerated_xm$
-stak-nvidia.json
Executable: /usr/bin/xmr-stak
Control Group: /user.slice/user-0.slice/session-c1.scope
Unit: session-c1.scope
Slice: user-0.slice
Session: c1
Owner UID: 0 (root)
Boot ID: c6af424506db4894b01438fd33156212
Machine ID: 5ec813ef7c064899bc8ab4f68d753a32
Hostname: pickaxe-linux
Storage: /var/lib/systemd/coredump/core.xmr-stak.0.c6af424506db4894b01438fd33156212.1708.1520603141000000.lz4
Message: Process 1708 (xmr-stak) of user 0 dumped core.

Stack trace of thread 1708:
#0  0x00005572201b7d8d _ZN5jconf18check_cpu_featuresEv (xmr-stak)
#1  0x00005572201b7e92 _ZN5jconf12parse_configEPKc (xmr-stak)
#2  0x000055722019e768 main (xmr-stak)
#3  0x00007f620f6edf4a __libc_start_main (libc.so.6)
#4  0x000055722019f93a _start (xmr-stak)

GNU gdb (GDB) 8.1

...

Reading symbols from /usr/bin/xmr-stak...(no debugging symbols found)...done.

warning: exec file is newer than core file.
[New LWP 1708]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/libthread_db.so.1".
Core was generated by `xmr-stak --config /root/xmr-stak.json --cpu /root/autogenerated_xmr-stak-cpu.js'.
Program terminated with signal SIGILL, Illegal instruction.
#0  0x00005572201b7d8d in jconf::check_cpu_features() ()

(gdb) backtrace

#0  0x00005572201b7d8d in jconf::check_cpu_features() ()
#1  0x00005572201b7e92 in jconf::parse_config(char const*) ()
#2  0x000055722019e768 in main ()

psychocrypt commented 6 years ago

Did you compile it on another system? If so, please look at the documentation: you must enable a flag to generate generic code.

GSI commented 6 years ago

That was it! cmake -DXMR-STAK_COMPILE=generic did the trick. Thanks.
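
For anyone hitting the same SIGILL: a quick way to see why a binary built on one machine can crash on another is to compare the CPU feature flags of the two hosts. Below is a minimal Python sketch, assuming you have saved the output of `cat /proc/cpuinfo` from both the build host and the target into strings or files. The function names `cpu_flags` and `missing_on_target` are made up for illustration and are not part of xmr-stak. That a non-generic build may emit instructions for every feature of the build host is also an assumption here, not something confirmed in this thread.

```python
def cpu_flags(cpuinfo_text):
    """Return the feature flags from the first 'flags' line of /proc/cpuinfo text.

    Returns an empty set if no 'flags' line is present.
    """
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # Match only the 'flags' key, not e.g. 'power management'.
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def missing_on_target(build_flags, target_flags):
    """List flags the build host has but the target lacks (sorted for stable output)."""
    return sorted(build_flags - target_flags)
```

Calling `missing_on_target(cpu_flags(build_text), cpu_flags(target_text))` lists the features only the build host has. If that list is non-empty (for example, the build host reports `avx2` and the target does not), a build tuned to the build host is a plausible cause of the illegal instruction, and rebuilding with `cmake -DXMR-STAK_COMPILE=generic` as above is the thing to try.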