ethereum-mining / ethminer

Ethereum miner with OpenCL, CUDA and stratum support
GNU General Public License v3.0

Crash with SIGSEGV Error #2258

Open Technologov opened 3 years ago

Technologov commented 3 years ago

i 07:30:02 ethminer Epoch : 407 Difficulty : 4.29 Gh
i 07:30:02 ethminer Job: 0f58a90b… eu1.ethermine.org [172.65.207.106:4444]
cu 07:30:03 cuda-0 Job: 95e9b901… Sol: 0x2523d3d911d6aa01
cu 07:30:05 cuda-0 Generating DAG + Light : 4.24 GB
SIGSEGV encountered ...
stack trace:
backtrace() returned 19 addresses
./ethminer() [0x422af9]
/lib/x86_64-linux-gnu/libc.so.6(+0x3f040) [0x7f7afcbef040]
/usr/lib/x86_64-linux-gnu/libcuda.so.1(+0x133a90) [0x7f7aebff7a90]
/usr/lib/x86_64-linux-gnu/libcuda.so.1(+0x1f017e) [0x7f7aec0b417e]
/usr/lib/x86_64-linux-gnu/libcuda.so.1(+0x2eeff7) [0x7f7aec1b2ff7]
/usr/lib/x86_64-linux-gnu/libcuda.so.1(+0x1f188e) [0x7f7aec0b588e]
/usr/lib/x86_64-linux-gnu/libcuda.so.1(+0xfe91e) [0x7f7aebfc291e]
/usr/lib/x86_64-linux-gnu/libcuda.so.1(+0x1002d8) [0x7f7aebfc42d8]
/usr/lib/x86_64-linux-gnu/libcuda.so.1(cuMemcpyHtoD_v2+0x65) [0x7f7aec13bf55]
./ethminer() [0x72ff9e]
./ethminer() [0x700726]
./ethminer() [0x735deb]
./ethminer() [0x6e8a48]
./ethminer() [0x46f265]
./ethminer() [0x6eb4ec]
./ethminer() [0x4baff6]
./ethminer() [0x773aaf]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x76db) [0x7f7afd55e6db]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x3f) [0x7f7afccd171f]

This happens after several days of mining on an NVIDIA RTX 20 series card. It is difficult to reproduce.

stkndrflw commented 3 years ago

Some time ago I had SIGSEGVs because I accidentally had this line

export GPU_FORCE_64BIT_PTR=0

in my ethminer startup file. Not sure if this helps; I just noticed that my SIGSEGV output was similar to yours.
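For anyone who wants to rule this out quickly, here is a minimal Python sketch that checks whether that variable is set to 0 in the environment ethminer would inherit. The variable name comes from the comment above; treating the value 0 as the problem is an assumption based on that single report, and the helper name `suspect_setting` is made up for illustration.

```python
import os

# Variable name taken from the comment above. Treating "0" as the
# problematic value is an assumption based on that one report.
SUSPECT_VAR = "GPU_FORCE_64BIT_PTR"


def suspect_setting(env):
    """Return a warning string if SUSPECT_VAR is set to 0 in env, else None."""
    value = env.get(SUSPECT_VAR)
    if value is not None and value.strip() == "0":
        return f"{SUSPECT_VAR}={value} is set; consider unsetting it before starting ethminer"
    return None


if __name__ == "__main__":
    # Check the environment of the current shell session.
    print(suspect_setting(os.environ) or f"{SUSPECT_VAR} is not set to 0")
```

Run it from the same shell (or startup script) that launches the miner, so it sees the same environment.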

mattglt commented 3 years ago

@Technologov I ran into the same issue with my 1080 Ti

/lib/x86_64-linux-gnu/libc.so.6(clone+0x43) [0x7f860dafb293]
cu 00:01:28 cuda-0 Generating DAG + Light (reusing buffers): 4.29 GB
SIGSEGV encountered ...
stack trace:
backtrace() returned 19 addresses
./ethminer/ethminer(+0x9a430) [0x55bc6cfce430]
/lib/x86_64-linux-gnu/libc.so.6(+0x46210) [0x7f860da1f210]
/usr/lib/x86_64-linux-gnu/libcuda.so.1(+0x1d9c40) [0x7f860bcc6c40]
/usr/lib/x86_64-linux-gnu/libcuda.so.1(+0x2e1943) [0x7f860bdce943]
/usr/lib/x86_64-linux-gnu/libcuda.so.1(+0x405f25) [0x7f860bef2f25]
/usr/lib/x86_64-linux-gnu/libcuda.so.1(+0x17783c) [0x7f860bc6483c]
/usr/lib/x86_64-linux-gnu/libcuda.so.1(+0x3d1215) [0x7f860bebe215]
/usr/lib/x86_64-linux-gnu/libcuda.so.1(+0x16fd47) [0x7f860bc5cd47]
/usr/lib/x86_64-linux-gnu/libcuda.so.1(cuMemcpyHtoD_v2+0x56) [0x7f860bd3ed36]
./ethminer/ethminer(+0x3982de) [0x55bc6d2cc2de]
./ethminer/ethminer(+0x379bb6) [0x55bc6d2adbb6]
./ethminer/ethminer(+0x39afe8) [0x55bc6d2cefe8]
./ethminer/ethminer(+0x35cbe0) [0x55bc6d290be0]
./ethminer/ethminer(+0xe72ae) [0x55bc6d01b2ae]
./ethminer/ethminer(+0x35f828) [0x55bc6d293828]
./ethminer/ethminer(+0x131f26) [0x55bc6d065f26]
./ethminer/ethminer(+0x44c913) [0x55bc6d380913]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x9609) [0x7f860dd3e609]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x43) [0x7f860dafb293]

I'll build debug and investigate further
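In the meantime, the raw frames in that trace already carry module-relative offsets, so they can be turned into `addr2line` lookups. Below is a rough Python sketch of that idea, assuming the `module(symbol+0xoffset) [0xaddress]` frame layout shown in the trace and a binary that still has symbols. The helper names `parse_frames` and `addr2line_commands` are hypothetical, and frames without an offset (such as `./ethminer() [0x422af9]` in the first report) are deliberately skipped.

```python
import re

# Matches frames such as
#   ./ethminer/ethminer(+0x3982de) [0x55bc6d2cc2de]
#   /usr/lib/x86_64-linux-gnu/libcuda.so.1(cuMemcpyHtoD_v2+0x56) [0x7f860bd3ed36]
# Frames with empty parentheses (no offset) do not match and are skipped.
FRAME_RE = re.compile(
    r"(?P<module>\S+)\((?P<symbol>[^+()]*)\+(?P<offset>0x[0-9a-fA-F]+)\)"
    r"\s+\[(?P<addr>0x[0-9a-fA-F]+)\]"
)


def parse_frames(text):
    """Return a list of (module, symbol_or_None, offset) tuples found in text."""
    frames = []
    for m in FRAME_RE.finditer(text):
        frames.append((m["module"], m["symbol"] or None, int(m["offset"], 16)))
    return frames


def addr2line_commands(frames, binary_suffix="ethminer"):
    """Build addr2line command lines for symbol-less frames of the miner binary."""
    cmds = []
    for module, symbol, offset in frames:
        # Only frames of the form binary(+0xoffset) are relative to the load base.
        if symbol is None and module.endswith(binary_suffix):
            cmds.append(f"addr2line -C -f -e {module} {offset:#x}")
    return cmds
```

Pasting the trace into `parse_frames` and printing the result of `addr2line_commands` gives one command per ethminer frame; the output is only meaningful if the binary on disk is the same build that crashed.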

mattglt commented 3 years ago

More useful stack trace:
#0 0x00007ffff5ef1c40 in cuEGLApiInit () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#1 0x00007ffff5ff9943 in cuGetErrorString () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#2 0x00007ffff611df25 in cudbgApiInit () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#3 0x00007ffff5e8f83c in ?? () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#4 0x00007ffff60e9215 in cudbgMain () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#5 0x00007ffff5e87d47 in ?? () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#6 0x00007ffff5f69ed4 in cuMemcpyHtoD_v2 () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#7 0x00005555558dda7e in cudart::driverHelper::memcpyDispatch(void*, void const*, unsigned long, cudaMemcpyKind, bool) ()
#8 0x00005555558bf356 in cudart::cudaApiMemcpy(void*, void const*, unsigned long, cudaMemcpyKind) ()
#9 0x00005555558e08b2 in cudaMemcpy ()
#10 0x00005555558a3000 in dev::eth::CUDAMiner::initEpoch_internal() () at /home/develop/ethminer/libethash-cuda/CUDAMiner.cpp:155
#11 0x000055555563730d in dev::eth::Miner::initEpoch() () at /home/develop/ethminer/libethcore/Miner.cpp:153
#12 0x00005555558a63b8 in dev::eth::CUDAMiner::workLoop() () at /home/develop/ethminer/libethash-cuda/CUDAMiner.cpp:217
#13 0x000055555567c4d7 in dev::Worker::startWorking()::{lambda()#1}::operator()() const [clone .isra.39] () at /home/develop/ethminer/libdevcore/Worker.cpp:57
#14 0x0000555555991e53 in execute_native_thread_routine ()
#15 0x00007ffff7f69609 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#16 0x00007ffff7d26293 in clone () from /lib/x86_64-linux-gnu/libc.so.6

potato1992 commented 3 years ago

Same here, on Ubuntu 20.04.

nerdcorenet commented 3 years ago

Same here: https://paste.ubuntu.com/p/vwF6VFxvQM/

Relevant lib versions on my system:
cuda 11.3.0-1
libc6 2.31-0ubuntu9
libpthread-stubs0-dev 0.4-1

Worked for several hours or days before this occurred. I was running a second ethminer-0.18.0 instance on the same machine but operating different GPUs, and it remained functional when the first instance segfaulted. shrugs

nerdcorenet commented 3 years ago

This issue appears to be a duplicate of #1937