ethereum-mining / ethminer

Ethereum miner with OpenCL, CUDA and stratum support
GNU General Public License v3.0

v 0.18: SIGSEGV encountered ... #1895

Open stone212 opened 5 years ago

stone212 commented 5 years ago

Describe the bug Running ethminer 0.18 against a local Parity node produces a "SIGSEGV encountered ..." error.

To Reproduce Steps to reproduce the behavior:

  1. Ubuntu 18.04 with Parity 2.3.9, mining enabled

  2. Start ethminer 0.18 with: /opt/ethminer/bin/ethminer -G -P stratum2+tcp://127.0.0.1:8008

  3. See error

# /opt/ethminer/bin/ethminer -G -P stratum2+tcp://127.0.0.1:8008

ethminer 0.18.0-alpha.3
Build: linux/release/gnu

i 21:17:03 ethminer Configured pool 127.0.0.1:8008
 i 21:17:03 ethminer Selected pool 127.0.0.1:8008
 i 21:17:03 ethminer Established connection to 127.0.0.1:8008
 i 21:17:03 ethminer Spinning up miners...
cl 21:17:03 cl-0     Using PciId : 01:00.0 Ellesmere OpenCL 1.2 AMD-APP (2671.3) Memory : 7.60 GB
 i 21:17:03 ethminer Stratum mode : ETHEREUMSTRATUM (NiceHash)
 i 21:17:03 ethminer Subscribed to stratum server
cl 21:17:03 cl-1     Using PciId : 03:00.0 Ellesmere OpenCL 1.2 AMD-APP (2671.3) Memory : 7.96 GB
 i 21:17:03 ethminer Extranonce set to 0x2462b25c050517ec46ecf2f5163eb2f8ac37e3b09984f4c81fd1608d33f0f17d (nicehash)
 i 21:17:03 ethminer Authorized worker 
 m 21:17:08 ethminer 0:00 A0 0.00 h { cl0 0.00 | cl1 0.00 }
 m 21:17:13 ethminer 0:00 A0 0.00 h { cl0 0.00 | cl1 0.00 }
 m 21:17:18 ethminer 0:00 A0 0.00 h { cl0 0.00 | cl1 0.00 }
 m 21:17:23 ethminer 0:00 A0 0.00 h { cl0 0.00 | cl1 0.00 }
 m 21:17:28 ethminer 0:00 A0 0.00 h { cl0 0.00 | cl1 0.00 }
 m 21:17:33 ethminer 0:00 A0 0.00 h { cl0 0.00 | cl1 0.00 }
 m 21:17:38 ethminer 0:00 A0 0.00 h { cl0 0.00 | cl1 0.00 }
 i 21:17:42 ethminer Epoch : -1 Difficulty : 4.29 Gh
 i 21:17:42 ethminer Job: 03b8bc47… 127.0.0.1:8008
SIGSEGV encountered ...
stack trace:
backtrace() returned 7 addresses
/opt/ethminer/bin/ethminer() [0x420e29]
/lib/x86_64-linux-gnu/libc.so.6(+0x3ef20) [0x7f7083480f20]
/opt/ethminer/bin/ethminer() [0x6902fe]
/opt/ethminer/bin/ethminer() [0x4b74b6]
/opt/ethminer/bin/ethminer() [0x72a11f]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x76db) [0x7f7083df06db]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x3f) [0x7f708356388f]

#

Expected behavior A clear and concise description of what you expected to happen.

Screenshots (Optional) If applicable, add screenshots to help explain your problem.

Environment (please complete the following information):

/opt/ethminer/bin/ethminer -G -P stratum+tcp://127.0.0.1:8008

Additional context

  1. ethminer 0.12 works with this command: ./bin/ethminer -G -SP 127.0.0.1:8001

  2. stratum1+tcp does not work with this error:

    
    # /opt/ethminer/bin/ethminer -G -P stratum1+tcp://127.0.0.1:8008

ethminer 0.18.0-alpha.3
Build: linux/release/gnu

 i 21:31:33 ethminer Configured pool 127.0.0.1:8008
 i 21:31:33 ethminer Selected pool 127.0.0.1:8008
 i 21:31:33 ethminer Could not login: code:-32601 message:Method not found
 i 21:31:33 ethminer Disconnected from 127.0.0.1:8008
 i 21:31:33 ethminer No connection. Suspend mining ...
 i 21:31:33 ethminer No more connections to try. Exiting...
 i 21:31:33 main Got interrupt ...
 i 21:31:33 ethminer Terminated!



  3. stratum2+tcp creates https://github.com/ethereum-mining/ethminer/issues/1896

AndreaLanfranchi commented 5 years ago

Stratum2+tcp (which is "EthereumStratum/1.0.0") is NiceHash mode. The spec is here: https://github.com/nicehash/Specifications/blob/master/EthereumStratum_NiceHash_v1.0.0.txt

To my knowledge, Parity does not implement it.

stone212 commented 5 years ago

@AndreaLanfranchi

Stratum2+tcp (which is "EthereumStratum/1.0.0") is Nicehash mode.

I agree, but it is still a problem to hit this error. It is an issue with ethash: either it reports that it is authorized when it is not, or it fails without a useful error message after it authorizes.
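
One detail in the log above may be worth checking, offered here as a sketch rather than a diagnosis. In EthereumStratum/1.0.0 the extranonce is the leading part of the 64-bit Ethash nonce, so it should be shorter than 16 hex digits. The NiceHash log later in this thread shows `8fa200` (6 digits), while the Parity session shows a 64-digit value. The helper name `check_extranonce` is hypothetical and is not part of ethminer; treating an empty value as suspicious is also an assumption.

```shell
# Hypothetical helper, not part of ethminer: sanity-check a pool-supplied extranonce.
# Prints "plausible" or "suspicious".
check_extranonce() {
    hex=${1#0x}    # drop an optional 0x prefix
    case $hex in
        '' | *[!0-9a-fA-F]*)
            echo suspicious    # empty or not hexadecimal (assumption: treat as bad)
            return 0
            ;;
    esac
    # An Ethash nonce is 64 bits (16 hex digits); the extranonce is only its
    # leading part, so it has to be shorter than the whole nonce.
    if [ "${#hex}" -lt 16 ]; then
        echo plausible
    else
        echo suspicious
    fi
}
```

For example, `check_extranonce 8fa200` prints `plausible`, while the 64-digit value from the Parity log prints `suspicious`. A check of this kind would let a client reject the session with a clear message instead of starting work on it.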

macperlita commented 4 years ago

Hello, I have the same error. What's a possible solution? Thanks.

AntonioDomenech commented 3 years ago

Same error here!

vinnymac commented 3 years ago

I had the same error as well when building ethminer in an Ubuntu 18.04 Docker container. I built it on Ubuntu 16.04 instead and, for whatever reason, the error went away. Not sure what others are doing, but I thought this might help anyone who would like to use ethminer but cannot.

AntonioDomenech commented 3 years ago

Thanks! I'm currently using Ubuntu 18.04. So that may help...

PoseidonCoder commented 3 years ago

I get the same error on Arch Linux. It's incredible that this still hasn't been solved.

kurtjcu commented 3 years ago

Present in Ubuntu 20.04.

jmaness commented 3 years ago

I am seeing a similar issue on Arch Linux. However, epoch is 0 instead of -1. If this is a separate issue, please let me know.

 i 17:31:23 ethminer Job: 6b305d4e… daggerhashimoto.usa-east.nicehash.com [172.65.202.202:3353]
 i 17:31:25 ethminer Extranonce set to 8fa200
 i 17:31:25 ethminer Epoch : 0 Difficulty : 420.34 Mh
 i 17:31:25 ethminer Job: 2d96b00b… daggerhashimoto.usa-east.nicehash.com [172.65.202.202:3353]
cu 17:31:25 cuda-0   Generating DAG + Light (reusing buffers): 4.21 GB
SIGSEGV encountered ...
stack trace:
backtrace() returned 19 addresses
ethminer(+0xab806) [0x561718c9d806]
/usr/lib/libc.so.6(+0x3cf80) [0x7f7a0a350f80]
/usr/lib/libcuda.so.1(+0x1d9f00) [0x7f7a085fdf00]
/usr/lib/libcuda.so.1(+0x2cb613) [0x7f7a086ef613]
/usr/lib/libcuda.so.1(+0x406555) [0x7f7a0882a555]
/usr/lib/libcuda.so.1(+0x1776ec) [0x7f7a0859b6ec]
/usr/lib/libcuda.so.1(+0x3e94d5) [0x7f7a0880d4d5]
/usr/lib/libcuda.so.1(+0x16fbf7) [0x7f7a08593bf7]
/usr/lib/libcuda.so.1(cuMemcpyHtoD_v2+0x56) [0x7f7a0864daa6]
ethminer(+0x3ac759) [0x561718f9e759]
ethminer(+0x383dd5) [0x561718f75dd5]
ethminer(+0x3c9ce9) [0x561718fbbce9]
ethminer(+0x368702) [0x561718f5a702]
ethminer(+0xfe779) [0x561718cf0779]
ethminer(+0x36b69d) [0x561718f5d69d]
ethminer(+0x152cf6) [0x561718d44cf6]
ethminer(+0x475604) [0x561719067604]
/usr/lib/libpthread.so.0(+0x9299) [0x7f7a0a649299]
/usr/lib/libc.so.6(clone+0x43) [0x7f7a0a413053]

vinnymac commented 3 years ago

Here is what worked for me using Ubuntu 16.04 but not 18.04 or higher, no idea why. Hope it helps.

FROM nvidia/cuda:9.0-devel-ubuntu16.04

LABEL maintainer="vinnymac"
LABEL description="Ethminer"

WORKDIR /root

RUN apt update \
  && apt install git -y \
  && apt install cmake libdbus-1-dev gcc mesa-common-dev perl -y

RUN git clone https://github.com/ethereum-mining/ethminer.git

WORKDIR /root/ethminer

RUN git checkout tags/v0.19.0 \
  && git submodule update --init --recursive \
  && mkdir build

WORKDIR /root/ethminer/build

RUN cmake .. \
  && cmake --build . \
  && make install

# Use SSL_NOVERIFY instead if you want to bypass certificate validation
# ENV persists into the running container; a variable set with RUN export is lost after that build step
ENV SSL_CERT_FILE=/etc/ssl/certs/ca-certificates.crt

ENTRYPOINT ["/usr/local/bin/ethminer"]
CMD ["-h"]

riobard commented 3 years ago

Occurred on Ubuntu 16.04 as well

SIGSEGV encountered ...
stack trace:
backtrace() returned 19 addresses
bin/ethminer() [0x422af9]
/lib/x86_64-linux-gnu/libc.so.6(+0x354c0) [0x7ffa85c394c0]
/usr/lib/x86_64-linux-gnu/libcuda.so.1(+0x156810) [0x7ffa8400f810]
/usr/lib/x86_64-linux-gnu/libcuda.so.1(+0x21bc5e) [0x7ffa840d4c5e]
/usr/lib/x86_64-linux-gnu/libcuda.so.1(+0x364e9f) [0x7ffa8421de9f]
/usr/lib/x86_64-linux-gnu/libcuda.so.1(+0x21c3c5) [0x7ffa840d53c5]
/usr/lib/x86_64-linux-gnu/libcuda.so.1(+0x11c872) [0x7ffa83fd5872]
/usr/lib/x86_64-linux-gnu/libcuda.so.1(+0x11dd76) [0x7ffa83fd6d76]
/usr/lib/x86_64-linux-gnu/libcuda.so.1(cuMemcpyHtoD_v2+0x73) [0x7ffa841a2993]
bin/ethminer() [0x72ff9e]
bin/ethminer() [0x700726]
bin/ethminer() [0x735deb]
bin/ethminer() [0x6e8a48]
bin/ethminer() [0x46f265]
bin/ethminer() [0x6eb4ec]
bin/ethminer() [0x4baff6]
bin/ethminer() [0x773aaf]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x76ba) [0x7ffa864f66ba]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d) [0x7ffa85d0b4dd]

akhockey21 commented 3 years ago

I'm having the same issue in Ubuntu 18.04.

mattglt commented 3 years ago

I have been using the following workaround: comment out the SIGSEGV handler so the program exits when it crashes, then run it continuously from a script.

diff --git a/ethminer/main.cpp b/ethminer/main.cpp
index f1aec3d5b..6c9e6c15d 100644
--- a/ethminer/main.cpp
+++ b/ethminer/main.cpp
@@ -746,7 +746,7 @@ public:

         // Signal traps
 #if defined(__linux__) || defined(__APPLE__)
-        signal(SIGSEGV, MinerCLI::signalHandler);
+        //signal(SIGSEGV, MinerCLI::signalHandler);
 #endif
         signal(SIGINT, MinerCLI::signalHandler);
         signal(SIGTERM, MinerCLI::signalHandler);

And using the following bash script:

# Restart ethminer until it exits cleanly (exit status 0)
while true; do
    sleep 2
    ./ethminer/ethminer -U -P stratum2+tcp://xxxx@xxxx.com:xxxx -L 1 && break
done
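
The loop above restarts on any non-zero exit without recording why. As a small sketch, the exit status can be turned into a log message: POSIX shells such as bash report a process killed by signal N with status 128+N, and SIGSEGV is signal 11 on Linux, giving 139. The function name `describe_exit` and the placeholder `ETHMINER_CMD` are hypothetical. A status above 128 can in principle also come from a program that simply exits with that code, so the message is a hint, not proof.

```shell
# Hypothetical helper: turn a shell exit status into a short description.
describe_exit() {
    status=$1
    if [ "$status" -eq 0 ]; then
        echo "clean exit"
    elif [ "$status" -gt 128 ]; then
        # Death by signal N is reported as 128+N (SIGSEGV is 11 on Linux)
        echo "killed by signal $((status - 128))"
    else
        echo "error exit $status"
    fi
}

# Example use inside a restart loop (ETHMINER_CMD is a placeholder):
#   $ETHMINER_CMD; status=$?
#   echo "$(date) ethminer: $(describe_exit "$status")"
#   [ "$status" -eq 0 ] && break
```

With the SIGSEGV handler disabled as in the diff, a crash would be expected to show up as "killed by signal 11" in such a log.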

playcat commented 3 years ago

I also attempted to "fix" the issue with a bash script, but I avoided modifying the codebase and rebuilding (for the sake of future updates to the repo).

Here's my approach. It's a bit customized for my system, but it shouldn't be too hard to modify. Edit (written at 2 AM): the script is now at version 0.02. It handles a SIGSEGV showing up in the logs, and also the situation where no miner is currently running. Make sure to turn off the monitoring if you don't want to run the miner for some time :)

#!/bin/bash
# Replace WALLET.WORKER@POOL:PORT with your own pool details.
MINER_LOG=~/ethminer/ethmining.log
MONITOR_LOG=~/ethmonitoring/monitoring.log

while true; do
        echo "Doing a check... " >> "$MONITOR_LOG"
        sleep 5
        # count SIGSEGV lines in the tail of the miner log
        hasSig=$(tail -n18 "$MINER_LOG" | grep -c SIGSEGV)

        # PIDs of running "ethminer -G -P" processes, excluding the grep itself
        processId=$(ps x | grep "ethminer -G -P" | grep -v grep | awk '{print $1}')

        if [ "$hasSig" -ne 0 ] || [ -z "$processId" ]; then
            if [ -n "$processId" ]; then
                # unquoted on purpose: there may be more than one PID
                kill $processId
                echo "killing $processId" >> "$MONITOR_LOG"
            else
                echo "couldn't find ethminer running. starting from scratch... " >> "$MONITOR_LOG"
            fi
            # start the miner in the background, appending stdout and stderr to the log
            ~/ethminer/bin/ethminer -G -P stratum1+ssl://WALLET.WORKER@POOL:PORT >> "$MINER_LOG" 2>&1 &
            disown
        else
            echo "All good!" >> "$MONITOR_LOG"
        fi
done

I will shamefully remove my comment if I find it's not working.

I saved the script and ran it in the background (e.g. ./monitor.sh &).
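
One possible weakness of a fixed `tail -n18` window, noted here as an observation rather than a tested result: the CUDA backtraces earlier in this thread list 19 addresses, which puts the "SIGSEGV encountered" line 22 lines from the end of the log, outside the window. Conversely, an old SIGSEGV line can stay inside the window for a while after a restart. A sketch of an alternative is to remember the log length at each restart and only look at lines written since then. The helper names `new_sigsegv` and `log_lines` are hypothetical.

```shell
# Hypothetical helper: count SIGSEGV lines that appear after line OFFSET of LOGFILE.
# usage: new_sigsegv LOGFILE OFFSET
new_sigsegv() {
    # tail -n +K starts output at line K, so this skips the first OFFSET lines;
    # "|| true" keeps the exit status 0 when grep finds nothing (it still prints 0)
    tail -n "+$(($2 + 1))" "$1" | grep -c SIGSEGV || true
}

# Hypothetical helper: current number of lines in LOGFILE (use as the next OFFSET).
log_lines() {
    wc -l < "$1" | tr -d ' '
}
```

In a monitor loop, the offset would be set with `offset=$(log_lines "$log")` right after each restart, and the crash check would become `[ "$(new_sigsegv "$log" "$offset")" -ne 0 ]`.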

anjandash commented 2 years ago

Other than the hack script(s), is there a way to resolve this and locate the root cause of the SIGSEGV faults? Did anyone resolve this?
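
One generic way to start narrowing this down, sketched under assumptions: the backtraces in this thread contain only raw addresses. If ethminer is rebuilt with debug symbols, the binutils tool `addr2line` can map the frames that lie inside the ethminer binary to function names and source lines. The helper `frame_addr` below is hypothetical; it only extracts those addresses from a saved trace. Frames inside libcuda or libc cannot be resolved this way, and the addresses are only meaningful for the exact binary that produced the trace.

```shell
# Hypothetical helper: read a backtrace on stdin and print one address per
# frame that belongs to the ethminer binary itself.
frame_addr() {
    # PIE build:     "ethminer(+0xab806) [0x561718c9d806]" -> offset in parentheses
    # non-PIE build: "bin/ethminer() [0x422af9]"           -> address in brackets
    sed -n \
        -e 's/^.*ethminer(+\(0x[0-9a-f]*\)).*$/\1/p' \
        -e 's/^.*ethminer() \[\(0x[0-9a-f]*\)].*$/\1/p'
}

# Example use (trace.txt and the binary path are placeholders):
#   frame_addr < trace.txt | xargs addr2line -f -C -e /path/to/ethminer
```

Both CUDA traces above pass through `cuMemcpyHtoD_v2`, a host-to-device copy, so the ethminer frame just below that call is a reasonable first one to resolve. That is a starting point for investigation, not a confirmed root cause.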