ethereum-mining / ethminer

Ethereum miner with OpenCL, CUDA and stratum support
GNU General Public License v3.0

0.14.0.dev4 windows - no no response received in 2 seconds. #936

Closed paddymahoney closed 6 years ago

paddymahoney commented 6 years ago

Getting a fatal error on 0.14.0.dev4 on windows:

  m  00:51:37|main    |  Speed  85.62 Mh/s    gpu/0 53.58  gpu/1 32.04  [A2098+33:R0+0:F0] Time: 27:45
  m  00:51:42|main    |  Speed  86.00 Mh/s    gpu/0 53.84  gpu/1 32.16  [A2098+33:R0+0:F0] Time: 27:45
  m  00:51:47|main    |  Speed  86.15 Mh/s    gpu/0 53.88  gpu/1 32.27  [A2098+33:R0+0:F0] Time: 27:45
  m  00:51:52|main    |  Speed  86.28 Mh/s    gpu/0 53.93  gpu/1 32.36  [A2098+33:R0+0:F0] Time: 27:45
  m  00:51:57|main    |  Speed  86.28 Mh/s    gpu/0 53.93  gpu/1 32.36  [A2098+33:R0+0:F0] Time: 27:45
  m  00:52:02|main    |  Speed  86.12 Mh/s    gpu/0 53.77  gpu/1 32.35  [A2098+33:R0+0:F0] Time: 27:45
  m  00:52:07|main    |  Speed  86.27 Mh/s    gpu/0 53.92  gpu/1 32.35  [A2098+33:R0+0:F0] Time: 27:45
  i  00:52:08|cuda-1  |  Nonce 0x618963502b3cfa7d submitted to eu1.ethermine.org
  X  00:52:11|stratum |  No no response received in 2 seconds.
  m  00:52:12|main    |  Speed  86.43 Mh/s    gpu/0 54.08  gpu/1 32.36  [A2098+33:R0+0:F0] Time: 27:45
  i  00:52:14|cuda-0  |  Nonce 0x6189625068aafb54 submitted to eu1.ethermine.org
  m  00:52:17|main    |  Speed  86.17 Mh/s    gpu/0 53.86  gpu/1 32.31  [A2098+33:R0+0:F0] Time: 27:45
  m  00:52:22|main    |  Speed  86.32 Mh/s    gpu/0 54.01  gpu/1 32.31  [A2098+33:R0+0:F0] Time: 27:45
  m  00:52:27|main    |  Speed  86.47 Mh/s    gpu/0 54.16  gpu/1 32.31  [A2098+33:R0+0:F0] Time: 27:46
  i  00:52:28|cuda-1  |  Nonce 0x618963504febe088 submitted to eu1.ethermine.org
  i  00:52:31|stratum |  Disconnected from eu1.ethermine.org
  i  00:52:31|stratum |  Shutting down miners...
  i  00:52:31|stratum |  Retrying in 3 ...
  i  00:52:32|stratum |  Retrying in 2 ...
  m  00:52:32|main    |  not-connected
  i  00:52:33|stratum |  Retrying in 1 ...
  X  00:52:34|stratum |  Handle response failed: protocol is shutdown
  X  00:52:34|stratum |  Handle response failed: protocol is shutdown
  m  13:00:01|main    |  not-connected
digitalpara commented 6 years ago

Same here with ethminer 0.14.0rc4. Mining with this version has become a misery: the miner stops and goes no further. Please add a watchdog, just like PhoenixMiner or Claymore have. There are enough examples, but ethminer developers are sleeping.

  i  14:52:31|stratum |  Disconnected from eu1.ethermine.org
  i  14:52:31|stratum |  Shutting down miners...
  i  14:52:31|stratum |  Retrying in 3 ...
  i  14:52:32|stratum |  Retrying in 2 ...
  m  14:52:32|main    |  not-connected
  i  14:52:33|stratum |  Retrying in 1 ...
  X  14:52:34|stratum |  Handle response failed: protocol is shutdown
  X  14:52:34|stratum |  Handle response failed: protocol is shutdown
  m  15:00:01|main    |  not-connected

This is with --stratum-ssl specified. I use this with ethminer. Download: https://github.com/digitalpara/WiNETH Download: https://mirrorace.com/m/1qikt

shawnsmithdev commented 6 years ago

Same. Win10, GTX 1070. I won't post logs, as they were at default verbosity and look basically the same, so they are probably not helpful. I turn on a lot of security features such as DEP and Mandatory ASLR, if that matters.

cli args:

ethminer --farm-retries 10 --farm-recheck 2000 -U -RH -HWMON 1 \
-P stratum+ssl://0x0000000000000000000000000000000000000000.worker@us2.ethermine.org:5555 \
-P stratum+ssl://0x0000000000000000000000000000000000000000.worker@us1.ethermine.org:5555
AndreaLanfranchi commented 6 years ago

Please redo your tests with ethminer 0.14.0rc6

AndreaLanfranchi commented 6 years ago

@digitalpara

but ethminer developers are sleeping .

Please always keep in mind that we're giving our spare time for free. No one gets paid for this, nor do you pay a single dime for using ethminer.

ghost commented 6 years ago

@digitalpara THANK YOU FOR ALL THE WORK AND YOUR FREE TIME!!!!

@Topic: 0.14.0rc6 is also not working for me. Same error on PIMP 2.5.

AndreaLanfranchi commented 6 years ago

Solved in

paddymahoney commented 6 years ago

Confirm 0.15.dev6 resolves my issue

Update: Ah too bad, ran into the error again with 0.15.dev6:

  X  10:33:38|stratum |  No response received in 2 seconds.
  i  10:33:39|stratum |  Disconnected from us1.ethermine.org [18.219.59.155:5555]
  i  10:33:42|stratum |  Shutting down miners...
  m  10:33:44|main    |  not-connected
  i  10:33:46|stratum |  Retrying in 3 ...
  m  10:33:49|main    |  not-connected
  i  10:33:49|stratum |  Retrying in 2 ...
  i  10:33:50|stratum |  Retrying in 1 ...

Windows then reports that ethminer.exe stopped working. I would be happy if the miner continued to run rather than die, just so that I don't have to write a process supervisor.
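For anyone who does end up needing the process supervisor mentioned here, a minimal C++ sketch of a restart loop follows. It is not part of ethminer; `superviseLoop`, `runOnce`, `maxRestarts` and `restartDelay` are hypothetical names chosen for illustration, and the sketch assumes that a return value of 0 from `runOnce` means a clean exit.

```cpp
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical supervisor loop (illustration only, not ethminer code).
// Calls runOnce() repeatedly until it returns 0 (treated as a clean exit)
// or until maxRestarts restarts have been used up.
// Returns the total number of times runOnce() was called.
int superviseLoop(const std::function<int()>& runOnce,
                  int maxRestarts,
                  std::chrono::milliseconds restartDelay) {
    int runs = 0;
    for (;;) {
        const int status = runOnce();
        ++runs;
        if (status == 0) {
            break;  // clean exit: nothing to restart
        }
        if (runs > maxRestarts) {
            break;  // restart budget exhausted: give up
        }
        // Wait a little before restarting so a crash loop does not spin.
        std::this_thread::sleep_for(restartDelay);
    }
    return runs;
}
```

One possible use is to pass a lambda that launches the miner with `std::system` (from `<cstdlib>`) and the usual command line, for example `superviseLoop([] { return std::system("ethminer -U -P <pool-url>"); }, 100, std::chrono::seconds(5))`. The value returned by `std::system` is implementation-defined, so treating 0 as "clean exit" is an assumption that should be checked on the target platform.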

ghost commented 6 years ago

It's not fixed for me on Linux with 0.15.dev6.

AndreaLanfranchi commented 6 years ago

Please try the latest build and report back. We pushed some small changes a few minutes ago.

ghost commented 6 years ago

OK, I'll give it a try.

ghost commented 6 years ago

Same error after about 10 minutes.

Is there a log I could provide to help you find the responsible code? If it's relevant, I'm using SSL right now. Should I test without it?

AndreaLanfranchi commented 6 years ago

Well ... actually this is not really a bug. You're suffering from a huge delay in ethermine.org's responses to your submissions. I don't know about your connection and whether it's stable or not (do you have a dynamic IP?).

Aside from this, the behavior of ethminer is correct, even if I agree with you that 1 minute to reconnect is not acceptable. Those "not-connected" warnings come from a process which is locking.

Will report back.

ghost commented 6 years ago

I use a cable connection with 400 Mbit down / 20 Mbit up. Normally my submissions take ~20 ms. It is a dynamic IP, but the IP changes only every 6 to 8 weeks.

OK, thank you. I'll try to provide fast feedback if you have a new release.

AndreaLanfranchi commented 6 years ago

@faithless1108 can you describe your environment? CPU and flavour (x86/x64)? OS and flavour?

Thanks

ghost commented 6 years ago

Hey,

Sure... whatever it takes to solve this.

CPU: Intel Celeron G3900 2x 2.80GHz (64-bit)
MoBo: ASRock BTC Pro H110
RAM: 4GB (cheapest I could find, don't know the vendor)

OS: PIMP 2.5 (https://getpimp.org/, I think it's a modified Ubuntu 16.04.4 LTS)

Connected via LAN cable to the router. The internet connection is a very stable cable broadband connection with 400 Mbit downstream / 20 Mbit upstream.

Do you need anything else?

Thanks for your great support, mate!!

ghost commented 6 years ago

I found the invoice. It's Crucial RAM.

OS:

    NAME="Ubuntu"
    VERSION="16.04.4 LTS (Xenial Xerus)"
    ID=ubuntu
    ID_LIKE=debian
    PRETTY_NAME="Ubuntu 16.04.4 LTS"
    VERSION_ID="16.04"
    HOME_URL="http://www.ubuntu.com/"
    SUPPORT_URL="http://help.ubuntu.com/"
    BUG_REPORT_URL="http://bugs.launchpad.net/ubuntu/"
    VERSION_CODENAME=xenial
    UBUNTU_CODENAME=xenial

AndreaLanfranchi commented 6 years ago

Do you have other rigs? If so, does it happen on only one of them, or is it random?

From the log I see that, apparently, your machine takes 20 seconds to run a very simple disconnection of the socket (code follows) which, coupled with your exuberant network configuration, leads me to think you might have a network card error.

This code takes 20 seconds to run on your machine:

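    // Tear down the stratum connection if a socket is currently open.
    if (m_socket && m_socket->is_open()) {
        try {
            // shutdown() reports failures through sec instead of throwing.
            boost::system::error_code sec;

            if (m_conn.SecLevel() != SecureLevel::NONE) {
                // SSL/TLS shutdown: this call blocks until it completes or fails.
                m_securesocket->shutdown(sec);
            }
            else {
                // Plain TCP: stop both sending and receiving.
                m_nonsecuresocket->shutdown(boost::asio::ip::tcp::socket::shutdown_both, sec);
            }

            m_socket->close();
            m_io_service.stop();
        }
        catch (std::exception const& _e) {
            cwarn << "Error while disconnecting:" << _e.what();
        }

        // Drop all socket handles.
        m_securesocket = nullptr;
        m_nonsecuresocket = nullptr;
        m_socket = nullptr;
    }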
ghost commented 6 years ago

It's the only Rig I have.

But with Claymore-Miner there is absolutely no problem. SSH and all the stuff works and is stable (miner test for hours). I really don't think it's the card...

AndreaLanfranchi commented 6 years ago

In other words ...

shutdown : this should ensure that any pending operations on the socket are properly cancelled and any buffers are flushed prior to the call to socket.close.

but ... you're shutting down because you previously had a 2-second response timeout on the socket; thus, somehow, your network card (or its driver) was already not responding.

I've seen similar problems with the same MOBO as yours at high overclocking values (the network card becomes unresponsive and takes time to "restore").

Would ask you to :

AndreaLanfranchi commented 6 years ago

But with Claymore-Miner there is absolutely no problem. SSH and all the Stuff works and is stable (Miner-Test for Hours). I really dont think it's the Card...

We do not know how Claymore implements socket management, as its code is closed. Maybe they don't care about a clean shutdown (and drop the socket abruptly), or don't care about the 2-second timeout on response.

ghost commented 6 years ago

OK, here is the dmesg output. I will turn down the OC and retest.

    [51125.588443] stratum[3210]: segfault at 0 ip 0000000000530005 sp 00007f5324ccf780 error 4 in ethminer[400000+8f6000]
    [51323.829224] stratum[8965]: segfault at 28 ip 0000000000533c25 sp 00007f950afb8600 error 4 in ethminer[400000+8f6000]
    [52728.318270] stratum[14976]: segfault at 28 ip 0000000000548c25 sp 00007f31f1e5b5a0 error 4 in ethminer[400000+912000]
    [53576.010483] stratum[23236]: segfault at 0 ip 0000000000545005 sp 00007fcafdf36720 error 4 in ethminer[400000+912000]
    [55937.909989] stratum[3066]: segfault at 0 ip 0000000000545005 sp 00007f060fc60720 error 4 in ethminer[400000+912000]
    [56065.028380] stratum[25578]: segfault at 0 ip 0000000000545005 sp 00007fa63744b720 error 4 in ethminer[400000+912000]
    [58443.937327] stratum[28350]: segfault at 0 ip 0000000000545005 sp 00007f0c0b7fc720 error 4 in ethminer[400000+912000]
    [58927.095611] stratum[24083]: segfault at 0 ip 0000000000545005 sp 00007fe60a899720 error 4 in ethminer[400000+912000]
    [59003.544400] stratum[31441]: segfault at 0 ip 0000000000545005 sp 00007f722fffd720 error 4 in ethminer[400000+912000]
    [59281.624316] stratum[6378]: segfault at 0 ip 0000000000545005 sp 00007f4fd900f720 error 4 in ethminer[400000+912000]
    [59405.937630] stratum[9416]: segfault at 0 ip 0000000000545005 sp 00007fd32cc11720 error 4 in ethminer[400000+912000]
    [59733.962419] stratum[14835]: segfault at 0 ip 0000000000545005 sp 00007fb1c587b720 error 4 in ethminer[400000+912000]
    [59935.803082] stratum[19768]: segfault at 0 ip 0000000000545005 sp 00007fb57ccda720 error 4 in ethminer[400000+912000]
    [62757.081930] stratum[31374]: segfault at 0 ip 0000000000545005 sp 00007fb96a83a720 error 4 in ethminer[400000+912000]
    [63901.208425] stratum[16150]: segfault at 0 ip 0000000000530005 sp 00007fc62e7fa780 error 4 in ethminer[400000+8f6000]
    [63962.155699] stratum[19783]: segfault at 0 ip 0000000000530005 sp 00007f5dd0de5780 error 4 in ethminer[400000+8f6000]
    [64774.659709] stratum[3905]: segfault at 0 ip 0000000000545005 sp 00007f9c937fc720 error 4 in ethminer[400000+912000]
    [99601.601113] stratum[30824]: segfault at 0 ip 0000000000545005 sp 00007f4ff17f8720 error 4 in ethminer[400000+912000]
    [117424.506637] NVRM: GPU at PCI:0000:01:00: GPU-81cd05f7-c980-2da1-548d-3fd0bd11b467
    [117424.506639] NVRM: GPU Board Serial Number:
    [117424.506641] NVRM: Xid (PCI:0000:01:00): 31, Ch 00000013, engmask 00000101, intr 10000000
    [201856.641368] NVRM: GPU at PCI:0000:0f:00: GPU-f20881bf-564b-e590-2d5a-550fd51f88ff
    [201856.641371] NVRM: GPU Board Serial Number:
    [201856.641373] NVRM: Xid (PCI:0000:0f:00): 31, Ch 00000013, engmask 00000101, intr 10000000
    [216913.765332] NVRM: GPU at PCI:0000:0e:00: GPU-82beea07-282d-b430-1627-1a56cbb1b797
    [216913.765334] NVRM: GPU Board Serial Number:
    [216913.765336] NVRM: Xid (PCI:0000:0e:00): 31, Ch 00000013, engmask 00000101, intr 10000000
    [235491.030270] NVRM: Xid (PCI:0000:0e:00): 31, Ch 00000013, engmask 00000101, intr 10000000
    [245616.267109] perf: interrupt took too long (3967 > 3946), lowering kernel.perf_event_max_sample_rate to 50250
    [268258.303253] stratum[8787]: segfault at 221d150 ip 000000000221d150 sp 00007f6c18f61b38 error 15
    [269535.986400] stratum[27157]: segfault at 2ee5f60 ip 0000000002ee5f60 sp 00007f938cc47b38 error 15
    [271120.134178] stratum[31027]: segfault at 2cb8ed0 ip 0000000002cb8ed0 sp 00007fe776374b78 error 15
    [274452.629114] stratum[25487]: segfault at 240ced0 ip 000000000240ced0 sp 00007f88aedccb78 error 15
    [277440.151200] stratum[28755]: segfault at 203ded0 ip 000000000203ded0 sp 00007fc1024f1b78 error 15
    [278934.978085] stratum[32430]: segfault at 17d9ed0 ip 00000000017d9ed0 sp 00007efcb4e56b78 error 15
    [279355.290977] stratum[11134]: segfault at 1b56ed0 ip 0000000001b56ed0 sp 00007f072a4bab78 error 15
    [279911.053137] stratum[14558]: segfault at 17b8ed0 ip 00000000017b8ed0 sp 00007f81210e3b78 error 15

AndreaLanfranchi commented 6 years ago

Segfault error 4 means the thread named "stratum" could not read, and the only thing it reads is socket data over the network card (look at ip: the instruction pointer is always the same, 0000000000545005, and we have only one instruction where we invoke read). This causes your 2-second response timeout.

Segfault error 15 means stratum could not write to socket.

Think it's machine related ... not code related.
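As a reading aid for the dmesg lines quoted in this thread: the `error N` field of a Linux segfault line is the x86 page-fault error code, printed in hexadecimal. The C++ sketch below decodes its documented bits. `PageFaultInfo`, `decodePageFaultError` and `describe` are hypothetical names made up for this illustration and are not part of ethminer or the kernel.

```cpp
#include <string>

// Decoded bits of the x86 page-fault error code that Linux prints
// (in hex) after "error" in a dmesg segfault line.
struct PageFaultInfo {
    bool protectionViolation;  // bit 0: 1 = page present but access denied, 0 = page not mapped
    bool write;                // bit 1: 1 = write access, 0 = read access
    bool userMode;             // bit 2: 1 = fault raised while running in user mode
    bool reservedBit;          // bit 3: 1 = reserved bit set in a page-table entry
    bool instructionFetch;     // bit 4: 1 = fault happened while fetching an instruction
};

// Hypothetical helper: parse the hex string that follows "error".
PageFaultInfo decodePageFaultError(const std::string& hexCode) {
    const unsigned long code = std::stoul(hexCode, nullptr, 16);
    return PageFaultInfo{
        (code & 0x01) != 0,
        (code & 0x02) != 0,
        (code & 0x04) != 0,
        (code & 0x08) != 0,
        (code & 0x10) != 0,
    };
}

// Hypothetical helper: turn the decoded bits into a short description.
std::string describe(const PageFaultInfo& f) {
    std::string text = f.userMode ? "user-mode " : "kernel-mode ";
    text += f.instructionFetch ? "instruction fetch" : (f.write ? "write" : "read");
    text += f.protectionViolation ? " denied by page protection" : " of an unmapped page";
    return text;
}
```

Under this decoding, `describe(decodePageFaultError("4"))` yields "user-mode read of an unmapped page", which at address 0 usually points to a null pointer dereference, and `describe(decodePageFaultError("15"))` yields "user-mode instruction fetch denied by page protection", which matches the log lines where the fault address equals `ip`. The error code describes a memory access made by the process; on its own it does not say whether a socket or the network card was involved.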

ghost commented 6 years ago

Think it's machine related ... not code related.

I'm with you. I'll try to do the things you gave me for homework and see how it goes... Otherwise I'm addicted to Claymore :(

AndreaLanfranchi commented 6 years ago

Solved in #1135