rodneymo / rig-monitorv2

rig-monitor implementation in Golang

Ethminer - seems not working #49

Open StefanOberhumer opened 6 years ago

StefanOberhumer commented 6 years ago

It seems that monitoring ethminer is not working.

configured as follows:

rigList = [
"rig01,ethminer,label_ethermine,,//192.168.XXX.YYY:3333,8,183,0,noplug,,640,70",
"rig02,ethminer,label_ethermine,,//192.168.XXX.ZZZ:3333,8,183,0,noplug,,640,70",
]
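
As a reading aid for the entry format, here is a minimal Python sketch that splits one rigList line into named fields. The field names are taken from the miner.RigConfig dump in the trace posted later in this thread, and the column order is inferred by matching values against that dump; it is an assumption, not grm's documented format. `RigEntry` and `parse_rig_entry` are hypothetical names.

```python
from collections import namedtuple

# Field names mirror the miner.RigConfig dump in the trace; the column order
# is inferred from matching values, not taken from grm's documentation.
RigEntry = namedtuple("RigEntry", [
    "rig_name", "miner", "pool_label", "pool_label2", "url",
    "installed_gpus", "target_hash_rate", "target_hash_rate2",
    "smart_plug_type", "smart_plug_ip", "psu_max_power", "target_temperature",
])


def parse_rig_entry(line):
    """Split one comma-separated rigList entry into named string fields."""
    fields = line.split(",")
    if len(fields) != len(RigEntry._fields):
        raise ValueError(
            "expected %d fields, got %d" % (len(RigEntry._fields), len(fields))
        )
    return RigEntry(*fields)


entry = parse_rig_entry(
    "rig01,ethminer,label_ethermine,,//192.168.XXX.YYY:3333,8,183,0,noplug,,640,70"
)
print(entry.url, entry.installed_gpus, entry.psu_max_power)
```

Empty columns (the second pool label and the smart plug IP here) come back as empty strings, which makes a shifted or missing comma easy to spot.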

1.) I get no data in Grafana.
2.) I see no message in the ethminer log that looks like: i 18:41:13 Api New api session from 192.168.XXX.AAA:55940

So I think it's not working.

Reference to Issue #26

rodneymo commented 6 years ago

Could you send me a trace? Run ./grm -t -m -1 -p -1 -r 1 and just let it run for 1 min.

StefanOberhumer commented 6 years ago

Here it is:

./grm -t -m -1 -p -1 -r 1 
INFO: 2018/07/02 19:26:59 main.go:47: Starting rig-monitor version 3.0.d.1 ...
INFO: 2018/07/02 19:26:59 main.go:59: Tracing is enabled!
INFO: 2018/07/02 19:26:59 main.go:63: Commmand line arguments: -config 
INFO: 2018/07/02 19:26:59 main.go:67: No config file specified. Using default config.toml file: config.toml
INFO: 2018/07/02 19:26:59 config.go:61: Reading configuration file...
TRACE: 2018/07/02 19:26:59 config.go:169: Config file: &config.Config{Grafana:config.GrafanaStruct{Username:"grafana", Password:"grafana"}, Influxdb:storage.InfluxDbStruct{Hostname:"http://localhost:8086", Database:"rigdata", Username:"grafana", Password:"grafana", WriteInterval:0x1e, InfluxWriteBufferSize:0x3e8}, Dynu:config.DynuStruct{Enabled:false, Username:"dynu", Password:"dynu", Hostname:"host.dynu.com", UpdateInterval:1800}, Main:config.Mainstruct{PoolWorkers:3, RigWorkers:5, PoolPollingInterval:300, RigPollingInterval:60, Rig:[]string{"rig01,ethminer,label_ethermine,,//192.168.XXX.YYY:3333,8,183,0,noplug,,640,70"}, Pool:[]string{"label_ethermine,ethermine,eth,https://api.ethermine.org,,MYWALLET,1"}, PowerRules:[]string{}}, Profitability:config.Profitabilitystruct{SmartPlugPollingInterval:60, MarketPollingInterval:300, QuoteCurrency:"EUR", PowerCostKwh:0.17, PowerRatioDualMining:0.3}, PoolList:[]pool.PoolConfig{pool.PoolConfig{Label:"label_ethermine_company", Type:"ETHERMINE", Crypto:"eth", URL:"https://api.ethermine.org", Token:"", Wallet:"MYWALLET", CorrectionFactor:1}}, RigList:[]miner.RigConfig{miner.RigConfig{RigName:"rig01", Miner:"ETHMINER", PoolLabel:"label_ethermine_company", PoolLabel2:"", URL:"//192.168.XXX.YYY:3333", InstalledGpus:8, TargetHashRate:183, TargetHashRate2:0, SmartPlugType:"", SmartPlugIP:"", PSUMaxPower:0, TargetTemperature:70}}, SmartPlugList:[]power.SmartPlugConfig{power.SmartPlugConfig{RigName:"rig01", Miner:"ETHMINER", PoolLabel:"label_ethermine_company", PoolLabel2:"", SmartPlugType:"NOPLUG", SmartPlugIP:"", PSUMaxPower:640, RuleList:[]power.PowerMgmtRuleStruct(nil), LastReset:0}}}
INFO: 2018/07/02 19:26:59 main.go:87: Commmand line arguments: -m -1
INFO: 2018/07/02 19:26:59 main.go:101: Commmand line arguments: -p -1
INFO: 2018/07/02 19:26:59 main.go:112: Commmand line arguments: -r 1
TRACE: 2018/07/02 19:26:59 main.go:125: List of rigs: []miner.RigConfig{miner.RigConfig{RigName:"rig01", Miner:"ETHMINER", PoolLabel:"label_ethermine_company", PoolLabel2:"", URL:"//192.168.XXX.YYY:3333", InstalledGpus:8, TargetHashRate:183, TargetHashRate2:0, SmartPlugType:"", SmartPlugIP:"", PSUMaxPower:0, TargetTemperature:70}}
TRACE: 2018/07/02 19:26:59 main.go:126: List of pools: []pool.PoolConfig{}
INFO: 2018/07/02 19:26:59 influxdb.go:26: Starting DBDaemon routine...
INFO: 2018/07/02 19:26:59 influxdb.go:94: Creating influxDB connection to http://localhost:8086 ...
INFO: 2018/07/02 19:27:00 main.go:151: Launching rig monitor worker: 0
INFO: 2018/07/02 19:27:00 main.go:151: Launching rig monitor worker: 1
INFO: 2018/07/02 19:27:00 main.go:151: Launching rig monitor worker: 2
INFO: 2018/07/02 19:27:00 main.go:151: Launching rig monitor worker: 3
INFO: 2018/07/02 19:27:00 main.go:151: Launching rig monitor worker: 4
INFO: 2018/07/02 19:27:00 main.go:160: Launching power monitor worker: 0
INFO: 2018/07/02 19:27:00 main.go:162: Launching power rules manager worker: 0
INFO: 2018/07/02 19:27:00 main.go:160: Launching power monitor worker: 1
INFO: 2018/07/02 19:27:00 main.go:162: Launching power rules manager worker: 1
INFO: 2018/07/02 19:27:00 main.go:160: Launching power monitor worker: 2
INFO: 2018/07/02 19:27:00 main.go:162: Launching power rules manager worker: 2
INFO: 2018/07/02 19:27:00 main.go:160: Launching power monitor worker: 3
INFO: 2018/07/02 19:27:00 main.go:162: Launching power rules manager worker: 3
INFO: 2018/07/02 19:27:00 main.go:160: Launching power monitor worker: 4
INFO: 2018/07/02 19:27:00 main.go:162: Launching power rules manager worker: 4
INFO: 2018/07/02 19:27:00 influxdb.go:94: Creating influxDB connection to http://localhost:8086 ...
INFO: 2018/07/02 19:27:00 power-monitor.go:25: New power monitor job received: rig01
INFO: 2018/07/02 19:27:00 influxdb.go:94: Creating influxDB connection to http://localhost:8086 ...
INFO: 2018/07/02 19:27:00 influxdb.go:94: Creating influxDB connection to http://localhost:8086 ...
INFO: 2018/07/02 19:27:00 influxdb.go:94: Creating influxDB connection to http://localhost:8086 ...
INFO: 2018/07/02 19:27:00 rig-monitor.go:21: New rig monitor job received: rig01
TRACE: 2018/07/02 19:27:00 rig-monitor.go:23: New rig monitor job received: &miner.RigConfig{RigName:"rig01", Miner:"ETHMINER", PoolLabel:"label_ethermine_company", PoolLabel2:"", URL:"//192.168.XXX.YYY:3333", InstalledGpus:8, TargetHashRate:183, TargetHashRate2:0, SmartPlugType:"", SmartPlugIP:"", PSUMaxPower:0, TargetTemperature:70}
INFO: 2018/07/02 19:27:00 influxdb.go:94: Creating influxDB connection to http://localhost:8086 ...
TRACE: 2018/07/02 19:27:00 power-monitor.go:27: New power monitor job received: &power.SmartPlugConfig{RigName:"rig01", Miner:"ETHMINER", PoolLabel:"label_ethermine_company", PoolLabel2:"", SmartPlugType:"NOPLUG", SmartPlugIP:"", PSUMaxPower:640, RuleList:[]power.PowerMgmtRuleStruct(nil), LastReset:0}
INFO: 2018/07/02 19:27:00 rig-monitor.go:82:  Connection to rig rig01 OK.
TRACE: 2018/07/02 19:27:00 influxdb.go:54: New record received by DBDaemon: env_data map[label:label_ethermine_company plug_type:NOPLUG rig_id:rig01] map[max_power:640 power_usage:640] <nil> 2018-07-02 19:27:00.480742548 +0200 CEST m=+1.028741029
TRACE: 2018/07/02 19:27:20 functions.go:139: Alloc = 1 MiB      TotalAlloc = 1 MiB      Sys = 5 MiB     NumGC = 0
TRACE: 2018/07/02 19:27:20 functions.go:147: Number of Go routines running: 20
TRACE: 2018/07/02 19:27:20 main.go:230: Number of items in rigJobQueue: 0
TRACE: 2018/07/02 19:27:20 main.go:231: Number of items in powerJobQueue: 0
TRACE: 2018/07/02 19:27:20 main.go:232: Number of items in poolJobQueue: 0
TRACE: 2018/07/02 19:27:20 main.go:233: Number of items in marketJobQueue: 1
TRACE: 2018/07/02 19:27:20 main.go:234: Number of items in rulesManagerJobQueue: 0
TRACE: 2018/07/02 19:27:20 main.go:235: Number of items in recordQueue(DB): 0
INFO: 2018/07/02 19:27:29 influxdb.go:46: DBDaemon ticker expired (30s). Data points (1) saved to influxDB
TRACE: 2018/07/02 19:27:40 functions.go:139: Alloc = 1 MiB      TotalAlloc = 1 MiB      Sys = 5 MiB     NumGC = 0
TRACE: 2018/07/02 19:27:40 functions.go:147: Number of Go routines running: 22
TRACE: 2018/07/02 19:27:40 main.go:230: Number of items in rigJobQueue: 0
TRACE: 2018/07/02 19:27:40 main.go:231: Number of items in powerJobQueue: 0
TRACE: 2018/07/02 19:27:40 main.go:232: Number of items in poolJobQueue: 0
TRACE: 2018/07/02 19:27:40 main.go:233: Number of items in marketJobQueue: 1
TRACE: 2018/07/02 19:27:40 main.go:234: Number of items in rulesManagerJobQueue: 0
TRACE: 2018/07/02 19:27:40 main.go:235: Number of items in recordQueue(DB): 0
INFO: 2018/07/02 19:27:59 influxdb.go:46: DBDaemon ticker expired (30s). Data points (0) saved to influxDB
TRACE: 2018/07/02 19:28:00 functions.go:139: Alloc = 1 MiB      TotalAlloc = 1 MiB      Sys = 5 MiB     NumGC = 0
TRACE: 2018/07/02 19:28:00 functions.go:147: Number of Go routines running: 22
TRACE: 2018/07/02 19:28:00 main.go:230: Number of items in rigJobQueue: 0
TRACE: 2018/07/02 19:28:00 main.go:231: Number of items in powerJobQueue: 0
TRACE: 2018/07/02 19:28:00 main.go:232: Number of items in poolJobQueue: 0
TRACE: 2018/07/02 19:28:00 main.go:233: Number of items in marketJobQueue: 1
TRACE: 2018/07/02 19:28:00 main.go:234: Number of items in rulesManagerJobQueue: 0
TRACE: 2018/07/02 19:28:00 main.go:235: Number of items in recordQueue(DB): 0
INFO: 2018/07/02 19:28:00 power-monitor.go:25: New power monitor job received: rig01
INFO: 2018/07/02 19:28:00 rig-monitor.go:21: New rig monitor job received: rig01
TRACE: 2018/07/02 19:28:00 rig-monitor.go:23: New rig monitor job received: &miner.RigConfig{RigName:"rig01", Miner:"ETHMINER", PoolLabel:"label_ethermine", PoolLabel2:"", URL:"//192.168.XXX.YYY:3333", InstalledGpus:8, TargetHashRate:183, TargetHashRate2:0, SmartPlugType:"", SmartPlugIP:"", PSUMaxPower:0, TargetTemperature:70}
TRACE: 2018/07/02 19:28:00 power-monitor.go:27: New power monitor job received: &power.SmartPlugConfig{RigName:"rig01", Miner:"ETHMINER", PoolLabel:"label_ethermine", PoolLabel2:"", SmartPlugType:"NOPLUG", SmartPlugIP:"", PSUMaxPower:640, RuleList:[]power.PowerMgmtRuleStruct(nil), LastReset:0}
TRACE: 2018/07/02 19:28:00 influxdb.go:54: New record received by DBDaemon: env_data map[label:label_ethermine plug_type:NOPLUG rig_id:rig01] map[max_power:640 power_usage:640] <nil> 2018-07-02 19:28:00.483485491 +0200 CEST m=+61.031484531
INFO: 2018/07/02 19:28:00 rig-monitor.go:82:  Connection to rig rig01 OK.
rodneymo commented 6 years ago

Do you get an ERROR message after the "INFO: 2018/07/02 19:28:00 rig-monitor.go:82: Connection to rig rig01 OK." ?

StefanOberhumer commented 6 years ago

No, I'm not getting any error. See the trace at 19:27: INFO: 2018/07/02 19:27:00 rig-monitor.go:82: Connection to rig rig01 OK.

rodneymo commented 6 years ago

Ok, so I assume there's no timeout. Right now I have coded the ethminer parser to wait for a null (0x00) character. Do you know which termination character ethminer includes in its response? If you don't know, I'll set up my test rig tomorrow and debug it (currently I am using a simulator that serves the output from a text file).

StefanOberhumer commented 6 years ago

I also ran tcpdump -n dst host 192.168.XXX.YYY (where 192.168.XXX.YYY is my rig address) and nothing was logged, so it seems nothing is being sent to the rig. There is also no ".... Api New api session from 192.168.XXX.AAA:...." line in the ethminer log, so it seems grm does not connect to the rig.

rodneymo commented 6 years ago

OK. I will test it tomorrow and get back to you ASAP.

StefanOberhumer commented 6 years ago

> Ok, so I assume there's no timeout. Right now I have coded the ethminer parser to wait for null (x00) character. Do you know what's the termination character included in ethminer's response?

I think the JSON request must be terminated by a "\n". I tested with the following script and found that the response is also terminated by a "\n" (0x0a).

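#! /usr/bin/env python
# Send one miner_getstat1 request to the miner API and save the raw response.
# Arguments: HOST PORT
import sys
import socket

HOST = sys.argv[1]
PORT = int(sys.argv[2])              # The same port as used by the server
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.settimeout(10.0)
try:
        s.connect((HOST, PORT))
        # One JSON object per line; the trailing "\n" terminates the request
        a = "{\"id\":0,\"jsonrpc\":\"2.0\",\"method\":\"miner_getstat1\"}\n"
        s.sendall(a.encode("ascii"))    # encode so this also runs on Python 3
        data = s.recv(2048)
        s.close()
        # Keep the raw bytes so the terminator can be inspected in a hex dump
        with open("resp.bin", "wb") as f:
                f.write(data)
except Exception as e:
        # Report the failure instead of silently swallowing it
        sys.stderr.write("request failed: %s\n" % e)
sys.exit(0)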
rodneymo commented 6 years ago

That was it. I modified the TCP call to accept a miner-specified terminator. I'll include the fix in the 3.0 stable release, which I'll publish within the hour.
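
To illustrate the idea of a miner-specified terminator, here is a minimal Python sketch. It is not the grm code, which is written in Go. It reads until a configurable terminator appears. `read_until` and `TERMINATORS` are hypothetical names; the only terminator value taken from this thread is "\n" for ethminer.

```python
import socket


def read_until(sock, terminator, max_bytes=65536):
    """Read from sock until `terminator` appears; return the bytes before it."""
    buf = b""
    while terminator not in buf:
        if len(buf) > max_bytes:
            raise ValueError("no terminator within %d bytes" % max_bytes)
        chunk = sock.recv(1024)
        if not chunk:
            raise ConnectionError("peer closed before sending the terminator")
        buf += chunk
    return buf.split(terminator, 1)[0]


# Hypothetical per-miner table; only the ethminer entry is backed by this thread.
TERMINATORS = {"ethminer": b"\n"}

# Demo on a local socket pair standing in for the connection to the miner.
client, server = socket.socketpair()
server.sendall(b'{"id":0,"result":[]}\n')
reply = read_until(client, TERMINATORS["ethminer"])
print(reply)
client.close()
server.close()
```

A reader that waits for 0x00 on this stream would block, since the peer never sends that byte. Note that this sketch still has no timeout of its own.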

rodneymo commented 6 years ago

Please close it once you have successfully tested it.

StefanOberhumer commented 6 years ago

Thank you very much, it works. But: it seems that when grm can connect to a rig but does not get any response (the connection stays open), all the stats go wrong. I think one rig data collector task stays blocked waiting for the response.

So maybe you could add a timeout when waiting for responses.
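
A minimal Python sketch of such a timeout, assuming a newline-terminated reply as found earlier in this thread. `query_with_timeout` is a hypothetical helper, not part of grm. The timeout applies to each socket operation separately, so it bounds how long a silent peer can block the caller, not the total duration of a slow reply.

```python
import socket


def query_with_timeout(sock, request, timeout_s=5.0, terminator=b"\n"):
    """Send request and return the reply line, or None if the peer stays silent."""
    sock.settimeout(timeout_s)  # applies to every send/recv call below
    try:
        sock.sendall(request)
        buf = b""
        while terminator not in buf:
            chunk = sock.recv(1024)
            if not chunk:
                return None  # peer closed without completing a reply
            buf += chunk
        return buf.split(terminator, 1)[0]
    except socket.timeout:
        return None  # no data within timeout_s: treat the rig as unresponsive


# Demo: the other end of the pair accepts data but never answers.
client, frozen_peer = socket.socketpair()
result = query_with_timeout(
    client,
    b'{"id":0,"jsonrpc":"2.0","method":"miner_getstat1"}\n',
    timeout_s=0.2,
)
print(result)  # None
```

Returning None lets the caller record the rig as unresponsive and move on to the next job instead of holding a worker indefinitely.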

I'm not sure whether this only affects ethminer.

rodneymo commented 6 years ago

damn. I only let it run for 1 round. Let me test it properly for a few minutes.

StefanOberhumer commented 6 years ago

To test: write a small TCP/IP listener/acceptor on one of your rigs that does not respond to the request but keeps the connection alive.

rodneymo commented 6 years ago

> Thank you very much - it works - but: It seems as if grm can connect to the rig but does not get any respond (keeps connected) all the stats are getting wrong as (I just think) one rig data collector task keeps locked and waits for response.

@StefanOberhumer, are you getting this error with all your miners running ethminer? I just tested it on my rig and I cannot replicate this problem. Is this issue occurring because the miner is "freezing"? Sorry, but it's not clear to me when and how this issue occurs.

StefanOberhumer commented 6 years ago

I'm trying to reproduce....

StefanOberhumer commented 6 years ago

Run the following script (it simulates an 8-card rig that freezes) and add it to the grm config.

#! /usr/bin/env python
## vim:set ts=4 sw=4 et: -*- coding: utf-8 -*-

# simulate API response of a 8 card rig

import os, sys, time
import threading
import socket, json

class globs:
    HOST = ''
    PORT = 3333
    error_after_x_requests = 2
    threads = []

class RigApiThread(threading.Thread):
    def __init__(self, ip, port, socket):
        threading.Thread.__init__(self)
        self.ip = ip
        self.unprocessed_read = ""
        self.port = port
        self.socket = socket
        self.do_run = True

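    def receive_till(self, till, max_buffer=500):
        # Read until the delimiter `till` shows up; return "" when the peer disconnects
        r = ""
        p = -1
        while p == -1 and self.do_run:
            if len(r) >= max_buffer:
                # seems we got some scrambled data - throw it away and try a fresh start
                r = ""

            chunk = self.socket.recv(1024)
            if not chunk:
                # peer closed the connection; checking the chunk (not the buffer)
                # avoids spinning forever on a partial request
                return ""
            r += chunk
            p = r.find(till)
        return r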

    def cancel(self):
        self.do_run = False
        try:
            self.socket.close()
        except:
            pass
        self.socket = None

    def run(self):
        while self.do_run:
            request_s = self.receive_till("\n")
            if not request_s:
                break

            try:
                request = json.loads(request_s)
            except:
                continue # skip invalid requests

            method = None
            response = {}
            try:
                response["id"] = request["id"]
                method = request["method"]
            except:
                continue # skip invalid requests (id and method are required)

            print "==>%s" % request_s.replace("\n","\\n")

            if method == "miner_getstat1":
                result_strings = []
                result_strings.append("ethminer-xx.yy") # version
                result_strings.append("60") # runtime in minutes
                result_strings.append("185128;6981;0") # hashrate ETH [kH/s] /shares/..
                result_strings.append("22264;22990;23312;23232;23635;22990;23070;23635") # hashrate per gpu in kH/s
                result_strings.append("0;0;0")
                result_strings.append("off;off;off;off;off;off;off;off")
                result_strings.append("63;98;63;79;63;43;63;62;63;42;63;73;63;84;63;52") # fan and temps
                result_strings.append("testpool.xyz:1234")
                result_strings.append("0;0;0;0")
                response["jsonrpc"] = "2.0"
                response["result"] = result_strings

                response_s = json.dumps(response) + "\n"

                globs.error_after_x_requests -= 1
                if globs.error_after_x_requests > 0:
                    self.socket.send(response_s)
                    print "<==%s" % response_s.replace("\n","\\n")
                else:
                    print "Simulating frozen miner !"
                    while self.do_run:
                        time.sleep(1) # now do nothing, keep connection open
                    return
        try:
            self.socket.close()
        except:
            pass
        print "Connection closed"

def main(argv):
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.bind((globs.HOST, globs.PORT))
    s.listen(1)
    print "Listening on port %d" % globs.PORT

    while True:
        try:
            (sock, (ip, port)) = s.accept()
            print "Connected by %s:%d" % (ip, port)
            newthread = RigApiThread(ip, port, sock)
            globs.threads.append(newthread)
            newthread.start()

        except KeyboardInterrupt:
            break

    for t in globs.threads:
        t.cancel()

    for t in globs.threads:
        t.join()

    s.close()

if __name__ == "__main__":
    sys.exit(main(sys.argv))
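
For reference, the result list served by the simulator above can be unpacked as in this minimal Python sketch. The field meanings follow the comments in the script; `parse_getstat1` is a hypothetical helper, and the order of the two values in each fan/temperature pair is not established in this thread, so the pairs are kept as raw tuples.

```python
def parse_getstat1(result):
    """Turn the list of strings in a miner_getstat1 "result" into a dict."""
    total_khs, shares = result[2].split(";")[:2]
    pair_values = [int(v) for v in result[6].split(";")]
    return {
        "version": result[0],
        "runtime_minutes": int(result[1]),
        "total_khs": int(total_khs),
        "shares": int(shares),
        "gpu_khs": [int(v) for v in result[3].split(";")],
        # Two values per GPU; which one is fan and which is temperature is
        # not stated in the thread, so no names are attached here.
        "fan_temp_pairs": list(zip(pair_values[0::2], pair_values[1::2])),
        "pool": result[7],
    }


# Sample values copied from the simulator script above.
sample = [
    "ethminer-xx.yy",
    "60",
    "185128;6981;0",
    "22264;22990;23312;23232;23635;22990;23070;23635",
    "0;0;0",
    "off;off;off;off;off;off;off;off",
    "63;98;63;79;63;43;63;62;63;42;63;73;63;84;63;52",
    "testpool.xyz:1234",
    "0;0;0;0",
]
stats = parse_getstat1(sample)
print(stats["total_khs"], sum(stats["gpu_khs"]))  # 185128 185128
```

In the simulated data the per-GPU hashrates add up to the reported total, which gives a simple consistency check when parsing a real response.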
rodneymo commented 6 years ago

Thanks for this!!!! I will look into it tomorrow