selaux / miner-dashboard

Node.js based app to show the current status of your miner in a browser.

Minerdash runs but then falls over #50

Open drak42 opened 10 years ago

drak42 commented 10 years ago

Hi,

I got it running and have started adding hosts.

After a while it falls over with this:

2014-05-13T21:21:45.183Z - info: 222.154.249.121 - GET / HTTP/1.1
Error: EMFILE, readdir '/opt/miner-dashboard/frontend/views/partials'
glob error { [Error: EMFILE, readdir '/opt/miner-dashboard/frontend/views/partials'] errno: 20, code: 'EMFILE', path: '/opt/miner-dashboard/frontend/views/partials' }

In the browser I see this: Error: EMFILE, readdir '/opt/miner-dashboard/frontend/views/partials'

Also graphs don't seem to be working.

Chris

drak42 commented 10 years ago

2014-05-13T23:53:12.733Z - info: miner2 - error fetching data code=EMFILE, errno=EMFILE, syscall=connect
2014-05-13T23:53:12.734Z - info: miner9 - error fetching data code=EMFILE, errno=EMFILE, syscall=connect
2014-05-13T23:53:12.735Z - info: miner3 - error fetching data code=EMFILE, errno=EMFILE, syscall=connect
2014-05-13T23:53:12.736Z - info: miner5 - error fetching data code=EMFILE, errno=EMFILE, syscall=connect
2014-05-13T23:53:12.736Z - info: miner6 - error fetching data code=EMFILE, errno=EMFILE, syscall=connect
2014-05-13T23:53:12.738Z - info: miner7 - error fetching data code=EMFILE, errno=EMFILE, syscall=connect

selaux commented 10 years ago

Just create the /opt/miner-dashboard/frontend/views/partials directory; it doesn't seem to be included in the zip.

drak42 commented 10 years ago

Hi,

Done that. The app still runs for a while, then seems to start losing its connection to the devices.

2014-05-14T08:56:26.549Z - info: miner10 - error fetching data code=ETIMEDOUT, errno=ETIMEDOUT, syscall=connect
2014-05-14T08:56:26.549Z - info: miner2 - error fetching data code=ETIMEDOUT, errno=ETIMEDOUT, syscall=connect
2014-05-14T08:56:26.550Z - info: miner9 - error fetching data code=ETIMEDOUT, errno=ETIMEDOUT, syscall=connect
2014-05-14T08:56:26.550Z - info: miner5 - error fetching data code=ETIMEDOUT, errno=ETIMEDOUT, syscall=connect
2014-05-14T08:56:26.551Z - info: miner8 - error fetching data code=ETIMEDOUT, errno=ETIMEDOUT, syscall=connect

They are all on the same network, though on two segments. The only difference is that I reference them by the firewall's external IP and port, which I then NAT internally to each miner. That allows me to add remote boxes in the same way.

selaux commented 10 years ago

Can you try one of the example scripts of cgminer (e.g. https://github.com/ckolivas/cgminer/blob/master/api-example.py) to check whether it is an issue with miner-dashboard or whether the miner is just not reachable anymore?

drak42 commented 10 years ago

Bit of a script kiddie here...

How would I call this to run it and what would I need to change?

selaux commented 10 years ago

Download the script to the host where miner-dashboard is running, then run the following (replace the IP and port if necessary):

python2 api-example.py summary 10.153.210.1 4028

This should give you an output like

{u'STATUS': [{u'STATUS': u'S', u'Msg': u'Summary', u'Code': 11, u'When': 1400059950, u'Description': u'cgminer 4.3.0'}], u'id': 1, u'SUMMARY': [{u'Difficulty Accepted': 112591.0, u'Pool Rejected%': 0.0053, u'Found Blocks': 0, u'Difficulty Rejected': 6.0, u'MHS 15m': 2987.56, u'Device Rejected%': 0.0053, u'Pool Stale%': 0.0843, u'Work Utility': 41.26, u'Rejected': 2, u'Elapsed': 163798, u'Hardware Errors': 975, u'Accepted': 31079, u'Network Blocks': 374, u'Local Work': 522033, u'Get Failures': 4, u'Difficulty Stale': 95.0, u'Total MH': 489078591.0, u'Device Hardware%': 0.8581, u'Discarded': 319983, u'Stale': 25, u'MHS av': 2985.86, u'Getworks': 5937, u'MHS 5s': 3199.98, u'Best Share': 58023, u'MHS 1m': 3018.63, u'MHS 5m': 2992.61, u'Last getwork': 1400059950, u'Remote Failures': 0, u'Utility': 11.38}]}
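
The same reachability check can be sketched in Node.js, the language miner-dashboard itself is written in. This is a minimal sketch, not miner-dashboard's actual code: it assumes the miner accepts a JSON command such as `{"command": "summary"}` over TCP, replies with JSON that may end in a NUL byte, and then closes the connection, as api-example.py expects. The names `parseReply` and `queryMiner` and the 5000 ms default are made up for illustration.

```javascript
const net = require('net');

// Strip the trailing NUL byte(s) the miner may append, then parse the JSON reply.
function parseReply(raw) {
  return JSON.parse(raw.replace(/\0+$/, ''));
}

// queryMiner and its parameters are illustrative names, not miner-dashboard code.
function queryMiner(host, port, command, timeoutMs = 5000) {
  return new Promise((resolve, reject) => {
    const socket = net.connect({ host, port });
    let raw = '';

    // Give up (and free the socket) if the miner has not answered in time.
    const timer = setTimeout(() => {
      socket.destroy(new Error(`no reply from ${host}:${port} after ${timeoutMs} ms`));
    }, timeoutMs);

    socket.setEncoding('utf8');
    socket.on('connect', () => socket.write(JSON.stringify({ command })));
    socket.on('data', (chunk) => { raw += chunk; });
    socket.on('error', reject);
    socket.on('close', () => clearTimeout(timer));
    socket.on('end', () => {
      try {
        resolve(parseReply(raw));
      } catch (err) {
        reject(err);
      }
    });
  });
}
```

Calling `queryMiner('10.153.210.1', 4028, 'summary').then(console.log, console.error)` from the dashboard host should print either the parsed summary or the connection error, which helps separate a dashboard problem from an unreachable miner.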

drak42 commented 10 years ago

Here you go

sudo python api-example.py summary x.x.x.x 4035
{u'STATUS': [{u'STATUS': u'S', u'Msg': u'Summary', u'Code': 11, u'When': 1400060131, u'Description': u'cgminer 3.7.2'}], u'id': 1, u'SUMMARY': [{u'Difficulty Accepted': 17885.913479819999, u'Pool Rejected%': 2.0135999999999998, u'Found Blocks': 0, u'Difficulty Rejected': 367.55075749999997, u'Device Rejected%': 46.941299999999998, u'Pool Stale%': 0.0, u'Work Utility': 13.199999999999999, u'Rejected': 16, u'Elapsed': 3558, u'Hardware Errors': 117, u'Accepted': 767, u'Network Blocks': 52, u'Local Work': 1659, u'Get Failures': 0, u'Difficulty Stale': 0.0, u'Total MH': 1359.1251999999999, u'Device Hardware%': 13.0, u'Discarded': 836, u'Stale': 0, u'MHS av': 0.38, u'Getworks': 417, u'MHS 5s': 0.38, u'Best Share': 921033, u'Remote Failures': 0, u'Utility': 12.93}]}

drak42 commented 10 years ago

As I said, npm start works fine for a while, then suddenly seems to lose connectivity.

drak42 commented 10 years ago

had no issues with version 2 though...

selaux commented 10 years ago

Hm, I didn't change anything having to do with polling the miner status from 0.2.0 to 0.3.0.

Now to get some more information:

selaux commented 10 years ago

PS: The issue might have been there before; the logging is a new thing.

selaux commented 10 years ago

Another thing: Do all connections fail? Do you get any updated timestamps in the dashboard?

drak42 commented 10 years ago

Will get those details for you shortly. Yes, all connections fail. It runs perfectly for a few minutes, then seems to lose all connections. Sometimes a few come back, then they drop off again too.

drak42 commented 10 years ago

Hi,

Got some time to do a few tests.

  1. It took about 2 minutes to fall over and lose connections to all devices.
  2. Issuing the API command to a device still returns data with no problems
  3. Here is some output for you:

     36 (SYN_SENT)
     node 3610 root 1013u IPv4 51342215 0t0 TCP BlackBOX.fritz.box:50053->x.86.204.y.static.snap.net.nz:44036 (SYN_SENT)
     node 3610 root 1014u IPv4 51342216 0t0 TCP BlackBOX.fritz.box:46318->x.86.204.y.static.snap.net.nz:44037 (SYN_SENT)
     node 3610 root 1015u IPv4 51342217 0t0 TCP BlackBOX.fritz.box:46319->x.86.204.y.static.snap.net.nz:44037 (SYN_SENT)
     node 3610 root 1016u IPv4 51342218 0t0 TCP BlackBOX.fritz.box:46320->x.86.204.y.static.snap.net.nz:44037 (SYN_SENT)
     node 3610 root 1017u IPv4 51342604 0t0 TCP BlackBOX.fritz.box:36358->x.86.204.y.static.snap.net.nz:44038 (SYN_SENT)
     node 3610 root 1018u IPv4 51342605 0t0 TCP BlackBOX.fritz.box:36359->x.86.204.y.static.snap.net.nz:44038 (SYN_SENT)
     node 3610 root 1019u IPv4 51342606 0t0 TCP BlackBOX.fritz.box:36360->x.86.204.y.static.snap.net.nz:44038 (SYN_SENT)
     node 3610 root 1020u IPv4 51342607 0t0 TCP BlackBOX.fritz.box:47004->x.86.204.y.static.snap.net.nz:44039 (SYN_SENT)
     node 3610 root 1021u IPv4 51342608 0t0 TCP BlackBOX.fritz.box:47005->x.86.204.y.static.snap.net.nz:44039 (SYN_SENT)
     node 3610 root 1022u IPv4 51342609 0t0 TCP BlackBOX.fritz.box:47006->x.86.204.y.static.snap.net.nz:44039 (SYN_SENT)
     node 3610 root 1023u IPv4 51342306 0t0 TCP BlackBOX.fritz.box:40078->.86.204.y.static.snap.net.nz:44031 (SYN_SENT)

As I mentioned, I am doing port NAT at the firewall level to access units in different networks. They all exist behind my public IP.

Thanks
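
A listing like the one above is easier to read once the sockets are grouped by TCP state. SYN_SENT means a connection attempt has been sent but not yet answered, and each such socket still holds a file descriptor (the 1013u to 1023u column). The following is a minimal sketch for summarising lsof-style text; `countTcpStates` is a hypothetical helper, not part of lsof or miner-dashboard.

```javascript
// Count TCP connection states in lsof-style output.
// countTcpStates is a hypothetical helper, not part of lsof or miner-dashboard.
function countTcpStates(lsofText) {
  const counts = {};
  for (const line of lsofText.split('\n')) {
    // Only consider TCP socket rows that end with a state such as "(SYN_SENT)".
    const match = /\bTCP\b.*\((\w+)\)\s*$/.exec(line);
    if (match) {
      counts[match[1]] = (counts[match[1]] || 0) + 1;
    }
  }
  return counts;
}
```

A SYN_SENT count that keeps growing between two runs would suggest that connection attempts are being started faster than they complete or time out.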

drak42 commented 10 years ago

Example of my configs:

id: 'miner1',
module: 'miners/bfgminer',
title: 'Rock Solid Miner 1 - Dual Sappihre R9 270x',
host: '203.86.204.25',
port: 44030

Port 44030 on my firewall NATs to port 4030 on an internal device in the 192.168.1.x range.

selaux commented 10 years ago

Can you try the current master? I tried a fix.

drak42 commented 10 years ago

I did a git pull update, hope that's ok. Same thing, runs perfectly for a while then falls over.

I get these errors in the log when I try to refresh the browser after it fails.

2014-05-19T06:23:39.705Z - info: 192.168.1.102 - GET / HTTP/1.1
Error: EMFILE, open '/opt/miner-dashboard/frontend/views/index.hbs'
glob error { [Error: EMFILE, readdir '/opt/miner-dashboard/frontend/views'] errno: 20, code: 'EMFILE', path: '/opt/miner-dashboard/frontend/views' }
2014-05-19T06:23:40.396Z - info: miner1 - error fetching miner data Error: connect EMFILE
2014-05-19T06:23:40.399Z - info: miner10 - error fetching miner data Error: connect EMFILE

In the browser I get this:

Error: EMFILE, open '/opt/miner-dashboard/frontend/views/index.hbs'

I will try a clean installation. Also, I get the following warnings from npm update/install. Not sure if they mean anything.

npm WARN engine hawk@0.10.2: wanted: {"node":"0.8.x"} (current: {"node":"v0.10.28","npm":"1.4.9"})

npm WARN unmet dependency /opt/miner-dashboard/node_modules/grunt-browserify requires async@'~0.7.0' but will load
npm WARN unmet dependency /opt/miner-dashboard/node_modules/async,
npm WARN unmet dependency which is version 0.8.0
npm WARN unmet dependency /opt/miner-dashboard/node_modules/grunt/node_modules/js-yaml/node_modules/argparse requires underscore.string@'~2.3.1' but will load
npm WARN unmet dependency /opt/miner-dashboard/node_modules/grunt/node_modules/underscore.string,
npm WARN unmet dependency which is version 2.2.1
npm WARN unmet dependency /opt/miner-dashboard/node_modules/handlebars/node_modules/uglify-js requires async@'~0.2.6' but will load
npm WARN unmet dependency /opt/miner-dashboard/node_modules/async,
npm WARN unmet dependency which is version 0.8.0

npm WARN engine cryptiles@0.1.3: wanted: {"node":"0.8.x"} (current: {"node":"v0.10.28","npm":"1.4.9"})
npm WARN engine sntp@0.1.4: wanted: {"node":"0.8.x"} (current: {"node":"v0.10.28","npm":"1.4.9"})
npm WARN engine hoek@0.7.6: wanted: {"node":"0.8.x"} (current: {"node":"v0.10.28","npm":"1.4.9"})
npm WARN engine boom@0.3.8: wanted: {"node":"0.8.x"} (current: {"node":"v0.10.28","npm":"1.4.9"})

npm WARN optional dep failed, continuing fsevents@0.2.0

selaux commented 10 years ago

I'm out of ideas. It looks like the connections to cgminer cannot be opened or closed correctly and you eventually run out of file descriptors that you are allowed to open. You could increase the limit via ulimit, but that will just delay the errors. I'll leave this open, maybe I'll get some ideas in the future.
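
To see why raising the limit only delays the errors, a back-of-the-envelope estimate helps. The sketch below assumes every poll opens a new connection, none of them is answered, and each unanswered attempt holds its file descriptor until the OS connect timeout expires. The 120-second timeout is an assumption for illustration and varies by system; the function name is made up.

```javascript
// Rough upper bound on sockets stuck in connect() at any moment when every
// poll opens a new connection and none of them is answered.
// All names and the example numbers below are illustrative assumptions.
function estimatePendingSockets(minerCount, pollIntervalMs, connectTimeoutMs) {
  const pollsPerTimeout = Math.ceil(connectTimeoutMs / pollIntervalMs);
  return minerCount * pollsPerTimeout;
}

const assumedTimeoutMs = 120000; // assumed OS connect timeout; varies by system
console.log(estimatePendingSockets(10, 1000, assumedTimeoutMs)); // 1200
console.log(estimatePendingSockets(10, 5000, assumedTimeoutMs)); // 240
```

Under these assumptions, ten miners polled every second could tie up more descriptors than a per-process limit of 1024 (a common default, and consistent with the descriptor numbers near 1023 in the lsof output earlier in this thread), while a 5-second interval would stay well below it.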

selaux commented 10 years ago

All right, it didn't let me go. After some tests with your miners :wink:, I think we have two issues:

Let me know if there is any news with the current master.

NB: With that many miners, you might want to increase the polling interval (the default is every second, to keep the frontend responsive). I think something around 5 seconds would be better (less traffic, almost the same value).
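
Besides a longer interval, a poller can also refuse to start a new request while the previous one is still pending, so slow or unanswered connections cannot pile up. The following is a minimal sketch of that idea, not how miner-dashboard actually polls; `createPoller`, `fetchFn` and `onResult` are hypothetical names.

```javascript
// Poll fetchFn every intervalMs, but never start a new request while the
// previous one is still pending. All names here are hypothetical.
function createPoller(fetchFn, intervalMs, onResult) {
  let inFlight = false;
  let skipped = 0;

  async function tick() {
    if (inFlight) {
      skipped += 1; // previous poll has not finished: skip this round
      return;
    }
    inFlight = true;
    try {
      onResult(null, await fetchFn());
    } catch (err) {
      onResult(err);
    } finally {
      inFlight = false;
    }
  }

  const timer = setInterval(tick, intervalMs);
  return {
    tick,
    stop: () => clearInterval(timer),
    skippedCount: () => skipped,
  };
}
```

With this guard, at most one connection per miner is open at any time, regardless of how short the interval is; a rising `skippedCount()` indicates that the miner responds more slowly than the interval allows.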

drak42 commented 10 years ago

Hey, thanks for all the help!

Been running for 10 minutes now so looking good :)

drak42 commented 10 years ago

Spoke too soon...

I'm redoing the network next week to eliminate the firewall, and I'll get back to you after that. The issue may lie there.

selaux commented 10 years ago

Any update?

drak42 commented 10 years ago

Hey mate, just getting back to this. The problem still exists; I'm wondering if it's something related to the antminers...

drak42 commented 10 years ago

I changed the default time value of 1000 in bfgminer.js to 5000 and it seems to be working.