moopless / stratum-mining-litecoin

This fork is no longer maintained. Please use https://github.com/ahmedbodi/stratum-mining
Other
38 stars 35 forks source link

INFO template_registry # Job id 'fd1' not found #69

Open ghost opened 10 years ago

ghost commented 10 years ago

Hi

I just started using stratum (upgraded my pool from pushpoold), and I have been very happy with it. It is an amazing product for sure. I had run into an issue today when a user came to the litecoin pool with over 30MHs. Most of their shares were rejected and this error kept showing up in my stratum.log file.

2013-12-02 15:16:05,212 INFO template_registry # Job id 'fd1' not found 2013-12-02 15:16:05,212 INFO interfaces # False (0) INVALID user.1

any idea why this happened, and what i can do to fix it so they can mine again?

Thanks

CaptainAK commented 10 years ago

I am experiencing this issue as well. Anyone have any ideas?

TheSerapher commented 10 years ago

Mining with SHA and not scrypt?

CaptainAK commented 10 years ago

I wish it was that easy :) No I have watched the logs and the same worker will submit many valid shares and then suddenly start submitting shares where we get the "Job id not found" error. And it is not just the fast workers, that have the issue. It seems to be pool wide when the pool hashrate is high and even more so when there are several users with very high individual hashrate.

It appears to be more of a issues where Stratum is not notifying workers of a block/job change quickly enough, and the workers are turning in work for old (no longer existent) jobs. Could this be solved by speeding up the merkle refresh rate or some other tuning?

obigal commented 10 years ago

I've seen this as well in the last few weeks, weird as hell. Doesn't make any sense, seems at best random, different pools at different times, but consistent. I've had more odd stuff happen in the last few weeks than I care to recall, something isn't right.

scratchy commented 10 years ago

i have several occurencis of this "job not found problem" in my pool does anyone have a fix to this?

ahmedbodi commented 10 years ago

i reckon job based work would solve this but no ones taken up the 70 LTC bounty to fix it yet

obigal commented 10 years ago

I didn't know there was a bounty, now I'm intrigued. What exactly are we trying to fix?

' Job id not found ' is a normal occurrence on new blocks, there is no fixing that. Period.

In my situation, apparently there was never a new block notification received by my miner (cgminer) and it continues to hash old work, the pool obviously knows about the change because it is rejecting the shares since it has already moved on to the next block. I use the block notify script with PREVHASH_REFRESH_INTERVAL = 60 What I have been trying to figure out is where the fault is.

  1. Did the daemon even try to notify stratum?
  2. Did the daemon send the notification and stratum never received it?
  3. Did stratum receive the notification but choked on it and never broadcasted update?
  4. Did stratum received notification and then broadcast new block but it never made it to the miner?

@ahmedbodi when you say job based what exactly do you mean? Stratum is job based already so I'm a little confused.

ahmedbodi commented 10 years ago

its not its block based atm. join the #unitedminers-2 channel on freenode and everyone will explain its easier that way

Fredyy90 commented 10 years ago

I have the same problem, stratum was running for hours without problems, but now a lot Job id 'XX' not found messages appear in my log.

Also Exceptions like this started to appear: 2013-12-11 06:23:11,408 ERROR protocol # Processing of message failed Traceback (most recent call last): File "/usr/local/lib/python2.7/dist-packages/stratum-0.2.12-py2.7.egg/stratum/protocol.py", line 185, in dataReceived self.lineReceived(line, request_counter) File "/usr/local/lib/python2.7/dist-packages/stratum-0.2.12-py2.7.egg/stratum/protocol.py", line 216, in lineReceived raise custom_exceptions.ProtocolException("Cannot decode message '%s'" % line) ProtocolException: Cannot decode message ''

Lexazan commented 10 years ago

Same problem here. It always starts with this exception:

2013-12-13 10:45:28,466 INFO block_updater # Checking for new block.
2013-12-13 10:45:28,473 INFO block_updater # New block! Prevhash: b5c70ae1f142b11b7d2e8a15b1dbdf7b85e304d705465629b37a8c99d4375ab2
2013-12-13 10:45:28,481 INFO template_registry # New template for b5c70ae1f142b11b7d2e8a15b1dbdf7b85e304d705465629b37a8c99d4375ab2
2013-12-13 10:45:28,486 ERROR template_registry # [Failure instance: Traceback: : 'NoneType' object has no attribute 'write'
/usr/lib/python2.7/dist-packages/twisted/internet/defer.py:1095:gotResult
/usr/lib/python2.7/dist-packages/twisted/internet/defer.py:1082:_inlineCallbacks
/usr/lib/python2.7/dist-packages/twisted/internet/defer.py:362:callback
/usr/lib/python2.7/dist-packages/twisted/internet/defer.py:458:_startRunCallbacks
---  ---
/usr/lib/python2.7/dist-packages/twisted/internet/defer.py:545:_runCallbacks
/home/---/stratum-mining-cnc/lib/template_registry.py:133:_update_block
/home/---/stratum-mining-cnc/lib/template_registry.py:103:add_template
/home/---/stratum-mining-cnc/mining/subscription.py:26:on_template
/usr/local/lib/python2.7/dist-packages/stratum-0.2.12-py2.7.egg/stratum/pubsub.py:54:emit
/usr/local/lib/python2.7/dist-packages/stratum-0.2.12-py2.7.egg/stratum/pubsub.py:160:emit
/usr/local/lib/python2.7/dist-packages/stratum-0.2.12-py2.7.egg/stratum/pubsub.py:66:emit_single
/usr/local/lib/python2.7/dist-packages/stratum-0.2.12-py2.7.egg/stratum/protocol.py:102:writeJsonRequest
/usr/local/lib/python2.7/dist-packages/stratum-0.2.12-py2.7.egg/stratum/protocol.py:83:transport_write
]

and then goes "Job id not found"

2013-12-13 10:45:28,557 INFO template_registry # Job id 'e4' not found
2013-12-13 10:45:28,557 INFO interfaces # False (0) INVALID some_worker
2013-12-13 10:45:28,637 INFO interfaces # 0000008182b4afba63a1ead5c430c0344ee56e6bf8f72fc8dab401987d80f4aa (506) valid some_worker.1
2013-12-13 10:45:28,766 INFO template_registry # Job id 'e0' not found
2013-12-13 10:45:28,766 INFO interfaces # False (0) INVALID some_worker.4
2013-12-13 10:45:28,840 INFO interfaces # 0000006ff28b8258966428222a5d74bcc9014a624c3fe5a8c84cd4e805bf7008 (585) valid some_worker.4
2013-12-13 10:45:28,870 INFO template_registry # Job id 'e4' not found
2013-12-13 10:45:28,871 INFO interfaces # False (0) INVALID some_worker.4
2013-12-13 10:45:28,996 INFO template_registry # Job id 'e0' not found
2013-12-13 10:45:28,996 INFO interfaces # False (0) INVALID some_worker.5
2013-12-13 10:45:29,099 INFO interfaces # 000001be10113b5287df4ba800f88be56efb66eadcb5ac9fbd8e336cd586cb29 (146) valid some_worker.3
2013-12-13 10:45:29,217 INFO template_registry # Job id 'e0' not found
2013-12-13 10:45:29,218 INFO interfaces # False (0) INVALID some_worker.1
2013-12-13 10:45:29,229 INFO template_registry # Job id 'e4' not found
2013-12-13 10:45:29,229 INFO interfaces # False (0) INVALID some_worker.worker2
2013-12-13 10:45:29,278 INFO template_registry # Job id 'e0' not found
2013-12-13 10:45:29,278 INFO interfaces # False (0) INVALID some_worker.30
2013-12-13 10:45:29,304 INFO template_registry # Job id 'e4' not found
2013-12-13 10:45:29,304 INFO interfaces # False (0) INVALID some_worker
2013-12-13 10:45:29,327 INFO template_registry # Job id 'e0' not found

It may last for some seconds or more than 10 minutes, with random appearance interval.

Weird thing is that some of the workers are submitting valid shares at the same time. It looks like it is triggered when several blocks are found in a very short time interval, like several seconds.

Stratum restart seems to fix it for some random time. (Just because jobs are recreated)

My assumption is that jobs are being deleted before workers get notified.

helderdb commented 8 years ago

Same problem here. Makes my pool somewhat unstable...