JayDDee / cpuminer-opt

Optimized multi algo CPU miner

[Question] Is full thread restart actually needed in solo mode? #400

Closed YetAnotherRussian closed 9 months ago

YetAnotherRussian commented 9 months ago

cpuminer --scantime=N -t 1 --cpu-affinity 1

Depending on the block target time (anywhere from 5 to 300 seconds across existing PoW coins), we choose the scantime setting above, and the miner polls for new work once every N seconds.

If we set scantime=30:

image

If we set scantime=1:

image

There is an evident loss of hashrate (total hashes / total time). If the block time is 5s, scantime shouldn't be set above 5s, to avoid mining stale work.

That's our "job" response which doesn't change:

... "target": "000000a633000000000000000000000000000000000000000000000000000000", "mintime": 1695195517, "noncerange": "00000000ffffffff", "curtime": 1695196133, "height": 1234567 ...

And that's our log:

image

So why should we restart our threads on every poll, wasting time, instead of restarting only when the block height or the job data has changed?

This strategy leads to a significant hashrate loss in multiminer (a fork of your repo with partial GPU support), too.

Another problem is that hardware components do not like the "waves" of temperature caused by the TDP drop during these pauses (yes, there is no way to remove them all, as we must restart on a new block height).

I see a way to store previous response data in memory and compare it to a new response. If no changes in block height and job => no restart until new block height or new job data.
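A minimal sketch of that comparison, in Python for illustration only (cpuminer-opt itself is written in C). The names `WorkTracker`, `job_fingerprint` and `COMPARE_KEYS` are hypothetical, and the choice of fields is an assumption: `height` and `target` appear in the response above, `previousblockhash` is assumed to be present as in BIP 22-style templates, and `curtime` is deliberately left out because it changes on every poll.

```python
# Hypothetical sketch: which fields define "the same work" is an assumption.
COMPARE_KEYS = ("height", "target", "previousblockhash")


def job_fingerprint(response, keys=COMPARE_KEYS):
    """Reduce a job response to the fields that define the work."""
    return tuple(response.get(k) for k in keys)


class WorkTracker:
    """Remembers the last job seen and reports whether a new poll differs."""

    def __init__(self, keys=COMPARE_KEYS):
        self.keys = keys
        self.last = None

    def needs_restart(self, response):
        """Return True only when the work-defining fields changed since the last poll."""
        fingerprint = job_fingerprint(response, self.keys)
        changed = fingerprint != self.last
        self.last = fingerprint
        return changed


tracker = WorkTracker()
job = {"target": "000000a633", "height": 1234567, "curtime": 1695196133}
print(tracker.needs_restart(job))                          # first poll
print(tracker.needs_restart(dict(job, curtime=1695196134)))  # only curtime moved
print(tracker.needs_restart(dict(job, height=1234568)))      # new block height
```

Comparing a small tuple of fields keeps the check cheap, but it also means a changed transaction set alone would not trigger a restart; whether that is acceptable depends on the coin.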

What do you think about adding a new issue called "discussion" and leaving it forever? Some things may be just asked&answered w/o issue bureaucracy...

JayDDee commented 9 months ago

This is going to require some thought. With stratum we expect the server to interrupt when new work arrives. This is the main intent of restart_threads: it is the signal to the miner threads to fetch the new work. Otherwise the mining threads keep their heads down and just keep hashing. The only other trigger is when a thread runs out of nonces and has to increment extranonce2.

With solo it's different, there's no stratum thread to signal the mining threads about new work, it's up to the mining threads themselves to poll the server for new work. This does open a window for stale work depending on the scan time.

I optimized for stratum, which may have negatively affected solo. The optimization was to reduce overhead in preparing new work that is the same as the old work. The improvement is measurable.

At this time each solo mining thread acts independently to poll the server. The first one to find new work prepares it, logs New Work and signals the other mining threads via restart_threads, but there's no associated New Work log.

I'll investigate further.

JayDDee commented 9 months ago

Currently restart_threads is called every scantime, new work or not, as I stated above. I can change that to only do it with new work.

The scantime issue is a little more complicated. The optimum scantime depends on the "new work" time. That will vary with block time and hashrate and seems to be a tuning exercise for the user. I don't know if the miner is smart enough to autotune it.

YetAnotherRussian commented 9 months ago

> The scantime issue is a little more complicated. The optimum scantime depends on the "new work" time. That will vary with block time and hashrate and seems to be a tuning exercise for the user.

At least three more things:

  1. Ping penalty. I'm pretty sure some >2ms localhost ping times are due to the CPU being busy (which is logical in the case of CPU mining).
  2. RPC server "work" when the server is hosted on the mining machine as well. I've seen the coin's JSON-RPC server produce up to 50% load on one core when "1 < scantime < 3". This reduces hashrate by up to 15% and is wasted work.
  3. Turbo boost/turbo core status (enabled/disabled/locked) and power plan. May lose up to 50-70 ms per "get job - stop - restart threads" cycle when turbo is enabled but not locked, or when a balanced/power saver plan is active. That is the time needed to go from e.g. 1.5GHz to e.g. 4GHz and to pin to the best cores if affinity is unused.

I get the best results using "blocktime / 15 < scantime < blocktime / 10". The current implementation makes low values (including the default one) somewhat inefficient, although they should be effective. Ideally, this time should be as low as possible while still accounting for points 1-3 above.
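As a worked example of that rule of thumb, here is a small Python sketch. The function name `scantime_window` is hypothetical, and the sample block times are only illustrative values taken from the 5 to 300 second range mentioned earlier in the thread.

```python
def scantime_window(blocktime_s):
    """Return the (low, high) scantime bounds from blocktime/15 < scantime < blocktime/10."""
    return blocktime_s / 15, blocktime_s / 10


for blocktime in (5, 60, 300):
    low, high = scantime_window(blocktime)
    print(f"blocktime {blocktime:>3}s -> scantime between {low:.2f}s and {high:.2f}s")
```

For short block times the window falls below one second; the thread does not say whether --scantime accepts fractional values, so such results may need rounding up.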

> I can change that to only do it with new work.

Seems to be a good idea, but maybe it should be put behind a separate --scan-jobs option or something like that, to avoid breaking the current logic.

UPD. I've also found a small issue which is not handled (or not known?) in cpuminer-opt: ntime handling. If the coin's RPC server time is e.g. 23:00 while the client (cpuminer-opt) time is 22:00, your blocks will never be accepted. Some people may get annoyed by rejects when using old Windows versions or mining across different operating systems.

2023-09-05 16:19:35 ERROR: CheckProofOfWork(): hash doesn't match nBits
2023-09-05 16:19:35 ERROR: CheckProofOfWork : non-AUX proof of work failed, hash=000000000c76c3df09f35de958951425649f24aad5f3c2f34738da1d4ca4d0cd, algo=3, nVersion=614663684, PoWHash=418cbda26e4946b5e8bf331d9cac433141c395d11465bbada79e605e9a2022aa
2023-09-05 16:19:35 ERROR: ProcessNewBlock: AcceptBlock FAILED

Not sure if something could be done here (check current system time against job's time?), just FYI.
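One way such a check could look, as a hedged Python sketch rather than anything cpuminer-opt does today. The helpers `clock_skew` and `skew_warning` are hypothetical, the 600 second threshold is an arbitrary assumption, and `curtime` is taken from the job response shown earlier.

```python
import time


def clock_skew(job, now=None):
    """Seconds by which the local clock is ahead of the job's curtime (negative if behind)."""
    if now is None:
        now = time.time()
    return now - job["curtime"]


def skew_warning(job, max_skew_s=600, now=None):
    """Return a warning string if local and server time differ too much, else None."""
    skew = clock_skew(job, now)
    if abs(skew) > max_skew_s:
        return f"local clock differs from server curtime by {skew:+.0f}s"
    return None


job = {"curtime": 1695196133}
# Simulate a client whose clock is one hour behind the server.
print(skew_warning(job, now=job["curtime"] - 3600))
```

The measured difference includes the delay between the server building the response and the client reading it, so the threshold should be generous.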

JayDDee commented 9 months ago

Use of scantime is correct, no new option is needed. Most coins have a block time of 5 mins, it's hard to find a default that works for every coin. Like I said I don't think the miner is smart enough to autotune.

Is the ntime issue about specific times or any difference? Ntime is part of the block header (work) sent by the server, and a stale ntime is a stale share. There is no synchronizing with system time by the miner; ntime is used by the server to match the submit with the provided data, like a job check in stratum. But the error you posted was about nbits. Nbits is used to set the target difficulty and target hash with stratum. With solo the miner ignores nbits, as the hash target is provided by the server. The miner then converts the hash target to a difficulty target. I don't know how nbits has any effect in solo. With stratum nbits only changes when the stratum diff changes. Maybe it's just another indication of stale work?
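The target-to-difficulty conversion mentioned above can be illustrated with a short Python sketch. This is not cpuminer-opt's code: `target_to_difficulty` is a hypothetical name, and the difficulty-1 constant is an assumption (the Bitcoin-style value), which may differ per coin or algorithm.

```python
# Assumption: Bitcoin-style difficulty-1 target, 0x00000000FFFF followed by zeros.
DIFF1_TARGET = 0xFFFF << 208


def target_to_difficulty(target_hex, diff1=DIFF1_TARGET):
    """Convert a 256-bit big-endian hex hash target into a difficulty value."""
    target = int(target_hex, 16)
    if target == 0:
        raise ValueError("target must be non-zero")
    return diff1 / target


# The target from the job response quoted earlier in the thread.
target = "000000a633000000000000000000000000000000000000000000000000000000"
print(target_to_difficulty(target))
```

A larger target means easier work, so the difficulty is inversely proportional to it.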

JayDDee commented 9 months ago

> What do you think about adding a new issue called "discussion" and leaving it forever? Some things may be just asked&answered w/o issue bureaucracy...

Bitcointalk is a good place for that.

https://bitcointalk.org/index.php?topic=5226770.msg53865575#msg53865575

JayDDee commented 9 months ago

> 3. Turbo boost/turbo core status (enabled/disabled/locked) and power plan. May lose up to 50-70 ms for each cycle "get job - stop - restart threads" in case of being enabled but not locked or balanced/power saver plan being active... Time to go from e.g. 1.5GHz to e.g. 4GHz and pin to best cores if affinity unused.

This is the tradeoff with stale work: checking for it is disruptive and requires a significant amount of work, including the merkle hash, prehash and other administrative work like checking conditional mining and gathering stats. And due to the mutex, all mining threads have to wait for this work to be done. The correct balance is to check only as frequently as necessary to prevent stale work.
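To put rough numbers on that tradeoff, here is a simplified Python model. It assumes a fixed cost per poll/restart cycle that is paid once every scantime; the 0.07 s figure is the upper end of the 50-70 ms estimate quoted above, not a measurement, and `lost_fraction` is a hypothetical helper.

```python
def lost_fraction(scantime_s, restart_cost_s):
    """Fraction of wall time spent on the poll/restart cycle instead of hashing."""
    return min(1.0, restart_cost_s / scantime_s)


for scantime in (1, 5, 30):
    print(f"scantime {scantime:>2}s: {lost_fraction(scantime, 0.07):.2%} of time lost")
```

The model ignores the cost of stale work, which grows in the opposite direction as scantime increases, so it only shows one side of the balance.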

JayDDee commented 9 months ago

I'm preparing another release that will remove the thread restart flood, plus some other stuff. Last call for anything else that needs attention.

JayDDee commented 9 months ago

cpuminer-opt-3.23.3 is released.

Edit: corrected version typo.

YetAnotherRussian commented 9 months ago

> cpuminer-opt-3.23.3 is released.

I'm on it. Seems to be OK except for the max diff option:

image

But that can simply go into the next release, if needed.

JayDDee commented 9 months ago

This is a good thing. Threads restart every 5s while the miner is paused to retest the condition. It has nothing to do with polling for new work.

Clarification: The thread restarts are not triggered by polling for new work, but by conditional mining. However, new work is fetched as a result of the thread restart.