Closed borzaka closed 6 years ago
This is related to poor performance of Nicehash proxies. Contact them and let them know the issue that you are having with their pool.
Try XMR-Cast, someone else reported that as a fix earlier. If that works fine, then that's proof that it is a stak problem, and not a nicehash one.
I have the same issue. I went to https://www.nicehash.com/support and submitted a ticket. If someone tries XMR-cast and does not have an issue, please let us know.
Guys, Hi, I have the same problem as mentioned above. I'm mining with 5 x Vega 64 and sometimes I have this problem: Hash rate dropped from 10249.0 H/s to H/s
Hash rate dropped from 10249.0 H/s to H/s Not a STAK error. JJs hash monitor is not supported.
What do you mean mate ? It happened again Restarting after 17 Hours 38 Min - Hash rate dropped from 10138.5 H/s to H/s
Your problem is probably caused by beta blockchain driver and/or too much overclock. The blockchain driver is very sensitive, especially with Vega. Common issue, I have it too, but not XMR-Stak related.
Have you found a resolution, @borzaka? Have same issue of many, many "Job not found", as well as a fair share of "Invalid nonce; is miner not compatible with NiceHash?", on my five-Vega rig. Miner reporting 9300 H/s, but after all these "Job not found" errors, calculated/pool-side hashrate is in the 7000s. Yikes.
Doesn't appear to be network/connectivity-related: a console ping to the server comes back in 20ish ms, and shares are being accepted in (a reasonable) 75 - 80 ms range. No network connection errors being shown in xmr-stak. Had the "Job not found" errors on a couple XMR pools, but it was never as bad as it is with NiceHash's CryptoNight pools. (I've tried both USA and JP. Same issue.)
Would love to hear if anyone has tried switching to cast-xmr and had this problem go away. Then, at least, we'd know there's something to fix in xmr-stak, which it doesn't seem like there likely is. Indeed, I think NiceHash seems to be the problem.
Have contacted NiceHash by opening support ticket as @JerichoJones suggested. Not hopeful they'll investigate but we shall see.
I have also submitted a ticket to NiceHash. This is 100% NiceHash related. There is no problem mining to another CryptoNight pool with the same settings.
While this is Nicehash related, it is also xmr-stak related.
These are my 6xVega56 statistics and tomorrow I'll upload an evening of castXMR's data, but I guarantee it will be less than 4% in lost jobs AND devfee combined. castXMR patched and made available the --fastjobswitch option which solved the vast majority of these issues (some small number of "job not found" errors will occur on any mining software due to bad luck and high difficulty of Nicehash).
I will add fastswitch soon. But the errors of the NiceHash pool cannot be fixed within the miner.
I have run both XMR-STAK & CAST XMR for over 24 hours. CAST has a --fastjobswitch option that fixes the stale job errors. With it, the error rate dropped to ~1%. Without the option, the error rate was about 4-5%.
STAK had 4~6% error rate, similar to other users here.
@psychocrypt Looking forward to the fast switch mode! Thank you in advance!
Here's the results of running castXMR overnight, as promised:
Fastjobswitch would be a huge addition. Thank you!
Just contributing more data - longer runtime, same machine, castXMR output.
Now compare the data against what the pool reports.
Reply from NiceHash support:
Hello!
We suggest you to use our official NiceHash Miner, which you can download from here: https://miner.nicehash.com
All other variants of mining software may not be compatible with NiceHash due to some protocol specifics.
Kind regards, NiceHash Team
Switched to Cast XMR with --fastjobswitch and getting much, much fewer invalid results. Pool-side/calculated hashrate is about 4 - 6% higher now vs. with xmr-stak (and all the "Job not found" errors). No choice but to switch to Cast with my 6-Vega rig. :(
Same here. I went from a ~20-25% failure rate with XMR-Stak to a 99%+ success rate with Cast. Something is up, and it only started recently too. Problems on 2 Vega rigs with stak :(
@borzaka I've been using the official NiceHash Miner (NHML 1.8.2.0) and honestly is as bad if not worse than the latest version of most of the miners included with their software.
I got the same reply VERBATIM... with an addition stating that they would look into the matter and get back to me when they found a cause.
On average I'm showing 4-5% errors on the miner side...
Difficulty : 400015
Good results : 2313 / 2405 (96.2 %)
Avg result time : 22.3 sec
Pool-side hashes : 5400191
Top 10 best results found:
| 0 | 1821927604 | 1 | 358736341 |
| 2 | 177323025 | 3 | 174301948 |
| 4 | 165253070 | 5 | 140589606 |
| 6 | 137797074 | 7 | 88931207 |
| 8 | 82396103 | 9 | 81954457 |
Error details:
| Count | Error text | Last seen |
| 85 | Job not found. | 2018-01-24 23:39:15 |
| 6 | Invalid nonce; is miner not comp | 2018-01-24 10:00:46 |
| 1 | AMD Invalid Result | 2018-01-24 16:34:25 |
As you can see, multiple times throughout the day my profitability is down to ZERO (0.0000) BTC. Mind you, usually i get 5+ of those 0 points... 2-3 the last 24hours is a bit of a blessing.
I've contacted customer support three (3) times over the course of 2 weeks with one reply back saying they would look into it and get back to me.
Hi. I have been using the software with NiceHash too. I was also getting those dips. I would get periods of time where transfers would drop low, and when I check my log I sometimes see a series of disconnects over the course of 45 minutes.
I tried changing my software so many times in 2 weeks that I cannot count. I got frustrated, so I explored using xmr-stak independently, without the NiceHash front end.
I ran tests on my smaller machines connecting to various other XMR (Monero) pools, to compare. I wanted to see how their system statistics compare with my internal ones. When xmr-stak is run against servers that are NOT NiceHash, I do NOT get the drop-valleys, reports of socket disconnects, and/or long periods of unusually delayed pool accepts.
The "average result time" on every other xmr pool I have tested so far was below 30 seconds. "average result time" for all my higher end machines using nicehash is well over 130 seconds, and sometimes 200, 300, seconds. When I find 500+ seconds I reset xmr-stak, and try and find the problem. I try to analyze but have nothing conclusive - besides a theory for disruption in service from my site to the nicehash server pool.
"Average result time" has everything to do with the difficulty. Nicehash's static difficulty on CryptoNight starts at 200,000 which is well, WELL above vardiff for other pools. This is why it takes very long to get a result, and why sometimes when you do, you submit a bad share. This is very likely related to fast job switching, not Nicehash directly.
I have validated that castXMR poolside submits all clean shares with no issues when --fastjobswitch is enabled, but xmr-stak has the same problem castXMR used to have before that option existed (3-5% stale shares).
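To put rough numbers on the difficulty point, here is a minimal sketch. It assumes the pool difficulty is approximately the expected number of hashes per accepted share, so the mean time between shares is difficulty divided by hashrate. The 6500 H/s rig and the 30,000 vardiff value are illustrative assumptions, not figures reported by any pool.

```python
def expected_share_time(difficulty: float, hashrate_hs: float) -> float:
    """Mean seconds between shares, assuming one share per `difficulty` hashes on average."""
    if hashrate_hs <= 0:
        raise ValueError("hashrate must be positive")
    return difficulty / hashrate_hs


# Hypothetical rig speed; pick your own value.
RIG_HASHRATE = 6500.0

# 200,000 is the NiceHash starting difficulty quoted above;
# 30,000 is a made-up vardiff value used only for comparison.
for label, diff in [("NiceHash static", 200_000), ("hypothetical vardiff", 30_000)]:
    seconds = expected_share_time(diff, RIG_HASHRATE)
    print(f"{label} (difficulty {diff}): ~{seconds:.1f} s per share")
```

Because shares arrive rarely at high difficulty, each one that lands on an outdated job is a larger slice of the total, which is consistent with the percentages people are reporting here.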
Recent mining details using castXMR on a long run.
For all NiceHash miners, #978 is also interesting. We had a very small nonce overlap, which results in a very, very small possibility that duplicated shares can be created.
Thanks, will test on the next commit.
I have the same issue (Job not found errors around 5%). I've been mining on NiceHash's cryptonight EU pool but tested other NiceHash pools with the same or worse results. I do believe this is partly an issue on NiceHash's end because I am experiencing a lot of disconnects (socket errors) while using a NiceHash pool. When mining ETN or monero on a nanopool server I don't get stale jobs or experience any disconnects.
As @kyleboddy already pointed out:
I have validated that castXMR poolside submits all clean shares with no issues when --fastjobswitch is enabled, but xmr-stak has the same problem castXMR used to have before that option existed (3-5% stale shares).
Is there a way to implement this setting (from castXMR) into xmr-stak? I had hoped that #977 or #978 would help solve this issue but #977 is closed and it looks like #978 will be closed soon.
Has anyone had success with the code changes in #977 or #978?
I'm currently using the latest dev branch, compiled yesterday (01.26.).
After ~8 hours of mining to NiceHash:
RESULT REPORT
Difficulty : 200007
Good results : 612 / 635 (96.4 %)
Avg result time : 48.2 sec
Pool-side hashes : 198207386
Top 10 best results found:
| 0 | 219852473 | 1 | 192800281 |
| 2 | 124500090 | 3 | 90206818 |
| 4 | 56914121 | 5 | 49705601 |
| 6 | 34597221 | 7 | 32560097 |
| 8 | 29064942 | 9 | 28624119 |
Error details:
| Count | Error text | Last seen |
| 21 | Job not found. | 2018-01-27 12:30:51 |
| 2 | AMD Invalid Result GPU ID 1 | 2018-01-27 06:51:59 |
No more Invalid nonce, only Job not founds, but it's way more than with the latest release version (~16 hours, 27 Job not found).
AMD Invalid Result GPU ID 1 is because of too much overclock, I know.
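The comparison between the dev branch and the latest release can be put on a per-hour basis. This is only a sketch using the approximate run lengths quoted above (~8 hours and ~16 hours); the function name is made up for illustration.

```python
def errors_per_hour(count: int, hours: float) -> float:
    """Average number of errors per hour over a run."""
    if hours <= 0:
        raise ValueError("hours must be positive")
    return count / hours


dev = errors_per_hour(21, 8)       # dev branch: 21 "Job not found" in ~8 h
release = errors_per_hour(27, 16)  # latest release: 27 "Job not found" in ~16 h

print(f"dev: {dev:.2f}/h, release: {release:.2f}/h, ratio: {dev / release:.2f}x")
```

Both run lengths are rough, so the ratio should be read as an indication rather than a measurement.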
Update: After another ~6 hours of mining the Invalid nonce is back:
RESULT REPORT
Difficulty : 400015
Good results : 500 / 521 (96.0 %)
Avg result time : 47.2 sec
Pool-side hashes : 155557138
Top 10 best results found:
| 0 | 1094365433 | 1 | 410295691 |
| 2 | 248598190 | 3 | 213010633 |
| 4 | 40915004 | 5 | 32428932 |
| 6 | 22759348 | 7 | 17998561 |
| 8 | 17237854 | 9 | 15058986 |
Error details:
| Count | Error text | Last seen |
| 19 | Job not found. | 2018-01-28 02:03:40 |
| 2 | Invalid nonce; is miner not comp | 2018-01-28 00:10:21 |
Still with the latest dev branch, compiled yesterday (01.26.).
@psychocrypt do you have any updates on your test run with #977? Thanks.
Compiled the latest version from dev branch which includes #996 today. xmr-stak running for about 9 hours now without "invalid nonce" errors. Thank you for your hard work @psychocrypt & @fireice-uk.
RESULT REPORT
Difficulty : 200007
Good results : 406 / 417 (97.4 %)
Avg result time : 98.1 sec
Pool-side hashes : 65602296
[...]
Error details:
| Count | Error text | Last seen |
| 9 | Job not found. | 2018-01-30 19:54:34 |
| 2 | AMD Invalid Result GPU ID 0 | 2018-01-30 19:36:25 |
I'm still having the Invalid nonce error with the latest dev branch, compiled yesterday (01.30.):
After ~7.5 hours mining to NiceHash:
HASHRATE REPORT - AMD
| ID | 10s | 60s | 15m | ID | 10s | 60s | 15m |
| 0 | 1023.3 | 1022.2 | 1021.3 | 1 | 1021.0 | 1022.0 | 1021.3 |
| 2 | 1016.8 | 1021.1 | 1022.8 | 3 | 1026.7 | 1022.6 | 1022.9 |
| 4 | 939.5 | 939.6 | 939.5 | 5 | 936.4 | 939.2 | 939.5 |
Totals (AMD): 5963.7 5966.7 5967.3 H/s
-----------------------------------------------------------------
Totals (ALL): 6545.4 6590.5 6598.4 H/s
Highest: 6625.0 H/s
-----------------------------------------------------------------
RESULT REPORT
Difficulty : 400015
Good results : 545 / 559 (97.5 %)
Avg result time : 48.4 sec
Pool-side hashes : 159806669
Top 10 best results found:
| 0 | 946095324 | 1 | 451201395 |
| 2 | 85282705 | 3 | 39346610 |
| 4 | 33953383 | 5 | 29997236 |
| 6 | 26507963 | 7 | 22740497 |
| 8 | 22330262 | 9 | 19314346 |
Error details:
| Count | Error text | Last seen |
| 1 | Invalid nonce; is miner not comp | 2018-01-31 09:32:54 |
| 13 | Job not found. | 2018-01-31 16:26:50 |
Me too but I get way fewer errors than before.
I am also getting fewer errors from the dev version than from the last release. Running for ~12 hours now. None of them are Invalid nonce; only "RECEIVE error: socket closed". Also, from the NiceHash UI it seems the rate has improved and does not drop as much.
I'm running the latest dev. Here's my results from a 12 hour run. Getting a ton of job not founds and 1 nonce error. Not sure if job not founds are poolside related.
RESULT REPORT
Difficulty : 200007
Good results : 546 / 574 (95.1 %)
Avg result time : 72.3 sec
Pool-side hashes : 84002947
Top 10 best results found:
| 0 | 39347943 | 1 | 26166618 |
| 2 | 22577700 | 3 | 19944564 |
| 4 | 19102979 | 5 | 18865856 |
| 6 | 17145834 | 7 | 13023604 |
| 8 | 11178336 | 9 | 10409639 |
Error details:
| Count | Error text | Last seen |
| 7 | [NETWORK ERROR] | 2018-02-01 11:06:28 |
| 20 | Job not found. | 2018-02-01 19:29:26 |
| 1 | Invalid nonce; is miner not comp | 2018-02-01 10:55:24 |
@hchan123 What pool are you connected to? Getting a lot of "Job not found" and connection errors as well. I am connected to cryptonight.eu.nicehash.com:33355.
@minerbird Check the issue Title. We are mining to NiceHash with this XMR-Stak CryptoNight miner.
After ~13 hours:
Error details:
| Count | Error text | Last seen |
| 24 | Job not found. | 2018-02-03 21:32:55 |
| 2 | Invalid nonce; is miner not comp | 2018-02-03 12:20:33 |
dev branch compiled on 01.30.
I also have this issue, although I'm running v2.0.0, it doesn't look like its been resolved in the latest dev versions. I'm wondering if there's anything else we can do to help the Devs resolve this?
I tested to mine against the eu proxy of nicehash with one rx570 and get no error job not found. Currently it is not possible to reproduce the issue.
@psychocrypt For how many hours have you tested? Because NiceHash's CryptoNight difficulty is very high, and therefore shares are submitted very rarely, the problem will come up later on slow machines. I have 6500 H/s, your RX 570 has around 700 H/s. Based on my statistics, you should get your first job not found after ~5 hours.
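The ~5 hour estimate can be reproduced with a quick scaling sketch. It assumes the number of "Job not found" errors scales linearly with hashrate (because shares are submitted proportionally more often), and it uses the 24 errors in ~13 hours reported earlier in this thread for the 6500 H/s rig. The function name is made up for illustration.

```python
def hours_to_first_error(errors: int, hours: float,
                         ref_hashrate: float, target_hashrate: float) -> float:
    """Scale the observed mean time between errors to a rig of different speed.

    Assumes error frequency is proportional to hashrate.
    """
    hours_per_error = hours / errors
    return hours_per_error * (ref_hashrate / target_hashrate)


# 24 "Job not found" in ~13 h at ~6500 H/s, scaled to an RX 570 at ~700 H/s.
estimate = hours_to_first_error(24, 13, 6500, 700)
print(f"expected time to first 'Job not found': ~{estimate:.1f} h")
```

This is an average, so a short test on a single slow card can easily finish without showing a single error.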
Just to add, my rigs run at 16 KH/s each. Logs from one of the rigs:
Job not found. 150 | 2018-02-05 09:24:16
Invalid nonce; is miner not compatible with NiceHash? 6 | 2018-02-04 09:29:34
Pool address | stratum+tcp://cryptonight.eu.nicehash.com:3355
Connected since: 2018-02-04 11:00:35
ping time : 81 ms
Come on @psychocrypt. There are a bunch of people telling you that they’re getting about 5% invalid results, myself included.
I see them in real-time in the error log. It always happens after a new job is detected. And the first result submitted after that sometimes is an invalid result. To my novice self, it seems like the mining software isn’t switching the new job fast enough, as it’s still trying to submit shares on the old job. Thoughts?
@minerbird I usually connect to cryptonight.usa.nicehash.com:33355 but also have cryptonight.eu.nicehash.com:33355 and cryptonight.uk.nicehash.com:33355 as failover.
After several hours I can now reproduce the issue. As some of you wrote, the reason is that a share is sent to the pool during or shortly after a new job arrives. Normally this is shown as an expired block, but NiceHash reports it under an uncommon error. The point is that there is no reason why the share of the previous job should not be valid and paid. This is only the case for pools with bad software.
@toekyoe The miner is not switching the job too slowly. The reason for this "error" is that the last bunch of hashes is still being calculated on the GPU. There are possible solutions which throw away good hashes but will reduce this "error"; this is what castxmr is doing. What I will say: we can avoid this "error" in xmr-stak, but this will be only a visual effect and will be equal to cheating (it only beautifies the statistics).
The argument that castxmr is not having this issue is not correct: nobody knows how the error counting works in castxmr, and the fastswitch option is only an option to throw away good hashes.
If someone opens a poll where the community can vote on whether the fast switch option should be implemented in xmr-stak or not, I will discuss it with @fireice-uk again.
It seems to me that nicehash have changed something around 20180210 18:00:00 CET.
Before I had around 2% "Job not found" plus some (a few) "Invalid nonce". Now I have only "Job not found" errors but it is a solid 4%. I'm doing 24 hours runs.
To me this change is clear on the nicehash plots; maybe it is something local to my setup maybe not.
cancel the current calculated bunch of hashes if a new job arrived The GPU is running 90% of the time (~2sec) one kernel, if we cancel the current job after the kernel and start direct with the new job we will only emulate good hashes without any benefit in time. This is what "fastswitch" in castxmr is doing.
@psychocrypt This part does not make sense to me: "only emulate good hashes without any benefit in time". How can you emulate a good share? Either the share is good, or it's bad. Emulating a bad share seems more plausible.
What I do not understand is, canceling the current job would still result in a ~2 sec run but no submitted stale share? Or cancelling the current job would drop the current work and immediately switch (at the cost of losing at most ~2 seconds on a stale share, average 1 second). But the logic is that stale share will be paid for by the pool.
So the use cases, correct me if I am wrong.
So to summarize, running to completion at worst will spend 2 seconds mining a stale share. Interrupting would at best save 2 seconds NOT mining a stale share? Interrupting at worst would waste ~1.99999 seconds mining the now stale share and drop the result?
So say we are mining nicehash, and the hashrate spiked fast,
you end up on a quick coin, say 15 seconds between blocks due to hashrate burst.
This is hypothetical worst case scenario. Hashing at 10,000 h/s.
4:25:0 BLOCK
4:25:2 Switched to new Block (2 sec lost mining stale, no share found)
4:25:15 BLOCK
4:25:17 Switched to new Block (2 sec lost mining stale, found share, NiceHash: Invalid Job)
4:25:30 BLOCK
4:25:32 Switched to new Block (2 sec lost mining stale, no share found)
4:25:45 BLOCK
4:25:47 Switched to new Block (2 sec lost mining stale, no share found)
4:25:57 BLOCK
4:25:59 Switched to new Block (2 sec lost mining stale, no share found)
Reported Hashrate 10,000 h/s
Real Hashrate mining shares that count 8,340 h/s
So in the last 60 seconds we had 5 blocks due to a nicehash hashrate burst.
We have 1000 rigs. Each rig got the new work.
We spent 10 seconds out of 60 mining stale shares,
we lost 16.7% of our hashrate technically.
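The arithmetic of this hypothetical worst case can be checked with a short sketch. All inputs are the assumed figures from the scenario above (10,000 H/s reported, 5 job switches in 60 seconds, 2 seconds spent on the old job per switch); none of them are measurements.

```python
def effective_hashrate(reported_hs: float, window_s: float,
                       job_switches: int, stale_s_per_switch: float):
    """Return (effective H/s, stale fraction) when a fixed amount of time
    per job switch is spent hashing on an outdated job."""
    stale_s = job_switches * stale_s_per_switch
    stale_fraction = stale_s / window_s
    return reported_hs * (1.0 - stale_fraction), stale_fraction


eff, frac = effective_hashrate(10_000, 60, 5, 2)
print(f"stale time: {frac:.1%} of the window, effective hashrate: ~{eff:.0f} H/s")
```

The exact result is about 8,333 H/s, which the scenario rounds to 8,340. With a more typical job interval the stale fraction shrinks accordingly.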
To switch to new job with Cast XMR is about ~20-40 ms in my case:
[18:04:44] New job received. Avg Job Time: 49.0 sec
[18:04:44] GPU0: 26 ms needed to switch to new job
[18:04:44] GPU2: 34 ms needed to switch to new job
[18:04:44] GPU1: 37 ms needed to switch to new job
Not seconds!
- --fastjobswitch option (experimental) to force fast switching to new jobs. When enabled the GPUs will switch to the new job within less than 100ms. This improves the effective hash rate by 1 to 5% as nearly no GPU performance will be wasted on calculating outdated shares anymore. The drawback is a slightly lower displayed hashrate as while switching no hashes are calculated.
One of those days I swear I will put an option that simply does hashrate = hashrate * 1.10 like castxmr
Until then, if you don't know why pool needs stale shares, please learn what an orphaned block is and how mining works.
@fireice-uk orphaned block is most of the time the result of a valid solved block that did not propagate properly (bandwidth constraints, your node did not have many peers, etc). It could also be the result of a 51% attack where total nodes in the network are rejecting your valid blocks from the main chain and are solving faster their own blocks due to hash power which are getting accepted by >51% of the network.
A stale share could lead to solving two blocks at the same height, then the highest difficulty share would be taken as valid in most cases (depends on blockchain).
Could you enlighten us how mining works then? Because I think some of us are not seeing the connection of how stale shares are useful to the underlying block chain.
@vans163 I think your math is a bit drastic. Disregarding for a moment whether or not this option should be added... I just want to bound exactly what we are talking about.
As you mention in https://github.com/fireice-uk/xmr-stak/issues/835#issuecomment-370311854 the best case savings (assuming no lost switching time) would average to 1 second per occurrence (it is actually less because there is a switching time).
@borzaka provides a good datapoint in https://github.com/fireice-uk/xmr-stak/issues/835#issuecomment-361976592 that it occurs about 13 times in 7.5 hrs
@borzaka provides another datapoint in https://github.com/fireice-uk/xmr-stak/issues/835#issuecomment-362852777 that it occurs about 24 times in 13 hrs.
That works out as follows:
13 seconds lost mining / 27,000 seconds mining = 0.05% reduction in effective hash rate
24 seconds lost mining / 46,800 seconds mining = 0.05% reduction in effective hash rate
This performance discussion is about five hundredths of a percent (0.0005). It is not wrong to have such performance discussions, but it is nowhere near as significant as you mention above.
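For anyone who wants to redo this with their own counts, here is the same calculation as a small sketch. It assumes, as above, an average of 1 second lost per "Job not found" occurrence; the function name is made up for illustration.

```python
def lost_fraction(occurrences: int, seconds_lost_each: float, hours: float) -> float:
    """Fraction of mining time lost, given a count of stale-job events."""
    return occurrences * seconds_lost_each / (hours * 3600.0)


# The two datapoints quoted above: 13 events in 7.5 h and 24 events in 13 h.
for events, hours in [(13, 7.5), (24, 13.0)]:
    frac = lost_fraction(events, 1.0, hours)
    print(f"{events} events in {hours} h: {frac:.4%} of hashing time lost")
```

Note that this counts only the time lost per reported error, not the rejected shares themselves.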
@vans163
Since NiceHash's programming is too simple to allow something like that (you need to store the old templates), they just throw away the block at point 3.
The fact that cough some cough post nonsense like that:
Yes some pools are very fair and credit stale shares (screwing over honest miners at the expense of stale miners, this is why I don't mine to those pools), but the higher paying places do not, like NiceHash exactly.
Shows that perhaps marketing is more important than the numbers to some people
After mining ~16 hours, this is my result:
My config
config.txt:
amd.txt
cpu.txt
Basic information
xmr-stak/2.2.0/c4400d19/master/win/nvidia-amd-cpu/aeon-monero/20