Open mechanator opened 5 years ago
Specifically, this applies to https://moneroocean.stream, where the pool switches the algo on the fly, from monero to randomx and sometimes rx-arq, depending on profitability. The pool decides this, and since some algos are not merge-mineable, the gpu settings on nvidia cards have to change on the fly as well.
GPU hashrate on all random-x series algorithms is significantly lower than CPU hashrate, even though the GPU costs more in electricity. So why is GPU support a must?
For now, GPU support won't be put on the agenda before v1 unless it offers some advantage.
As for algorithm rotation, it's easy for a pool to switch to a different daemon but hard for random-x miners to switch to a different rx variant. Unlike cryptonight, random-x takes several seconds, or even minutes (without large pages and JIT optimization), to initialize the dataset before mining can start. That is a real loss of revenue, which the pool wouldn't consider.
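To put rough numbers on that trade-off, here is a minimal sketch in Python. It assumes the miner produces no hashes at all while the dataset is being initialized; the function names and every figure in the usage note are hypothetical, not measurements from any pool or miner.

```python
def fraction_lost_to_init(init_seconds: float, dwell_seconds: float) -> float:
    """Share of the time spent on one algorithm that goes to dataset init."""
    if dwell_seconds <= 0:
        raise ValueError("dwell_seconds must be positive")
    # Capped at 1.0: init can never cost more than the whole dwell period.
    return min(init_seconds / dwell_seconds, 1.0)


def switch_pays_off(current_rate: float, new_rate: float,
                    init_seconds: float, dwell_seconds: float) -> bool:
    """Compare revenue from staying put with revenue from switching.

    Rates are revenue per second in any consistent unit (assumed inputs).
    Switching only earns during the time left after initialization.
    """
    stay = current_rate * dwell_seconds
    switch = new_rate * max(dwell_seconds - init_seconds, 0.0)
    return switch > stay
```

For example, with a hypothetical 30 second init and a 600 second stay on the new algorithm, `fraction_lost_to_init(30, 600)` is 0.05, so the new algorithm has to pay more than about 5% better before the switch is worth it.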
Multi-rx-algo support is definitely on the agenda, but it is not aimed at algorithm rotation.
I replied to 8 issues on xmrig-nvidia so they could be closed; they came from not understanding that older cards need cuda8. I also submitted an issue to randomX and xmrig-nvidia about fixing the tuning problems on nvidia cards. The biggest issue is that you have to manually adjust threads/blocks for each algo you are mining. The defaults work, but tuning can gain you 25% or more. The new edge case is a pool like moneroocean that can rotate algos on a whim, so the miner application needs to pick the optimal threads/blocks settings for each card on the fly. This also affects RandomX mining on nvidia, since it has the same damned problem. So I wrote thus: "The real issue is also that there are so many card variants with differing numbers of cuda cores, especially in the GTX range, among cards with the same chip or fewer cores. The on-the-fly detection algorithm for tuning gets you to about 80% of max hashrate. However, the right number of threads/blocks changes with each algorithm mined. On top of that, if you are on a rotating-algorithm mining pool, the static settings you entered might not work when the miner is configured with "algo": "auto" and "variant": "auto" in the config.json file.
A good proposal for fixing this would be a lookup table holding optimized settings for each card variant, with some headroom on threads and the correct block settings. Reduce the number of threads so you don't hit each card's VRAM limit, assuming the card is running under Windows, which reserves some RAM for the base drivers. This memory allocation problem is moot on cards with 4GB or more, but you can run into it on 1-3GB VRAM cards. The math for determining threads and blocks, as documented by xmr-stak, doesn't make much sense. You can't just sort the threads/blocks settings by architecture; that would be a simple 5-6 case statement. They are determined by the SMX count, the amount of free usable RAM on the GPU, and some kind of MOD divisor based on the number of CUDA cores involved in the SMX allocation.
The advantage is that you can get 25% more hashrate when you manually tune the cards by approximation, incrementally stepping the threads/blocks up or down. However, that doesn't work if you are mining on a pool that may rotate algos. Also, if a coin changes its algorithm settings, possibly through a fork, you might no longer be set for the highest possible hashrate (or you might crash)."
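The lookup-table idea from the quoted proposal could look roughly like the Python sketch below. This is only an illustration, not how xmrig-nvidia or xmr-stak actually work: the table entries are made-up placeholders rather than measured tunings, and `LaunchConfig`, `TUNING_TABLE`, `select_config`, `mb_per_hash` and `reserve_mb` are hypothetical names. It also assumes memory use scales as threads x blocks x memory per hash.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class LaunchConfig:
    threads: int
    blocks: int


# Keyed by (card model, algo). Placeholder values, NOT real tunings.
TUNING_TABLE: Dict[Tuple[str, str], LaunchConfig] = {
    ("GTX 1050 Ti", "rx/0"): LaunchConfig(threads=32, blocks=18),
    ("GTX 1050 Ti", "rx/arq"): LaunchConfig(threads=16, blocks=24),
}

# Conservative fallback for cards or algos missing from the table.
DEFAULT_CONFIG = LaunchConfig(threads=8, blocks=8)


def clamp_threads(cfg: LaunchConfig, free_vram_mb: float,
                  mb_per_hash: float, reserve_mb: float = 256) -> LaunchConfig:
    """Lower the thread count until threads * blocks * mb_per_hash fits.

    reserve_mb is headroom for whatever the OS and drivers keep for
    themselves (an assumed figure, not a documented one).
    """
    usable_mb = max(free_vram_mb - reserve_mb, 0)
    max_threads = int(usable_mb // (cfg.blocks * mb_per_hash))
    if max_threads < 1:
        raise ValueError("not enough free VRAM for even one thread per block")
    return LaunchConfig(threads=min(cfg.threads, max_threads), blocks=cfg.blocks)


def select_config(card: str, algo: str, free_vram_mb: float,
                  mb_per_hash: float) -> LaunchConfig:
    """Look up the tuned settings for this card and algo, then fit to VRAM."""
    cfg = TUNING_TABLE.get((card, algo), DEFAULT_CONFIG)
    return clamp_threads(cfg, free_vram_mb, mb_per_hash)
```

A miner on a rotating pool would call `select_config` again each time the pool announces a new algo, so every algo gets its own threads/blocks instead of one static pair. Small cards get clamped: with 1024 MB free and a hypothetical 2 MB per hash, the 32-thread entry above drops to 21 threads.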