nanopool / nanominer

Nanominer is a versatile tool for mining cryptocurrencies on GPUs and CPUs.
https://nanominer.org

DAG Not Always Generated on AMD #77

Closed: jch9678 closed this issue 3 years ago

jch9678 commented 4 years ago

I'm running into this problem on 290 and Vega 56 cards. Sometimes the DAG is not generated when the miner starts. And when it is generated, the miner later seems to lose it: the miner keeps running, but GPU usage drops to 0. If I let it run, it eventually regenerates the DAG and mining resumes until it loses it again.

Grumpy-Dwarf commented 4 years ago

Does the miner behave this way at stock settings, with no overclocking? Could you please provide miner logs showing the lost DAG and the slow DAG generation?

jch9678 commented 4 years ago

Not overclocking. The cards run at stock, and sometimes underclocked as well as undervolted.

Here are the past 3 logs and the current running log of a Vega 56 machine. It says a GPU is hung, but there aren't any signs of it hanging. It restarts a few times, then the DAG generates and it mines. You can see that it has been restarted a lot.

log_2020-05-13_10-17-24.log log_2020-05-13_10-43-06.log log_2020-05-13_10-14-22.log log_2020-05-13_10-45-18.log

jch9678 commented 4 years ago

Here is a 290x machine that was running at stock settings. I turned off the watchdog on this one. It also shows the GPUs hung up. I don't know, maybe it's not the DAG generation.

log_2020-05-13_10-41-38.log

jch9678 commented 4 years ago

Maybe the DAG generation is too harsh. Is it possible to slow it down? Wasn't there something like that when mining Ethash?

Grumpy-Dwarf commented 4 years ago

Looks like the issue is not DAG generation itself but sporadically hanging GPUs. The watchdog marks a GPU as stalled if it has not finished a command within two minutes. It could be a hardware problem, such as risers, or a single GPU that makes the driver consume all available CPU resources. Does CPU usage look normal?
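To make the two-minute rule concrete, here is a minimal Python sketch of a stall detector of that kind. It is not nanominer's actual watchdog: the class, method names, and injectable clock are hypothetical, and only the 120-second timeout comes from the comment above.

```python
import time

STALL_TIMEOUT_S = 120.0  # "two minutes", per the comment above


class GpuWatchdog:
    """Toy stall detector. Hypothetical names, not nanominer's implementation."""

    def __init__(self, gpu_ids, timeout_s=STALL_TIMEOUT_S, clock=time.monotonic):
        self._clock = clock
        self._timeout_s = timeout_s
        now = clock()
        # Treat start-up as the last time each GPU completed a command.
        self._last_done = {gpu: now for gpu in gpu_ids}

    def command_finished(self, gpu):
        """Record that `gpu` has just completed a command."""
        self._last_done[gpu] = self._clock()

    def stalled(self):
        """Return the sorted GPUs with no completed command within the timeout."""
        now = self._clock()
        return sorted(
            gpu for gpu, done in self._last_done.items()
            if now - done > self._timeout_s
        )
```

Under this model a GPU that is merely slow (for example, waiting on a driver that is starved of CPU time) looks the same as one that has truly hung, which is why the log can report a hang with no visible sign of one.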

jch9678 commented 4 years ago

CPU usage looks fine, it's low. This is happening across several mining rigs with hardware that has been good for years.

jch9678 commented 4 years ago

Which driver is used for testing?

Grumpy-Dwarf commented 4 years ago

The Kawpow algorithm consumes much more power than Ethash: 985 W for a rig with 6 Vega 56 cards. It could be a power supply issue. Does it work if only one device is enabled for mining? With half of the devices? With all except one? (Use the devices option, e.g. devices=0,1,2.)
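The isolation runs asked for here (one device, half of the devices, all except one) can be listed mechanically. The Python sketch below only prints candidate values in the `devices=0,1,2` form quoted above; the function names are hypothetical, and it assumes devices are numbered from 0 without gaps.

```python
def device_subsets(device_count):
    """Return (label, device_indices) pairs for the isolation runs suggested above."""
    all_devices = list(range(device_count))
    half = device_count // 2
    runs = [
        ("first half", all_devices[:half]),
        ("second half", all_devices[half:]),
    ]
    runs += [(f"only {d}", [d]) for d in all_devices]
    runs += [
        (f"all except {d}", [x for x in all_devices if x != d])
        for d in all_devices
    ]
    # Drop empty subsets, which occur on a single-device rig.
    return [(label, indices) for label, indices in runs if indices]


def devices_line(indices):
    """Format indices like the `devices=0,1,2` option mentioned in the thread."""
    return "devices=" + ",".join(str(i) for i in indices)


if __name__ == "__main__":
    for label, indices in device_subsets(6):
        print(f"{label:>14}: {devices_line(indices)}")
```

If the hang follows one index through the "only" and "all except" runs, that points at a single card, riser, or cable rather than the miner.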

Grumpy-Dwarf commented 4 years ago

Please also check the GPU power cables. Some people report that theirs overheated and burnt.

Grumpy-Dwarf commented 4 years ago

That's what I mean: https://www.youtube.com/watch?v=9GuThuR4DMs

jch9678 commented 4 years ago

There's plenty of power: at most 3 Vega 56 cards are hooked up to a 900 W server power supply.

When I get back to my machines I'll double-check, but I don't think it's a physical problem. All my machines were mining Argon2 Chukwa with no problems, and that consumes more power than Kawpow: 165 W vs 210 W.
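As a rough check of the numbers quoted in this thread, the Python sketch below works out the per-card draw and the load on one supply. It assumes the 985 W figure splits evenly across the 6 cards and ignores risers, motherboard, and PSU efficiency, so treat the result as a ballpark only.

```python
RIG_POWER_W = 985.0    # Kawpow draw quoted earlier for a 6 x Vega 56 rig
RIG_CARDS = 6
PSU_RATING_W = 900.0   # server power supply rating from this comment
CARDS_PER_PSU = 3      # "at most 3 Vega 56" per supply
HIGHEST_CARD_W = 210.0 # larger of the two per-card figures quoted here


def per_card_watts(total_w, cards):
    """Average draw per card, assuming an even split (an assumption)."""
    return total_w / cards


def psu_load_fraction(card_w, cards, psu_w):
    """Fraction of the PSU rating used by `cards` cards at `card_w` each."""
    return card_w * cards / psu_w


if __name__ == "__main__":
    kawpow_card_w = per_card_watts(RIG_POWER_W, RIG_CARDS)
    print(f"per card: {kawpow_card_w:.0f} W")
    for card_w in (kawpow_card_w, HIGHEST_CARD_W):
        load = psu_load_fraction(card_w, CARDS_PER_PSU, PSU_RATING_W)
        print(f"{card_w:.0f} W x {CARDS_PER_PSU} = {load:.0%} of {PSU_RATING_W:.0f} W")
```

On these figures three cards stay well under the supply's rating in either case, which fits the point that total capacity is not the limiting factor. It says nothing about individual cables or connectors.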

jch9678 commented 4 years ago

Ha, Red Panda, I watch his videos but haven't seen this yet. Ah it came out today. He's using splitter cables, I would never.

jch9678 commented 4 years ago

He sets up his rigs wrong. He shouldn't use the same power supply to power the GPUs and the risers.

My 290x rigs were mining scrypt back in the day and pushing 300W.

jch9678 commented 4 years ago

Not sure if this helps, but I was able to make the DAG fail to load without a GPU hanging. This is on a machine with newly installed 20.4.2 drivers. First I started 1.9.3; it loaded the DAG and started finding shares. I closed it and started 1.9.2, and this time the DAG was not created. I was doing this because I noticed that 1.9.2 is faster by at least 1 MH/s.

1.9.3_log_2020-05-14_15-18-18.log 1.9.2_log_2020-05-14_15-19-32.log