fireice-uk / xmr-stak

Free Monero RandomX Miner and unified CryptoNight miner
GNU General Public License v3.0

Non mining CPU High Usage cause from MSI-Afterburner #2075

Open CryptoWedge opened 5 years ago

CryptoWedge commented 5 years ago

Not sure what's going on here. It only happens in XMR-Stak. I added a 5th GPU to my rig yesterday and everything runs fine. Other miners such as cryptodredge behave fine through various tests and monitoring. But with XMR-Stak I get high CPU usage while NOT MINING on my CPU. When I open Performance Monitor to see what's using the CPU, MSI Afterburner and "Deferred Procedure Calls and Interrupt Service Routines" are in the 1st and 2nd highest-use positions.

To fix the situation and keep processor usage between 15% and 30%, averaging about 22% (OK), I have to disable live monitoring in MSI Afterburner by right-clicking in the graph/charts panel and clicking Pause monitoring. As soon as I do this the heavy processor usage drops, but I still get laggy performance clicking around in MSI Afterburner.

By comparison, using cryptodredge with the same clocks and everything, and with LIVE MONITORING enabled, I get no weird effects like the above when mining cryptonight algos.
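For anyone wanting numbers rather than a glance at Performance Monitor, the DPC and interrupt share can be sampled from the command line. The following is a minimal sketch, assuming the stock Windows `typeperf` tool and the standard `\Processor(_Total)\% DPC Time` and `\Processor(_Total)\% Interrupt Time` counters; the helper names, sample count, and interval are illustrative choices, not anything from this thread.

```python
import csv
import io

# Standard Windows performance counters for time spent in DPCs and ISRs.
COUNTERS = [
    r"\Processor(_Total)\% DPC Time",
    r"\Processor(_Total)\% Interrupt Time",
]


def typeperf_command(samples=20, interval_s=1):
    """Build a typeperf command line (hypothetical helper) for the counters above."""
    return ["typeperf", *COUNTERS, "-si", str(interval_s), "-sc", str(samples)]


def average_counters(csv_text):
    """Average each counter column of typeperf's CSV output."""
    rows = [r for r in csv.reader(io.StringIO(csv_text)) if r]
    header, data = rows[0], rows[1:]
    values = {name: [] for name in header[1:]}
    for row in data:
        if len(row) != len(header):
            continue  # skip trailing status lines that are not samples
        for name, cell in zip(header[1:], row[1:]):
            try:
                values[name].append(float(cell))
            except ValueError:
                pass  # a missed sample can show up as a blank cell
    return {name: sum(v) / len(v) for name, v in values.items() if v}
```

On the mining rig, the command list could be passed to `subprocess.run(..., capture_output=True, text=True)` and its `stdout` fed to `average_counters`, once with Afterburner monitoring running and once with it paused, to compare the two cases.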

Any ideas?

psychocrypt commented 5 years ago

Which OS? Which xmr-stak version? Post the full output from the start of the miner (first 20 sec)

CryptoWedge commented 5 years ago

Windows 7 64 bit, XMR-stak 2.5.2 version. I tried the 2.6.0 version and it does it as well.

What are you looking for in the first 20 seconds of the miner start? I can describe off the top of my head what it does, since I'm not near the miner at the moment.

I'm using CUDA 9.1, so the first couple of lines are it trying CUDA 10/9.2 until it realizes I can use 9.1, then it picks up all my cards. It says AMD platform index id = 0 found (my processor, I'm assuming), then says CL_DEVICE_NOT_FOUND when calling clGetDeviceIDs.

Then Warnings for No AMD device found, and AMD and CPU backends disabled.

psychocrypt commented 5 years ago

I need the exact output; each line is important.

CryptoWedge commented 5 years ago

logtest.txt

OK, got a log file posted. Check it out and let me know.

CryptoWedge commented 5 years ago

Not sure what I did or changed (really nothing), but this problem went away temporarily. CPU usage was normal/low and MSI Afterburner worked while actively monitoring. There was no lag. I'm trying to recall whether I did anything in the last 24 hours that would have changed this. However, I have since restarted my machine and the issue is back.

Also forgot to note: NT Kernel & System is #3 in my process list for CPU usage with MSI Afterburner monitoring on, and #1 with MSI Afterburner monitoring disabled.

Spudz76 commented 5 years ago

Try NvidiaInspector and the updated NvidiaProfileInspector

Uninstall that AfterBurner app, it is trash, I tried to use it for months before I decided it is not written very well. Especially on nVidia (it works "ok" for AMD sometimes). This inspector app does everything AB does and more (the Profile inspector part is useful, if you run 10xx series, which will be locked to P2 mode otherwise)

Also try the various possibilities for sync_mode. I had an issue with ethminer where unless I set it to use spinlock it had high CPU usage (just mining, not even mining+monitoring), even though spinlock should be the worst for CPU usage; who knows. Yield was giving me serious choke, as was blocking-sync.
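For reference, the sync_mode options map onto CUDA's device scheduling flags. The sketch below lists that mapping as I understand it from the comments in xmr-stak's nvidia.txt template (0 to 3, with 3 as the default); treat the numbering as an assumption and check it against the config file shipped with your version. The helper name is hypothetical and the descriptions are paraphrased.

```python
# Assumed mapping of xmr-stak "sync_mode" values to CUDA scheduling flags.
# Verify against the comments in your own nvidia.txt before relying on it.
SYNC_MODES = {
    0: ("cudaDeviceScheduleAuto", "driver chooses between spinning and yielding"),
    1: ("cudaDeviceScheduleSpin", "busy-wait; high load on one CPU thread per GPU"),
    2: ("cudaDeviceScheduleYield", "yield the CPU thread while waiting on the GPU"),
    3: ("cudaDeviceScheduleBlockingSync", "block the CPU thread until the GPU is done"),
}


def describe_sync_mode(mode):
    """Return a one-line description of a sync_mode value (hypothetical helper)."""
    if mode not in SYNC_MODES:
        raise ValueError(f"unknown sync_mode: {mode}")
    flag, effect = SYNC_MODES[mode]
    return f"sync_mode {mode} = {flag}: {effect}"


for mode in sorted(SYNC_MODES):
    print(describe_sync_mode(mode))
```

On paper, spin (1) should cost the most CPU and blocking sync (3) the least, which is why the ethminer behaviour described above is surprising.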

Eventually though, I gave up on monitoring, the way nVidia does their performance metrics is somewhat clunky / designed 'wrong', and tends not to work well when the GPUs are pegged at 100%. And if you crash the NVML (metrics) engine the whole driver goes down with it. Are the metrics really that important to record?

You could also reduce the update delay.

CryptoWedge commented 5 years ago

I did reduce the update delay in MSI Afterburner and it helped a little bit; maybe I could try delaying it longer. It still acts a bit laggy after I pause monitoring, though.

I tried messing with sync_mode; every option gave higher CPU usage.

I'm starting to think it's just something related to the system. After I added a 5th card it started doing this, though only in XMR-Stak. And like I said, the other day it suddenly started working like it used to, with NO LAG, while I was bouncing between different mining programs/coins. I didn't change any settings other than, as usual, switching clocks and mining programs back and forth. Then after I shut down and restarted, it was back to laggy.

All other mining programs work fine and have zero issues when active monitoring is enabled in MSI Afterburner. So it's something tied to xmr-stak in conjunction with my system, I think.

Spudz76 commented 5 years ago

Any risers? Sometimes making sure every slot (even on the mobo) is set to 1x PCIe lane (vs 4x, 8x or 16x) makes the timing work out better (all on the same "communication clock" then). If a couple respond quickly and a couple have "1x lag" in comparison, it may force the NVML/CUDA API requests to queue/wait/lag etc. as a side effect.

I had similar issues when I was underpowered / had bad wiring / bad connectors / a bad slot-riser board. One GPU would usually fall off the bus (a message Linux gives; in Windows it would probably yellow-exclaim the device in devmgr and only offer the other 4 GPUs until reboot), but before that it would act laggy like you describe. Tough to debug. Perhaps, though, the 5th is over the edge of your PSU or wiring etc. Keep only two slot-risers per whip from the PSU; when I went to three it started melting cables (at the PSU end, because modular; fixed by hard-wiring whips to the internal PSU rails, bypassing the modular design). Even two made the modular connectors look a bit cooked after a while, although not melt-hot.

You could quick-test somewhat by powercapping: set the GPUs to a 66% limit so the five draw around what the four used to, then if it's stable (of course slower), audit the power supply/cabling and eliminate as many adaptors as you can. Also, don't use SATA-to-Molex adapters; they are not really capable of 70-80 W constant draw, just based on their contact style (face-to-face, not spring ringlet and pin etc.).
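The arithmetic behind that power-cap test can be sketched as follows, assuming five identical cards whose draw scales roughly linearly with the power limit (a simplification; real cards will not track the limit exactly). The function names are made up for illustration.

```python
def break_even_limit(old_count, new_count):
    """Power-limit fraction at which new_count cards draw what old_count did uncapped."""
    return old_count / new_count


def card_equivalents(count, limit):
    """Total draw, in units of one uncapped card, for `count` cards at `limit`."""
    return count * limit


print(f"break-even limit: {break_even_limit(4, 5):.0%}")
print(f"five cards at 66%: {card_equivalents(5, 0.66):.2f} card-equivalents")
```

Under those assumptions the break-even limit is 80%, and five cards at 66% come to about 3.3 uncapped cards' worth of draw, so the suggested cap leaves some margin below what the four-card rig used to pull.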

Spudz76 commented 5 years ago

Then again, if other miner apps work, that makes the power stuff somewhat/possibly not it, unless it's a very edge case where the technique used (exact kernel code) pushes it over the edge.

CryptoWedge commented 5 years ago

Yeah, without going TL;DR: I've checked all those options and monitored for possible power issues. I can feel all my connectors (I'm able to stick my hand down in between cards) and no wires or connectors on the risers themselves are hot or abnormal.

Even on other algos that draw more power, where I run a bit higher intensity, my desktop is of course laggy, but there's no abnormal CPU usage or super lag selecting stuff in MSI like in XMR-Stak. I've been running cryptodredge for CNote variant algos (it just gets better speeds on NON-CNV8 algos; xmr-stak is better otherwise, hence why I still use it) and there are no symptoms of lag or high CPU usage with that miner.

It runs fine; I've run it on and off the last couple of days and it does work OK. I wish I knew what happened when all the symptoms suddenly disappeared. Like I said, I didn't change anything in that instance other than hopping between mining different coins, and then I noticed, HEY, the lag's gone?!?! Then, as I said, when I restarted it came back.