fireice-uk / xmr-stak

Free Monero RandomX Miner and unified CryptoNight miner
GNU General Public License v3.0

VEGA VII system shutdowns within 30 seconds of running xmr stack #2244

Open m03e5 opened 5 years ago

m03e5 commented 5 years ago

Brand new Radeon VEGA VII (PowerColor) with the latest drivers available from the AMD website as of today. Windows 10 x64 Home, all updates done. The miner detects the GPU properly and starts mining at approximately 1850 H/s on XMRV8, then the system reboots instantly within 30 seconds of running the miner. Some people say they get 2500-2700 H/s on stock settings; I have no idea how they get that. The system itself is running fine: mainboard, RAM and PSU are all good. I have run two VEGA64 cards off this PC without any problems. Clean OS install. No Vega registry edits, no overclocking, all stock, default settings. I have not yet tried the blockchain driver and am not sure it even works with this card, but I will try it. If it works, then perhaps I will try registry edits, similar to first-gen VEGA.

Spudz76 commented 5 years ago

No reason for that effect unless the PSU is not enough, or worn out. Risers or direct to the mobo? Any adaptations in the wiring, or real feeds direct from a high-quality PSU? Watch out for melted connections at the PSU if it's modular (usually the PSU end is not actually capable of 24/7 150 W per whip without heating up). Better to use non-modular supplies, at least for the PCIe whips.

The label spec was for day one; real capacity could be 66% of that depending on quality, usage and heat. I have had several PSUs that worked fine until one day they decided the same old watts were too tough. I had one that I used for months but couldn't fire up any games on: it would reboot within 5 minutes as the PSU got hot and got even worse at supplying the watts required (well under label, probably 50%).

I even had some cheap ones that could barely cope with label draw on day one; heat overload tripped protection within the same half-hour timeframe.

Spudz76 commented 5 years ago

Oh, and on the hashrate 1850 is about right for cn2v2 ("cryptonight_v8"/"monero" currencies) as the PoW is more difficult.

Those other higher numbers were from before the fork most likely (watch out for dates on posts!) as the old cn2v1 ("cryptonight_v7") was that speed on Vega (slightly less on the 56).

I think someone got a bit more out of one by undervolting and all that (keep it from hitting thermal or watt limiter).

xq0404 commented 5 years ago

I have just ordered Radeon VII in the hopes of achieving up to 2800 H/s. It seems to be more like a dream. Have you installed AMD's latest driver (win10-64bit-radeon-software-adrenalin-2019-edition-19.2.2-feb12)?

m03e5 commented 5 years ago

@xq0404

psychocrypt commented 5 years ago

It can still be the PSU, because the VII pulls over 300 W in the worst case. Which power plug do you use?

psychocrypt commented 5 years ago

I mean, which PSU do you use?

m03e5 commented 5 years ago

@psychocrypt

psychocrypt commented 5 years ago

Hmm, the point is that a program should not be able to crash the system if it is running as a user. If it can, then something else is broken.

m03e5 commented 5 years ago

I have run 2x VEGA64 off this Corsair TX850 without any problems, on the same system, so it's certainly not the PSU or any other hardware. I am running the FurMark burn-in test on the VEGA VII in this system; it has been on for some time now with no reset. VEGA fans are at 100%, temp is ~76°C and junction is about 113°C. Runs OK so far...
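As a rough sketch of the PSU argument made in this thread: take the label rating, derate it to what an aged or hot supply might really deliver (the 66% and 50% figures mentioned above), and subtract the load. The 850 W label and the 300 W worst-case VII draw come from the comments above; the 150 W figure for the rest of the system is an assumption made up for illustration, not a measurement.

```python
def headroom_w(label_w, usable_pct, gpu_w, rest_w):
    """Watts left over after the GPU and the rest of the system are supplied."""
    usable_w = label_w * usable_pct / 100
    return usable_w - (gpu_w + rest_w)


LABEL_W = 850  # Corsair TX850 label rating
GPU_W = 300    # worst-case VII draw quoted in this thread
REST_W = 150   # ASSUMPTION: CPU, board, drives and fans; not measured

# Compare a healthy supply with the derated cases described above.
for pct in (100, 66, 50):
    margin = headroom_w(LABEL_W, pct, GPU_W, REST_W)
    print(f"{pct:3d}% of label: {margin:+.0f} W headroom")
```

Under these assumed numbers the margin only goes negative in the 50% case, so this arithmetic alone neither confirms nor rules out the PSU. It also says nothing about short load spikes, which a steady-state estimate like this cannot capture.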

wanginator commented 5 years ago

@xq0404 Should be doable. I got 2600h/s on mine with the launch day driver set with no tweaking.

m03e5 commented 5 years ago

@wanginator

Graphics Engine | AXVII 16GBHBM2-3DH
Video Memory | 16GB HBM2
Stream Processor | 3840 Units
Engine Clock | 1400 MHz (Boost Clock: 1750 MHz)
Memory Clock | 1 GHz (2.0 Gbps)
Memory Interface | 4096-bit
DirectX® Support | 12
Bus Standard | PCIe 3.0

It looks like all stores are sold out ...ebay has vega vii for over 900$(wow), i paid 750$ with taxes, local store had a bunch yesterday and none today - there must be something to it.

Perhaps the Radeon Pro VII driver does something to the hash rate (https://www.reddit.com/r/Amd/comments/aps02f/amds_clarification_regarding_pro_drivers_support/). I was not able to locate it on the AMD website, though; the only drivers there are the consumer-level second-gen Vega one and Radeon Frontier. Has anyone seen it in the wild or tried it?

wanginator commented 5 years ago

@m03e5 I just used my vega config. 2 threads at 1932 each. As for specs, all these cards are exactly identical.

m03e5 commented 5 years ago

@wanginator

wanginator commented 5 years ago

@m03e5 yes, same stak config as 56/64. No tweaks, powerplay tables, etc. I literally copied my stak folder off of a vega rig, removed all but 2 threads and started stak just to see what it would do.
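A minimal sketch of the setup described here (two GPU threads, each at intensity 1932), assuming you start from one thread entry that the miner already generated for your card. The template values below are hypothetical placeholders, and the output is plain JSON for inspection rather than a drop-in amd.txt; copy the real entry, with all of its fields, from your own generated file.

```python
import copy
import json


def make_threads(template, intensity, count=2):
    """Return `count` copies of a thread entry with the given intensity."""
    threads = []
    for _ in range(count):
        entry = copy.deepcopy(template)  # leave the original entry untouched
        entry["intensity"] = intensity
        threads.append(entry)
    return threads


# HYPOTHETICAL template: only "intensity" is taken from the thread above.
# The other keys and values are placeholders standing in for a real entry.
template = {"index": 0, "intensity": 1000, "worksize": 8}

conf = make_threads(template, intensity=1932)
print(json.dumps(conf, indent=2))
```

Both entries keep the same device index, which is how two threads end up on one card. Every other field should stay as the miner generated it.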

xq0404 commented 5 years ago

@xq0404 Should be doable. I got 2600h/s on mine with the launch day driver set with no tweaking.

[image: vii]

Thank you! Good news!

xq0404 commented 5 years ago

@xq0404

* This is the exact driver I have installed (win10-64bit-radeon-software-adrenalin-2019-edition-19.2.2-feb12).

@Spudz76 I know it sounds like it's the PSU or what you have mentioned. That's the first thought I had. I put both VEGA64s and the original drive back into this PC and they run fine, so it's not the PSU. The cards go directly into the board's PCIe slots, with no risers or adapters. The PSU is not modular either. In terms of hashing power, 1850 H/s is exactly the same as I get per VEGA64 on stack or cast miners. This is something I did not expect, because the VEGA VII GPU core runs at a 200+ MHz higher frequency and the VEGA VII has 16 GB of HBM2 memory, which makes me think it should get at least 2000 H/s on Monero v8. I think I have found the problem; I need to do a few tests to confirm. As of right now it looks like it may not be related to the miner... will update. By the way, here are videos of how it restarts:

xmr stack: https://www.youtube.com/watch?v=FrwUAdloXZg
xmrig: https://www.youtube.com/watch?v=aDkurzmju8o

Suggested steps:

  1. It seems you have a second, motherboard GPU (Intel HD Graphics 510). I recommend disabling it.
  2. Upgrade to xmr-stak-win64-2.8.3 and rerun xmr-stak.exe to automatically regenerate amd.txt, config.txt and cpu.txt.
  3. Reduce fan speed and noise level by sliding Power Limit to -40% and Min Acoustic Limit to 910.
  4. Manually set Max Temperature to 74-79 degrees Celsius.

[image: rx480 driver]

By the way, you don't seem to have the GPU Workload option under the Global Graphics settings. It needs to be set to "Compute".

MohitYadavGitHub commented 5 years ago

@xq0404 Should be doable. I got 2600h/s on mine with the launch day driver set with no tweaking.

Which memory manufacturer do you have for the HBM? (Hynix, Micron, Samsung?)

I am getting 2400 with mine. Not sure if it's the config or the card itself.

Also a lot of rejected shares!

raretechsea commented 5 years ago

I wonder if the gpu heat-pad is not torqued properly?

On Sun, Feb 17, 2019, 7:28 AM MohitYadavGitHub <notifications@github.com> wrote:

@xq0404 Should be doable. I got 2600h/s on mine with the launch day driver set with no tweaking.

Which memory manufacturer do you have for the HBM? (Hynix, Micron, Samsung?)

I am getting 2400 with mine. Not sure if it's the config or the card itself.


xq0404 commented 5 years ago

Which memory manufacturer do you have for the HBM? (Hynix, Micron, Samsung?)

I am getting 2400 with mine. Not sure if it's the config or the card itself.

Also a lot of rejected shares!

That's theoretically the minimum for XMR mining if the VII gets 90 MH/s on Ethereum.

m03e5 commented 5 years ago

@xq0404

1 - I don't think it has any effect on Vega performance... Do you mean disabling it in Device Manager? It's the CPU's built-in graphics; I don't think there is a way to disable it except through Device Manager.
2 - Did this, no change. The hashrate actually dropped by approx. 75-90 H/s, so I reverted back to 2.8.2.
3 - Noise/fan speed has nothing to do with performance AFAIK, and it does not bother me.
4 - A bit of noise is fine; more stable performance is more important. See point 3.
5 - "You don't seem to have GPU Workload option under Global Graphics setting. It needs to be set at "Compute"." I have no idea where that setting is. I checked everywhere and it is not there... I have checked VEGA64 systems with blockchain drivers, and that option is not there either. Can you post a screenshot of it, and which driver version/card brand do you use? I have a feeling this has something to do with performance.

P.S. Vega VII does 90 MH/s on ETH, I can confirm this (this card does 87-88 MH/s).

xq0404 commented 5 years ago

Yes, only Device Manager can disable the Intel HD Graphics 510. The "GPU Workload" option is under the Global Graphics settings; perhaps I have it because mine is an RX 480 GPU. At present, mining Ethereum is way more profitable than mining Monero. Actually, while you mine Ethereum with Nanominer, you can simultaneously mine Monero with xmr-stak (by disabling GPU mining in amd.txt and using the CPU only).

[image: compute]

m03e5 commented 5 years ago

So after some tinkering with settings and overclocking I was able to reach a 2007 H/s peak on the 2.8.3 stack.

[image: vega vii 2007hs]

psychocrypt commented 5 years ago

Most users run a single thread for cn-gpu. Try one thread and see whether you get the same hashrate.

m03e5 commented 5 years ago

I tried running the miner with 1 thread: same speed. Not sure what the problem is... I did manage to get 2700-2800 H/s on cast. The fastest rate I have seen is 2910 H/s, which is still less than other people report (3100-3200 H/s on cnv8), and the card runs a bit cooler, at lower frequencies, which is a plus.

[image: vega 7 2700]

psychocrypt commented 5 years ago

Now I am confused. You posted a screenshot from the xmrig miner, not xmr-stak. Is this still related to the crash after 30 seconds? Do not mix issues. If this is now a performance-related issue, please open a new issue.

m03e5 commented 5 years ago

@psychocrypt

psychocrypt commented 5 years ago

thx for the info. hope you will get a new card soon

fractalyse commented 5 years ago

I'm experiencing a similar issue on Ubuntu 18.04 with a Radeon VII. I'm running the card at 1135 MHz with a power draw of 80 W, and my computer randomly reboots; the card is running at 60°C. I've tried drivers 18.50 and 19.10, same behavior.

psychocrypt commented 5 years ago

The miner is not able to reboot your OS. Even if there were an issue, the miner should never be able to crash your system, since it runs in user mode. It sounds like an issue with the power supply. Did you connect all power plugs to the card?

fractalyse commented 5 years ago

All cables are connected; I'm using an EVGA 1000 W T2. I don't think it's the PSU: I was running 2x RX580 before switching to the Radeon VII, and they were pulling more than this card.

This is related to compute jobs. If I run LuxMark, the computer shuts down immediately. It clearly does not depend on how much power the card is drawing. This only happens with OpenCL-related compute.

EDIT: Probably found the issue: a lot of dust in the PSU. Cleaned it up, no issue for 2 hours.

EDIT BIS: The PSU is not at fault; the crash still occurs randomly, after a few seconds or several days.
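One way to narrow down random reboots like this on Linux is to log the card's power and temperature right up to the crash. The sketch below assumes the amdgpu driver exposes its sensors under the usual sysfs hwmon path, with `power1_average` in microwatts and `temp1_input` in millidegrees Celsius. The log file name and the one-second interval are arbitrary choices, and none of this comes from the miner itself.

```python
import glob
import os
import time


def read_scaled(path, scale):
    """Return the integer stored in `path` divided by `scale`, or None."""
    try:
        with open(path) as f:
            return int(f.read().strip()) / scale
    except (OSError, ValueError):
        return None


def sample(hwmon_dir):
    """Read one power/temperature sample from a hwmon directory."""
    return {
        "power_w": read_scaled(os.path.join(hwmon_dir, "power1_average"), 1_000_000),
        "temp_c": read_scaled(os.path.join(hwmon_dir, "temp1_input"), 1_000),
    }


def log_forever(log_path="gpu_log.csv", interval_s=1.0):
    """Append samples to a CSV, forcing each batch to disk before sleeping."""
    # ASSUMPTION: standard amdgpu sysfs layout; adjust the glob if yours differs.
    dirs = sorted(glob.glob("/sys/class/drm/card*/device/hwmon/hwmon*"))
    with open(log_path, "a") as log:
        while True:
            for d in dirs:
                s = sample(d)
                log.write(f"{time.time():.0f},{d},{s['power_w']},{s['temp_c']}\n")
            log.flush()
            os.fsync(log.fileno())  # so the last lines survive a hard reboot
            time.sleep(interval_s)
```

Run `log_forever()` in a second terminal before starting the miner or LuxMark, then check the last lines of the CSV after the reboot. If the final samples show modest power and temperature, a simple overload or overheating explanation becomes less likely.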