Open semeion opened 5 years ago
I've never used that feature, so I doubt anyone does (maybe on Windows?). Make a systemd service file instead, like a normal Linux background process; it's not much harder than a bash script.
Put this in /etc/systemd/system/xmrig-nvidia.service
[Unit]
Description=xmrig-nvidia
After=network-online.target
Wants=network-online.target
AssertFileNotEmpty=/opt/xmrig-nvidia/config.json
[Service]
Type=simple
Environment=GPU_FORCE_64BIT_PTR=1
Environment=GPU_MAX_HEAP_SIZE=100
Environment=GPU_USE_SYNC_OBJECTS=1
Environment=GPU_MAX_ALLOC_PERCENT=100
Environment=GPU_SINGLE_ALLOC_PERCENT=100
Environment=CUDA_DEVICE_ORDER=PCI_BUS_ID
SyslogIdentifier=xmrig-nvidia
WorkingDirectory=/opt/xmrig-nvidia
ExecStart=/opt/xmrig-nvidia/xmrig-nvidia
Restart=always
KillSignal=SIGQUIT
User=root
Group=root
Nice=19
LimitMEMLOCK=256M
[Install]
WantedBy=multi-user.target
And then run systemctl daemon-reload
(only needed when editing/adding systemd confs), then systemctl enable xmrig-nvidia
(also only needed once), then systemctl start xmrig-nvidia.
To watch its log, do journalctl -xafu xmrig-nvidia
It will fire up as soon as networking is up on reboot, and you should only need the journal command.
Obviously the example assumes you've put it in /opt/xmrig-nvidia/ and also have a config.json there.
If you put all the command-line arguments in the ExecStart line, then you have to daemon-reload every time you want to tweak something. With config.json you just edit the JSON and then systemctl restart xmrig-nvidia,
because all systemd knows is that the json file exists, not its content.
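The AssertFileNotEmpty= line in the unit makes systemd refuse to start the miner if config.json is missing or empty. The same check can be done by hand in shell; this sketch uses a throwaway temp file instead of the real path:

```shell
# Demonstrate the non-empty-file test that AssertFileNotEmpty performs,
# against a throwaway file (the real unit checks /opt/xmrig-nvidia/config.json).
cfg=$(mktemp)
echo '{}' > "$cfg"           # stand-in for a real config.json
if [ -s "$cfg" ]; then       # -s: file exists and has size > 0
  status=present
else
  status=missing
fi
echo "$status"
rm -f "$cfg"
```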
Wow!
What are those environment variables? And what is LimitMEMLOCK=256M? Will those variables increase performance in some way?
And thank you for this nice answer.
Worked like a charm! Thanks!
BTW, do you know any tips for overclocking the GTX 1050 Ti in Linux? Every tutorial I have followed doesn't work... Seems like the 1050 Ti doesn't accept OC... idk...
The env stuff:
GPU_*
are probably only effective for OpenCL miners (and even then probably only AMD), but I tend to set them anyway (I copy this general skeleton out and modify it to launch other miners, and prefer that it carries the env that works everywhere with everything).
CUDA_DEVICE_ORDER=PCI_BUS_ID
forces CUDA to offer multiple GPUs in slot-number order; otherwise CUDA will put the fastest card first (as 0) if you have mixed cards, whereas nvidia-smi and Xorg generally refer to them in slot order (so this forces the index in miner apps to match the one you see when inspecting with nvidia-smi and clocking via Xorg with nvidia-settings, etc). It would have no effect if you have one GPU, or if all of them are the same type (then they end up in PCI order anyway).
To clock nvidia cards in Linux you must fire up Xorg. If you don't want a desktop you don't need one; you can just fire up xinit and maybe an xterm with no window manager, to save wasted VRAM and CPU cycles (such as GDM firing up when nobody could see it to log in anyway). Once you have Xorg going, and have it set to allow overclocking (google "coolbits Xorg"), nvidia-settings will apply clocks. BUT: you will be forced into P2 mode in Linux, and there is no known unlock to get P0 back. In Windows you can tweak the driver to let you use CUDA in P0 (which usually means better clocks; it depends on your card's particular BIOS, and sometimes P2 == P0 and you're OK). But also sometimes you can only set offsets for the P0 clocks and P2 clocking doesn't work; again, this is card- and manufacturer-dependent.
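If you want to run the miner by hand outside systemd (e.g. for testing), the same environment from the unit file above can be exported in a shell first; the values are copied straight from the service:

```shell
# Same environment as the systemd unit, as shell exports (for manual runs).
export GPU_FORCE_64BIT_PTR=1
export GPU_MAX_HEAP_SIZE=100
export GPU_USE_SYNC_OBJECTS=1
export GPU_MAX_ALLOC_PERCENT=100
export GPU_SINGLE_ALLOC_PERCENT=100
export CUDA_DEVICE_ORDER=PCI_BUS_ID   # index GPUs in PCI slot order
```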
I have some MSI 1060s which suck at P2, so that rig is one of the only Windows ones; it gets 33% more hashrate when P0 is unlocked (and clocks can be set). But these PNY 1060s act just like P0 even in P2 mode, so those run full speed and can be clocked in Linux just fine. It's down to how the manufacturer decided to set up the BIOS profiles for each mode.
nvidia-settings -c :0 -q all
generally dumps everything you might be able to know about a GPU. If you do have a screen hooked up (or redirect X through ssh to some other Xorg or even cygwin/x) then you can launch nvidia-settings with no args and get the GUI (and better check out what things your card might let you do)
This /etc/X11/xorg.conf
is an example of how I make Xorg fire up enough to clock, but not rob me of resources. I think I needed to install xterm (so that vanilla xinit would have something to launch and hold the session open) and also xorg-input-void (so that it doesn't even bother to bind USB; Xorg won't launch without inputs, but we have no inputs, so this void module works around that problem)
Section "ServerLayout"
Identifier "Layout0"
Screen 0 "Screen0" 0 0
Screen 1 "Screen1" RightOf "Screen0"
Screen 2 "Screen2" RightOf "Screen1"
InputDevice "Keyboard0" "CoreKeyboard"
InputDevice "Mouse0" "CorePointer"
Option "AutoAddDevices" "false"
Option "AutoEnableDevices" "false"
Option "AutoAddGPU" "false"
EndSection
Section "Files"
EndSection
Section "Module"
Load "glx"
Disable "evdev"
Disable "vesa"
Disable "fbdev"
Disable "modesetting"
Disable "ati"
Disable "amdgpu"
Disable "fglrx"
Disable "mga"
Disable "nouveau"
EndSection
Section "InputDevice"
Identifier "Mouse0"
Driver "void"
EndSection
Section "InputDevice"
Identifier "Keyboard0"
Driver "void"
EndSection
Section "Monitor"
Identifier "Monitor0"
VendorName "Unknown"
ModelName "Unknown"
HorizSync 28.0 - 33.0
VertRefresh 43.0 - 72.0
Option "DPMS"
EndSection
Section "Monitor"
Identifier "Monitor1"
VendorName "Unknown"
ModelName "Unknown"
HorizSync 28.0 - 33.0
VertRefresh 43.0 - 72.0
Option "DPMS"
EndSection
Section "Monitor"
Identifier "Monitor2"
VendorName "Unknown"
ModelName "Unknown"
HorizSync 28.0 - 33.0
VertRefresh 43.0 - 72.0
Option "DPMS"
EndSection
Section "Device"
Identifier "Device0"
Driver "nvidia"
VendorName "NVIDIA Corporation"
BoardName "GeForce GTX 770"
BusID "PCI:1:0:0"
EndSection
Section "Device"
Identifier "Device1"
Driver "nvidia"
VendorName "NVIDIA Corporation"
BoardName "GeForce GTX 770"
BusID "PCI:2:0:0"
EndSection
Section "Device"
Identifier "Device2"
Driver "nvidia"
VendorName "NVIDIA Corporation"
BoardName "GeForce GTX 970"
BusID "PCI:3:0:0"
EndSection
Section "Screen"
Identifier "Screen0"
Device "Device0"
Monitor "Monitor0"
DefaultDepth 24
Option "AllowEmptyInitialConfiguration" "True"
Option "Coolbits" "28"
Option "Accel" "False"
Option "NoLogo" "True"
Option "UseDisplayDevice" "none"
Option "Interactive" "False"
SubSection "Display"
Depth 24
Modes "640x480"
EndSubSection
EndSection
Section "Screen"
Identifier "Screen1"
Device "Device1"
Monitor "Monitor1"
DefaultDepth 24
Option "AllowEmptyInitialConfiguration" "True"
Option "Coolbits" "28"
Option "Accel" "False"
Option "NoLogo" "True"
Option "UseDisplayDevice" "none"
Option "Interactive" "False"
SubSection "Display"
Depth 24
Modes "640x480"
EndSubSection
EndSection
Section "Screen"
Identifier "Screen2"
Device "Device2"
Monitor "Monitor2"
DefaultDepth 24
Option "AllowEmptyInitialConfiguration" "True"
Option "Coolbits" "28"
Option "Accel" "False"
Option "NoLogo" "True"
Option "UseDisplayDevice" "none"
Option "Interactive" "False"
SubSection "Display"
Depth 24
Modes "640x480"
EndSubSection
EndSection
And then I use this /etc/systemd/system/xorg-headless.service
to launch a real stripped down Xorg:
[Unit]
Description=Headless Xorg Server (for GPU control)
[Service]
ExecStart=/usr/bin/xinit
[Install]
WantedBy=multi-user.target
And I either disable or remove gdm (or any other display manager, and its Xorg-launching capability), since all mine are headless anyway, SSH only.
Once all this is working (check /var/log/Xorg.0.log
or similar for Xorg launch problems), nvidia-settings should work as long as env DISPLAY=:0
is set and you use the arg -c :0
Oh, thank you very much for the amazing explanation of the OC process.
Link bookmarked!
For sure someone else will read and learn from it too.
I also use commands such as these in a bash script for clocking:
nvidia-smi -i 0 -pm 1
nvidia-smi -i 0 -c 0
nvidia-smi -i 0 -pl 90
The first sets persistence mode, so clocks stick even if Xorg and all other client apps release the GPU (otherwise it returns to defaults); this also uses a service that comes with the driver, nvidia-persistenced.
The second sets compute mode to default (in case something else set it weird).
The third sets the power limit in watts (example: 90 W). Note this is different from the percentage Windows uses, so people saying they run at 70% would translate to 0.7 * your_card_default_watts
for the equivalent Linux setting. Mine default to 120 W, so 90 is the same as 75% on Windows.
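That percent-to-watts conversion, as a quick shell sketch (the 120 W default and 75% figure are the ones from my card; substitute your own):

```shell
# Convert a Windows-style power-limit percentage into the watts value
# that nvidia-smi -pl expects. Numbers match the example above.
default_watts=120   # your card's default power limit in watts
percent=75          # the Windows-style percentage
watts=$(( default_watts * percent / 100 ))
echo "nvidia-smi -i 0 -pl $watts"
```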
nvidia-settings -c :0 -a [gpu:0]/GPUFanControlState=1
nvidia-settings -c :0 -a [fan:0]/GPUTargetFanSpeed=100
nvidia-settings -c :0 -a [gpu:0]/GPUPowerMizerMode=1
These set the fans to manual (fixed-speed) control, set the speed to max, and set "Performance" mode for the PowerMizer rules.
nvidia-settings -c :0 -a [gpu:0]/GPUGraphicsClockOffset[3]=48
nvidia-settings -c :0 -a [gpu:0]/GPUMemoryTransferRateOffset[3]=512
These set the P0 clock offsets; the offset is added to the base clocks (which depend on your BIOS too), so to get the right offsets you should inspect the default clocks (nvidia-settings -c :0 -q GPUCurrentClockFreqs
) and then figure the offsets from there to your desired actual clock speeds.
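Figuring an offset is just the desired clock minus the current base clock; a sketch with made-up example numbers (both clocks are hypothetical, not from any particular card):

```shell
# Hypothetical example: compute the offset to pass to GPUGraphicsClockOffset[3]
# from a current clock (as reported by GPUCurrentClockFreqs) and a target clock.
current_mhz=1733
desired_mhz=1781
offset=$(( desired_mhz - current_mhz ))
echo "nvidia-settings -c :0 -a [gpu:0]/GPUGraphicsClockOffset[3]=$offset"
```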
If you are stuck in P2 then you may need to either use [2]
for the array index in the above commands (to edit perf=1, which is P2, rather than perf=2, which is P0), or try it with no array brackets at all (some card types use the non-array notation).
Some cards block P2 clock offsetting; you can check which modes have editable clocks with nvidia-settings -c :0 -q GPUPerfModes
which lists each perf level, its base clocks, and whether it accepts offsetting. The perf number defines each section, and sections are separated by ;
perf=0 is P8 (sleep/idle), perf=1 is P2 (CUDA mode), and perf=2 is P0 (highest performance, plus always-editable clocks)
Running nvidia-smi
while mining will tell you what P-mode you're in.
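If you want the P-state programmatically (say, in a monitoring script), it can be grepped out of the nvidia-smi table. This sketch parses a captured sample line rather than live output; for live use you would pipe nvidia-smi itself through the same grep:

```shell
# Pull the P-state (P0/P2/P8...) out of one nvidia-smi table row.
# The sample line here is a captured one, so the sketch runs anywhere.
line='| 36%   57C    P0    N/A /  72W |   2116MiB /  4040MiB |    100%      Default |'
pstate=$(echo "$line" | grep -o 'P[0-9]' | head -n 1)
echo "$pstate"
```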
I am trying to figure out all the info to try it. BTW, my current (not OC'd) nvidia-smi
reports P0:
nvidia-smi
Wed Jul 3 00:44:41 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 430.26 Driver Version: 430.26 CUDA Version: 10.2 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 GeForce GTX 105... Off | 00000000:01:00.0 Off | N/A |
| 36% 57C P0 N/A / 72W | 2116MiB / 4040MiB | 100% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 14478 C /usr/bin/xmrig-nvidia 2106MiB |
+-----------------------------------------------------------------------------+
I don't have X11 up, so that command doesn't work right now, but I can run this:
nvidia-smi -q -i 0 -d CLOCK
==============NVSMI LOG==============
Timestamp : Wed Jul 3 00:57:31 2019
Driver Version : 430.26
CUDA Version : 10.2
Attached GPUs : 1
GPU 00000000:01:00.0
Clocks
Graphics : 1733 MHz
SM : 1733 MHz
Memory : 3504 MHz
Video : 1556 MHz
Applications Clocks
Graphics : N/A
Memory : N/A
Default Applications Clocks
Graphics : N/A
Memory : N/A
Max Clocks
Graphics : 1936 MHz
SM : 1936 MHz
Memory : 3504 MHz
Video : 1708 MHz
Max Customer Boost Clocks
Graphics : N/A
SM Clock Samples
Duration : 294.85 sec
Number of Samples : 24
Max : 1746 MHz
Min : 139 MHz
Avg : 1725 MHz
Memory Clock Samples
Duration : 295.27 sec
Number of Samples : 24
Max : 3504 MHz
Min : 405 MHz
Avg : 3498 MHz
Clock Policy
Auto Boost : N/A
Auto Boost Default : N/A
Yeah, see how it won't allow any application clocks; that's because it's not a compute-only GPU (the expensive Tesla pro models with no display connections at all). There was a flash hack for the GTX 970 that made it look like the equivalent Tesla card, and then all that works without Xorg (I think it killed all the video outputs, too). But I don't think there are any similar mods for Pascal-based cards, due to signed flash and other problems/locks.
But for consumer cards there is no way other than a working Xorg for nvidia-settings to be able to set clocks. And even then, if you can't get P0 you might not be allowed to clock.
And you have to set up Xorg and make nvidia-settings work even to find out whether your card's BIOS will allow P2 clock editing, by dumping the GPUPerfModes
setting:
Attribute 'GPUPerfModes' (tpad:0[gpu:0]): perf=0, nvclock=135, nvclockmin=135, nvclockmax=405, nvclockeditable=0, memclock=405, memclockmin=405, memclockmax=405,
memclockeditable=0, memTransferRate=810, memTransferRatemin=810, memTransferRatemax=810, memTransferRateeditable=0 ; perf=1, nvclock=135, nvclockmin=135, nvclockmax=840,
nvclockeditable=0, memclock=800, memclockmin=800, memclockmax=800, memclockeditable=0, memTransferRate=1600, memTransferRatemin=1600, memTransferRatemax=1600,
memTransferRateeditable=0 ; perf=2, nvclock=135, nvclockmin=135, nvclockmax=840, nvclockeditable=1, memclock=1733, memclockmin=1733, memclockmax=1733, memclockeditable=1,
memTransferRate=3466, memTransferRatemin=3466, memTransferRatemax=3466, memTransferRateeditable=1
This breaks down into this when formatted better:
perf=0, nvclock=135, nvclockmin=135, nvclockmax=405, nvclockeditable=0, memclock=405, memclockmin=405, memclockmax=405, memclockeditable=0, memTransferRate=810, memTransferRatemin=810, memTransferRatemax=810, memTransferRateeditable=0
perf=1, nvclock=135, nvclockmin=135, nvclockmax=840, nvclockeditable=0, memclock=800, memclockmin=800, memclockmax=800, memclockeditable=0, memTransferRate=1600, memTransferRatemin=1600, memTransferRatemax=1600, memTransferRateeditable=0
perf=2, nvclock=135, nvclockmin=135, nvclockmax=840, nvclockeditable=1, memclock=1733, memclockmin=1733, memclockmax=1733, memclockeditable=1, memTransferRate=3466, memTransferRatemin=3466, memTransferRatemax=3466, memTransferRateeditable=1
But the important bits are nvclockeditable
and memclockeditable
if they say =0
then you can't clock that P-mode.
perf=0 is P8, perf=1 is P2, perf=2 is P0.
Thus, since it only has editable=1
in the perf=2 line, I can only clock this GPU in P0 (under Linux). Note how the perf=1 line has all editable=0, which means locked.
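The editable-flag check can be scripted; a rough sketch, run against a trimmed-down version of the GPUPerfModes string above (most fields removed for brevity):

```shell
# Find which perf levels have editable core clocks in a GPUPerfModes dump.
# The string is abbreviated from the real nvidia-settings output quoted above.
modes='perf=0, nvclockeditable=0, memclockeditable=0 ; perf=1, nvclockeditable=0, memclockeditable=0 ; perf=2, nvclockeditable=1, memclockeditable=1'
# Split on ';' into one line per perf level, keep lines with an editable core
# clock, then pull out just the perf number.
clockable=$(printf '%s' "$modes" | tr ';' '\n' | grep 'nvclockeditable=1' | grep -o 'perf=[0-9]')
echo "clockable levels: $clockable"
```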
It's unfortunate that you have to get Xorg all set up (go through the hassle) just to find out you probably can't clock P2 (unless the manufacturer actually took the time to tweak their BIOS, like PNY did on the 1060 6GB dual-fan cards). P2 being non-clockable is the nvidia default, so the manufacturer has to bother to "fix" it, and most don't.
Windows also lets you clock P2 somehow (the app has an "unlock min/max" button) and I don't know of any equivalent of that in Linux either. But then, if you've got P2 clocked up to the sky, when you exit the miner and the GPU flips up to P0 momentarily it will crash/lock the system, because the offsets from P2 are too high as P0 offsets (instant GPU freeze).
The best option is Windows, so you can disable the P2 lock and get P0 for real, then clock as normal.
Also note this example (Quadro K1100M) is from before Pascal and runs P0 in Linux just fine with no changes. So I can clock this one, no problems, but it predates the whole P2-for-compute locking idea and isn't a Pascal core.
It may become unclockable and lock to P2 if I ran a newer driver; I'm intentionally on 390.116, and I think I may have had problems clocking on the newer drivers. I don't think I tested the oldest Pascal-supporting driver version (on a Pascal) to see if it lacked the lock. You could try 390.116 and see if it computes in P0 or P2... but I am pretty sure Pascal chips were P2-locked forever.
what command did you use to dump the GPUPerfModes
setting?
nvidia-settings -c :0 -q GPUPerfModes
again, it can't be done until you make Xorg work, so that nvidia-settings works (even from the command line)
However, I did just notice you are in P0, so if you do set up Xorg it should actually be clockable
nvidia-xconfig --allow-empty-initial-configuration --enable-all-gpus --cool-bits=28 --separate-x-screens
if you run that, it should build you a basic /etc/X11/xorg.conf
without the memory savings of forcing 640x480 and turning off acceleration, but it will work fine.
I trim the Xorg config a lot so that 1GB or even 512MB cards still have room for mining jobs, but it's not really needed if you have 4GB (note it only uses a little over half for full mining speed anyway, as-is)
oh, and I live life as root, therefore half this stuff might need sudo (I wouldn't know)
I can't reboot my PC right now, but I will do it on the next reboot and put everything you told me into practice! A lot of info!
Thank you very much! If I get it working (or not) I will post here :+1:
However, of note: when RandomX comes soon, you might need to trim, since you will need 2080MB plus the scratchpads, which will use a little over 75% of your total 4GB (you will probably need to recover some)
It can be editable, but when I change the clock offset to +200 it does in fact change, yet xmrig loses like 10 H/s, which is weird...
Another thing: it seems to have "Adaptive Clocking: enabled"; maybe that is working against the OC and making things worse, idk...
Attribute 'GPUPerfModes' (blackbird:0.1): perf=0, nvclock=139, nvclockmin=139, nvclockmax=607, nvclockeditable=1, memclock=405, memclockmin=405, memclockmax=405, memclockeditable=1, memTransferRate=810, memTransferRatemin=810,
memTransferRatemax=810, memTransferRateeditable=1 ; perf=1, nvclock=139, nvclockmin=139, nvclockmax=1911, nvclockeditable=1, memclock=810, memclockmin=810, memclockmax=810, memclockeditable=1, memTransferRate=1620,
memTransferRatemin=1620, memTransferRatemax=1620, memTransferRateeditable=1 ; perf=2, nvclock=164, nvclockmin=164, nvclockmax=1936, nvclockeditable=1, memclock=3504, memclockmin=3504, memclockmax=3504, memclockeditable=1,
memTransferRate=7008, memTransferRatemin=7008, memTransferRatemax=7008, memTransferRateeditable=1
About your /etc/systemd/system/xmrig-nvidia.service
why are you using Nice=19
(low priority)?
Would Nice=-20
(high priority) increase the hashrate?
When I set -B or --background, xmrig-nvidia returns: "WARNING: NVIDIA GPU 0: cannot be selected."
I am using Arch Linux with a 1050 Ti, xmrig-nvidia v2.14.4.
The xmrig CPU version works fine.
How do I fix it?