Ricks-Lab / gpu-utils

A set of utilities for monitoring and customizing GPU performance
GNU General Public License v3.0

Auto-run pac .sh scripts on startup #23

Closed csecht closed 3 years ago

csecht commented 5 years ago

I got my optimized settings in amdgpu-pac .sh scripts to auto-run on startup by using crontab with @reboot. I edited the scripts (created with amdgpu-pac --force) to write a log file to my desktop so I'll know if my system has rebooted, for example after a power glitch while I'm away. To have the pac scripts for both GPUs execute before the boinc-client auto-starts and puts a load on the GPUs, I included a <start_delay>30</start_delay> option in cc_config.xml. The 30 sec delay is more than ample time on my system.

I wanted the last of the two scripts executed at boot to launch amdgpu-monitor, but it didn't work. I put this on the last line of the .sh pac file:

/usr/bin/python3.6 /home/craig/Desktop/amdgpu-utils-2.5.1/amdgpu-monitor

This command line works from the Terminal to launch the monitor, but not in the .sh script that crontab runs. Any ideas? Do I need a 'sudo' in there somewhere? Is there a better way to do this?
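The reboot log described above can be sketched as a small shell helper. This is only an illustration: log_reboot, the log message, and the paths in the comments are hypothetical names, not part of amdgpu-utils or of the generated PAC scripts.

```shell
#!/bin/sh
# Hypothetical helper, not part of amdgpu-utils: append a timestamped line
# to a log file so that an unattended reboot is visible later.
log_reboot() {
    log_file="$1"
    printf '%s PAC script executed after boot\n' "$(date '+%Y-%m-%d %H:%M:%S')" >> "$log_file"
}

# Example crontab entry (edit with `crontab -e`); the script path is an assumption:
#   @reboot /path/to/pac_script.sh
# Inside the PAC script, a call such as this would record each run:
#   log_reboot "$HOME/Desktop/reboot.log"
```

Because the helper appends rather than overwrites, the log keeps one line per boot.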

Ricks-Lab commented 5 years ago

Do you have details of the error on execution with your current approach? I suspect that it may be related to not being run in a terminal. Is it possible to open a terminal with the executable as an argument?

csecht commented 5 years ago

Thanks much for the feedback. You were right; I got it to go with this:

# open amdgpu-monitor on desktop
gnome-terminal --tab --title="amdgpu-monitor" -- sh -c 'cd /home/craig/Desktop; amdgpu-utils-2.5.1/amdgpu-monitor'

I'd like to generalize this so it doesn't depend on the version of the amdgpu-utils directory, but I'm happy with it for now. Do you know of any problems letting amdgpu-monitor run in a terminal window for a long time (days)?

csecht commented 5 years ago

Okay, I got the directory version control sorted by using a symlink instead of a hard path to the current (or desired) amdgpu-util directory (and thus was able to move various amdgpu-util directories off my Desktop). Simple housekeeping, but I'm still learning this Linux stuff, lol.
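For anyone following along, the symlink housekeeping can be sketched like this. point_link and the example paths are hypothetical; only the idea of a version-independent link comes from the comment above.

```shell
#!/bin/sh
# Hypothetical helper: repoint a stable symlink at a versioned directory.
# -s makes a symbolic link, -f replaces an existing link, and -n treats an
# existing link to a directory as the link itself instead of descending into it.
point_link() {
    target_dir="$1"
    link_path="$2"
    ln -sfn "$target_dir" "$link_path"
}

# Example (paths are assumptions):
#   point_link /home/craig/apps/amdgpu-utils-2.5.1 /home/craig/Desktop/amdgpu-utils
```

Startup commands can then refer to the link path, and upgrading only requires calling the helper again with the new directory.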

Ricks-Lab commented 5 years ago

@csecht There should be no issues in running amdgpu-monitor over a long period of time. I have only seen issues running amdgpu-plot for more than a few hours.

It would be great if you could document your process in the user guide. I am sure users will benefit from your experience.

csecht commented 5 years ago

Thanks, good idea. I’ll work on that once I fix a bug; that terminal command to invoke amdgpu-monitor doesn’t seem to run from the cron job that auto-runs the PAC settings script (I had only tested it from a desktop .sh script).


csecht commented 5 years ago

While I have been able to autostart amdgpu-monitor --gui using Gnome's Startup Applications, I haven't been able to run it from a cron job. Nor can I figure out how to get the default text version of amdgpu-monitor to autostart in a terminal with either Startup Applications or cron. In the Preferences window of Startup Applications, I entered this command:

/usr/bin/python3 /home/craig/Desktop/amdgpu-utils/amdgpu-monitor --gui

which generated this autostart configuration file:

~$ cat /home/craig/.config/autostart/python3.desktop
[Desktop Entry]
Type=Application
Exec=/usr/bin/python3 /home/craig/Desktop/amdgpu-utils/amdgpu-monitor --gui
Hidden=false
NoDisplay=false
X-GNOME-Autostart-enabled=true
Name[en_US]=amdgpu-monitor
Name=amdgpu-monitor graphic
Comment[en_US]=To monitor GPUs auto-running BOINC tasks
Comment=To monitor GPUs auto-running BOINC tasks

and on reboot the amdgpu-monitor Gtk window is there.
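The same kind of autostart entry can be written from a script instead of the Preferences window. The sketch below is a minimal, hypothetical helper (write_autostart is not part of amdgpu-utils); it writes only a subset of the keys shown in the file above and assumes the desktop reads entries from the autostart directory under the user's config directory.

```shell
#!/bin/sh
# Hypothetical helper: write a minimal autostart entry for a command.
# Usage: write_autostart NAME COMMAND [CONFIG_DIR]
# CONFIG_DIR defaults to $XDG_CONFIG_HOME, or ~/.config when that is unset.
write_autostart() {
    name="$1"
    command_line="$2"
    config_dir="${3:-${XDG_CONFIG_HOME:-$HOME/.config}}"
    mkdir -p "$config_dir/autostart"
    cat > "$config_dir/autostart/$name.desktop" <<EOF
[Desktop Entry]
Type=Application
Name=$name
Exec=$command_line
X-GNOME-Autostart-enabled=true
EOF
}

# Example (path is an assumption):
#   write_autostart amdgpu-monitor "/usr/bin/python3 /home/craig/Desktop/amdgpu-utils/amdgpu-monitor --gui"
```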

But that Gnome approach isn't going to be generally applicable for Linux users, so I thought it would be good to work out a crontab approach. No luck yet though.

I was able to get this test crontab to work (without the --gui argument), but only if I open a new terminal window once the cron job is running:

* * * * * /home/craig/Desktop/amdgpu-utils/amdgpu-monitor >/dev/pts/3

Once running, however, each timed update (2 sec default) appends a new monitor table to the terminal window instead of replacing the previous table, so the terminal window keeps filling up. The main problem is that the redirect to my display, /dev/pts/3, which I got from running tty, is not a static index, so that's not a good approach in any event. For example, yesterday my tty was /dev/pts/2.

Though I had luck with Startup Applications, I haven't figured out how to get cron to run the --gui option.

All these work as desktop terminal commands, but not as crontab commands:

/home/craig/Desktop/amdgpu-utils/amdgpu-monitor --gui

or this:

export DISPLAY=:0 && /home/craig/Desktop/amdgpu-utils/amdgpu-monitor --gui

or this:

/home/craig/Desktop/amdgpu-utils/amdgpu-monitor --gui && >/dev/pts/3

Providing the full path to /usr/bin/python3 doesn't help. (Using >/dev/pts/3 generates bash: /dev/pts/3: Permission denied, but the monitor Gtk window still launches.)

I've tried invoking gnome-terminal in the crontab to get the text amdgpu-monitor to run, based on various online discussions, but no luck. For example, this opens a new terminal window when executed from a desktop terminal, but not as a crontab command:

/usr/bin/gnome-terminal --working-directory=/home/craig/ --title="amdgpu-monitor" -- sh -c '/usr/bin/python3 /home/craig/Desktop/amdgpu-utils/amdgpu-monitor; $SHELL' && export DISPLAY=":0"
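One thing that may help when debugging this: cron runs jobs with a minimal environment, which normally does not include the variables a desktop session sets. The sketch below is a hypothetical diagnostic, not a fix. The three variable names checked are an assumption about what GUI programs commonly need, and setting them does not guarantee that gnome-terminal or the Gtk monitor will start from cron.

```shell
#!/bin/sh
# Hypothetical diagnostic: print the names of GUI-session variables that are
# unset or empty in the current environment, separated by spaces.
missing_gui_vars() {
    missing=""
    for name in DISPLAY XAUTHORITY DBUS_SESSION_BUS_ADDRESS; do
        # Indirect lookup of the variable named by $name (names are fixed above).
        eval "value=\${$name:-}"
        [ -n "$value" ] || missing="$missing $name"
    done
    # Strip the single leading space left by the loop.
    printf '%s\n' "${missing# }"
}

# Example: call from the cron-run script and redirect to a file to compare
# with the output of the same call in a desktop terminal:
#   missing_gui_vars > /tmp/cron_gui_env.txt
```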

I'll work on adding to the User Guide a scheme to run amdgpu-pac .sh scripts on reboot and for the limited use of an amdgpu-monitor --gui auto-startup on Gnome systems.

csecht commented 5 years ago

I've noticed that using the Ubuntu Startup Applications utility to auto-start amdgpu-monitor --gui executes the graphic monitor with no Terminal being launched, which makes for a tidy Desktop window. Is there a way to launch amdgpu-monitor --gui at any time without using a terminal?

Ricks-Lab commented 5 years ago

I created a file named GPU-mon.desktop in the Desktop directory with the following contents:

[Desktop Entry]
Version=1.0
Name=amdgpu-utils Monitor
Comment=My Application Comment :-)
Exec=/home/rick/PyDev/amdgpu-utils/amdgpu-monitor --gui --sleep 4
Icon=/home/rick/PyDev/amdgpu-utils/icons/amdgpu-monitor.icon.png
Path=/home/rick/PyDev/amdgpu-utils/
Terminal=false
Type=Application
Categories=Utility;Application;
Comment[en_US.UTF-8]=
GenericName[en_US.UTF-8]=amdgpu-utils Monitor
Name[en_US]=GPU-mon

csecht commented 5 years ago

Sweet. I wasn't familiar with Desktop launchers, so thanks for that introduction. I added this shebang to the file, but don't know if it is necessary:

#!/usr/bin/env xdg-open

In any case, it works!

csecht commented 5 years ago

On one of my hosts recently, the cron job to run PAC scripts @reboot stopped working. While trying to debug that, I came across an alternative that may be more robust and perhaps could be automated with an amdgpu-utils module. I set up a systemd service to run the PAC bash files at startup. These are the steps I used:

After executing amdgpu-pac --force_write, I changed ownership and permissions to -rwxr-xr-x 1 root craig for both bash files (2 GPUs) in /home/craig.

In the same directory, I made links for each, named pac_writer_card1 and pac_writer_card2; the links are used in the startup service.

I created a file, amdgpu-pac-startup.service, with this content:

[Unit]
Description=run at boot amdgpu-utils PAC scripts to configure GPUs

[Service]
Type=oneshot
ExecStart=/home/craig/pac_writer_card1
ExecStart=/home/craig/pac_writer_card2

[Install]
WantedBy=multi-user.target

Then the following from the terminal:

sudo chown root:root amdgpu-pac-startup.service 
sudo mv amdgpu-pac-startup.service /etc/systemd/system/
sudo chmod 664 /etc/systemd/system/amdgpu-pac-startup.service
sudo systemctl daemon-reload
sudo systemctl enable amdgpu-pac-startup.service

The last command produced this output:

Created symlink /etc/systemd/system/multi-user.target.wants/amdgpu-pac-startup.service → /etc/systemd/system/amdgpu-pac-startup.service.

It worked without a hitch the first time I rebooted and after a power off. If you think this is a worthwhile approach, then I can add it as an alternative to the crontab @reboot approach in the Autostart section of the User Guide.
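As a rough idea of how the unit file could be generated rather than written by hand, here is a minimal sketch. render_pac_unit is a hypothetical function, not an amdgpu-utils feature; it prints a oneshot unit with one ExecStart line per script path passed to it, mirroring the file above.

```shell
#!/bin/sh
# Hypothetical generator: print a oneshot systemd unit to stdout with one
# ExecStart line for each script path given as an argument.
render_pac_unit() {
    printf '[Unit]\n'
    printf 'Description=run at boot amdgpu-utils PAC scripts to configure GPUs\n\n'
    printf '[Service]\nType=oneshot\n'
    for script in "$@"; do
        printf 'ExecStart=%s\n' "$script"
    done
    printf '\n[Install]\nWantedBy=multi-user.target\n'
}

# Example, using the link names from the steps above:
#   render_pac_unit /home/craig/pac_writer_card1 /home/craig/pac_writer_card2 > amdgpu-pac-startup.service
```

The generated file would still need to be moved into /etc/systemd/system/ and enabled with the commands listed above.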

csecht commented 5 years ago

I accidentally closed this issue when I posted my previous comment. So, what do you think of the systemd approach?

csecht commented 5 years ago

I've realized a problem with this approach. On each startup or following a system update, for the GPU that is device0 (top, first PCIe slot), the card index can change between card0 and card1. The second GPU, device1, in the other PCIe slot, is always designated card2. So when a script for card1 is run at startup and the system has set it as card0, or vice versa, the script has no effect and the GPU runs with its default settings. To get around this, I've set up redundant bash scripts for card0 and card1 and have both execute in the systemd startup service; if there isn't a card index for one, there will be for the other and the PAC settings will take effect. This seems an awkward and probably not universal solution. Is there a way to query a card's index at startup and plug that into the card's bash script as a variable value?

Ricks-Lab commented 5 years ago

Is there enough information in amdgpu-ls to get what you need? It doesn't know anything about slot numbers. It just knows PCIe ID and GPU Card Number used by the driver.

csecht commented 5 years ago

Yes, amdgpu-ls has what is needed. I’m assuming that the PCIe ID is static, while the Card Number is what can change.

What I was referring to as slots relates to the PCIe identifiers printed on the motherboard. So the upper GPU card slot is labeled PCIEX16_1 and the lower GPU slot is PCIEX16_2; I assume these positions correspond to PCIe IDs 1:00.0 and 2:00.0, respectively. (No other PCIe devices are plugged into the mobo.) In my current system, amdgpu-ls lists these IDs, respectively, as Card Number 0 and Card Number 2. When the driver again decides to change the card numbers, I suppose PCIe ID 1:00.0 will remain card 0 and PCIe ID 2:00.0 will be then associated with card 1, but who knows. I’ll confirm with amdgpu-ls when a change occurs.
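If the PCIe ID does stay fixed, the card number could be looked up from it at startup. The sketch below does not use amdgpu-ls; it assumes the usual Linux sysfs layout, where /sys/class/drm/cardN/device is a link to the PCI device directory, and it expects the full sysfs address form (for example 0000:01:00.0, which is itself an assumption about how the IDs above appear in sysfs). card_for_pci is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical lookup: print the DRM card name (e.g. card1) whose device link
# resolves to the given PCI address; return non-zero if none matches.
# The second argument defaults to /sys/class/drm and exists mainly for testing.
card_for_pci() {
    pci_addr="$1"
    drm_root="${2:-/sys/class/drm}"
    for dev in "$drm_root"/card*/device; do
        [ -e "$dev" ] || continue
        if [ "$(basename "$(readlink -f "$dev")")" = "$pci_addr" ]; then
            basename "$(dirname "$dev")"
            return 0
        fi
    done
    return 1
}

# Example: choose the matching PAC script at boot (script naming is an assumption):
#   card=$(card_for_pci 0000:01:00.0) && /home/craig/pac_writer_"$card"
```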
