Description of changes:
This PR adds execution of nvidia-bug-report.sh to the eks-logs-collector. The script ships with the Nvidia drivers and is useful for debugging. It is also mentioned in https://docs.nvidia.com/deploy/gpu-debug-guidelines/index.html
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.
Testing Done
I tested this script on a g4dn instance, which has an Nvidia GPU, and verified that the log.gz file created by nvidia-bug-report.sh is included in the log collector archive.
Trying to Collect CPU Throttled Process Information...
Trying to Collect IO Throttled Process Information...
Trying to Collect Nvidia Bug report...
Trying to archive gathered information...
Done... your bundled logs are located in /var/log/eks_i-...tar.gz
I also ran the script on a t3.large, which has no Nvidia GPU, to make sure it doesn't break:
Trying to Collect CPU Throttled Process Information...
Trying to Collect IO Throttled Process Information...
Trying to Collect Nvidia Bug report... No Nvidia drivers found, nothing to do.
Trying to archive gathered information...
Done... your bundled logs are located in /var/log/eks_i-....tar.gz
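For reviewers who want the gist without opening the diff, the behavior seen in the two runs above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the function name `collect_nvidia_bug_report`, the `COLLECT_DIR` variable, and the `gpu/` subdirectory are hypothetical. The only driver-specific detail assumed is the `--output-file` option of nvidia-bug-report.sh, which writes the report to the given path and appends `.gz` when gzip is available.

```shell
#!/usr/bin/env bash
# Hypothetical sketch; names below are illustrative and not taken from the PR diff.

# Assumed staging directory whose contents are later archived by the collector.
COLLECT_DIR="${COLLECT_DIR:-$(mktemp -d)}"

collect_nvidia_bug_report() {
  printf '%s' "Trying to Collect Nvidia Bug report... "

  # nvidia-bug-report.sh ships with the Nvidia driver, so if it is not on
  # PATH we assume there is no driver and skip the step without failing.
  if ! command -v nvidia-bug-report.sh >/dev/null 2>&1; then
    echo "No Nvidia drivers found, nothing to do."
    return 0
  fi

  mkdir -p "${COLLECT_DIR}/gpu"

  # Write the report into the staging directory so it lands in the archive.
  # A failure here is reported but does not abort the rest of the collection.
  nvidia-bug-report.sh --output-file "${COLLECT_DIR}/gpu/nvidia-bug-report.log" \
    >/dev/null 2>&1 || printf '%s' "nvidia-bug-report.sh failed, continuing."
  echo
}

collect_nvidia_bug_report
```

Returning 0 when the driver is missing is what keeps non-GPU instances such as the t3.large run working unchanged.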
See this guide for recommended testing for PRs. Some tests may not apply. Completing tests and providing additional validation steps are not required, but it is recommended and may reduce review time and time to merge.
Issue #, if available: N/A