N8-CIR-Bede / documentation

Documentation for the N8CIR Bede Tier 2 HPC facility
https://bede-documentation.readthedocs.io/en/latest/
Performance monitoring scripts #164

Open ptheywood opened 1 year ago

ptheywood commented 1 year ago

During the knowledge exchange day (2023-01-13) it was raised that tracking GPU utilisation / power usage within a job could be useful.

This could be achieved by running a script in the background that polls nvidia-smi / dcgmi and records GPU usage to disk for the duration of the job.

An example of how to do this, along with a script, could be documented / provided.
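
As a starting point for discussion, here is a minimal sketch of such a polling script. It is not a tested or agreed implementation: the file name `gpu_monitor.py`, the function names, the choice of fields and the 5 second default interval are all assumptions made for illustration. It relies only on `nvidia-smi --query-gpu` with `--format=csv,noheader,nounits`, and writes one row per GPU per sample.

```python
import csv
import subprocess
import sys
import time

# Fields passed to `nvidia-smi --query-gpu` (assumed selection for illustration;
# `nvidia-smi --help-query-gpu` lists everything that is available).
FIELDS = ["timestamp", "index", "utilization.gpu", "power.draw", "memory.used"]


def parse_smi_output(text):
    """Split nvidia-smi CSV output (noheader, nounits) into rows of stripped strings."""
    return [
        [cell.strip() for cell in line.split(",")]
        for line in text.splitlines()
        if line.strip()
    ]


def sample():
    """Run nvidia-smi once and return one row per visible GPU."""
    cmd = [
        "nvidia-smi",
        "--query-gpu=" + ",".join(FIELDS),
        "--format=csv,noheader,nounits",
    ]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return parse_smi_output(result.stdout)


def monitor(path, interval=5.0):
    """Write samples to `path` every `interval` seconds until the process is killed."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(FIELDS)
        while True:
            writer.writerows(sample())
            f.flush()  # keep the file usable if the job ends abruptly
            time.sleep(interval)


# Hypothetical invocation: python gpu_monitor.py gpu_usage.csv 5
if __name__ == "__main__" and len(sys.argv) > 1:
    monitor(sys.argv[1], float(sys.argv[2]) if len(sys.argv) > 2 else 5.0)
```

Within a batch script this could be started in the background before the main workload (for example `python gpu_monitor.py gpu_usage.csv 5 &`) and stopped with `kill $!` or similar once the workload finishes. Polling at a coarse interval keeps the overhead low, at the cost of missing short spikes in utilisation or power draw.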

This type of metric can be very useful for software developers going forward, to understand the impact of their jobs.