PrincetonUniversity / jobstats


Space used by jobstats in Slurm database #2

Closed ssgituser closed 1 year ago

ssgituser commented 1 year ago

Is there any guidance for how much additional space each job record will need in the database that is used by slurmdbd?

plazonic commented 1 year ago

It is not an easy question to answer precisely: the size of each entry depends on how many nodes and GPUs a particular job uses (very short jobs store only 8 characters, indicating that there are no stats). We were concerned about this when we started, which is why the job summary JSON data (in compact form) is compressed before being base64 encoded. As a result the relationship with the number of nodes/GPUs won't be exactly linear, but it is probably not too far off.
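
To make the storage scheme concrete, here is a minimal sketch of a compress-then-base64 pipeline in Python. This is an illustration under assumptions, not the actual jobstats encoder: the real tool may use a different compression codec, framing, or field layout, and the summary structure shown is hypothetical.

```python
import base64
import json
import zlib

def encode_summary(summary: dict) -> str:
    """Compress a per-job summary and base64-encode it for admin_comment.

    Sketch only: the real jobstats tooling may use a different codec
    or add a version prefix; the field names below are hypothetical.
    """
    compact = json.dumps(summary, separators=(",", ":"))  # no whitespace
    return base64.b64encode(zlib.compress(compact.encode())).decode()

def decode_summary(blob: str) -> dict:
    """Invert encode_summary."""
    return json.loads(zlib.decompress(base64.b64decode(blob)))

# Example: a two-node job summary (hypothetical structure).
record = encode_summary({
    "nodes": {"node1": {"cpu_pct": 92.1}, "node2": {"cpu_pct": 88.7}},
    "gpus": 0,
})
print(len(record), "chars stored in admin_comment")
```

Because compression amortizes repeated structure across nodes, the stored size grows more slowly than the raw JSON would as node/GPU counts rise, which is why the relationship is only roughly linear.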

To give you an idea of average record size, I checked the minimum, maximum, and average size over the last million jobs on two of our clusters (think `SELECT AVG(LENGTH(admin_comment)) FROM cluster_job_table WHERE job_db_inx > 111111`). One cluster tends to run a lot of small jobs (including GPU jobs) and the other runs fewer, larger jobs:

| Cluster | Min (chars) | Max (chars) | Avg (chars) |
|---|---|---|---|
| Smaller jobs | 8 | 1476 | 39.4 |
| Larger jobs | 8 | 4804 | 37.3 |

Note that a lot of short jobs can skew the average down significantly.
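
If it helps with capacity planning, a back-of-envelope estimate follows directly from those averages. The inputs below are illustrative placeholders, not measurements:

```python
# Back-of-envelope: extra slurmdbd space from jobstats admin_comment data.
# All inputs are illustrative; plug in your cluster's real numbers.
jobs_per_year = 5_000_000      # hypothetical annual job count
avg_record_chars = 40          # roughly the averages reported above
overhead_factor = 1.1          # rough allowance for row/index overhead

extra_bytes = jobs_per_year * avg_record_chars * overhead_factor
print(f"~{extra_bytes / 1e6:.0f} MB per year")   # ~220 MB per year
```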

Overall this is still significantly more compact than long-term storage of Prometheus data.

ssgituser commented 1 year ago

Thank you for the information. This will be helpful in planning the deployment of the jobstats package.