TACC / tacc_stats

TACC Stats is an automated resource-usage monitoring and analysis package.
GNU Lesser General Public License v2.1

Dealing with job arrays #15

Closed espenfl closed 1 year ago

espenfl commented 4 years ago

Currently there is an issue with handling SLURM job arrays. It typically appears in update_db, but it also manifests itself in other places in the code base.

Typically, the job id field has entries like xxxxx_y or xxxxx_[y-z]. Collapsing these into a single integer, for example x+y, would most likely not work, since it could collide with other job ids. Also, the user can often specify y and z themselves.

Is the way forward to calculate some kind of unique id, or are there already mechanisms in place in the code for handling this?
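To make the collision concern concrete, here is a minimal sketch in Python. It is not code from tacc_stats: `JOB_ID_RE`, `parse_job_id` and `naive_sum_id` are hypothetical names, and the pattern only covers the three forms mentioned above (plain id, xxxxx_y, and xxxxx_[y-z]). Other array forms are out of scope here.

```python
import re

# Hypothetical pattern, covering only "12345", "12345_7" and "12345_[3-9]".
JOB_ID_RE = re.compile(
    r"^(?P<base>\d+)"
    r"(?:_(?:(?P<task>\d+)|\[(?P<lo>\d+)-(?P<hi>\d+)\]))?$"
)


def parse_job_id(raw):
    """Return (base_id, task_ids); task_ids is empty for a non-array job."""
    m = JOB_ID_RE.match(raw.strip())
    if m is None:
        raise ValueError(f"unrecognized job id: {raw!r}")
    base = int(m.group("base"))
    if m.group("task") is not None:
        return base, [int(m.group("task"))]
    if m.group("lo") is not None:
        lo, hi = int(m.group("lo")), int(m.group("hi"))
        return base, list(range(lo, hi + 1))
    return base, []


def naive_sum_id(raw):
    """The x+y idea: fold the first task id into the integer job id."""
    base, tasks = parse_job_id(raw)
    return base + (tasks[0] if tasks else 0)


# Array task 5 of job 1000 and the unrelated job 1005 map to the same integer.
collision = naive_sum_id("1000_5") == naive_sum_id("1005")
```

Keeping the base id and the task ids as separate values, as `parse_job_id` does, avoids this, because no two distinct job id strings are merged into one number.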

espenfl commented 4 years ago

Alternatively, maybe we should just update the id field to be a string instead of an integer?
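As a rough illustration of the string-id idea, the sketch below uses Python's standard `sqlite3` module with an in-memory database. The `job` table, its columns, and the owner names are made up for the example and do not reflect the actual tacc_stats schema.

```python
import sqlite3

# In-memory database with a hypothetical table; the id column is TEXT,
# so array-style ids are stored verbatim instead of being cast to int.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE job (id TEXT PRIMARY KEY, owner TEXT)")

rows = [
    ("1005", "alice"),   # ordinary job
    ("1000_5", "bob"),   # array task 5 of job 1000
    ("1000_6", "bob"),   # array task 6 of job 1000
]
conn.executemany("INSERT INTO job (id, owner) VALUES (?, ?)", rows)

# Lookups by the full string id stay unambiguous.
owner = conn.execute(
    "SELECT owner FROM job WHERE id = ?", ("1000_5",)
).fetchone()[0]
count = conn.execute("SELECT COUNT(*) FROM job").fetchone()[0]
```

The trade-off is that string ids sort lexicographically rather than numerically, so any code that relies on numeric ordering of job ids would need to be checked.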

rtevans commented 4 years ago

Thanks for bringing this issue to my attention. We don't use job arrays much here so it hasn't come up. There are occasionally some job arrays in our data (1 array every few days) but it doesn't look like we are processing them correctly. We certainly can make the id field a string and that would solve the problem from the database's perspective.

I don't know how the job id for each step appears in the raw stats files though. The job id used in the raw stats file is set during SLURM prologue so this may or may not be working. I will look into this. It's a capability we should have.

espenfl commented 4 years ago

Thanks a lot for the quick reply and your offer to look into this. Greatly appreciated.

espenfl commented 4 years ago

Did you find time to have a look at this?

stephenlienharrell commented 1 year ago

Job arrays are shown as jobid_arrayid. I am not 100% sure that this meets the requirement, but this is a very old request so I am not sure it is still relevant.