rivosinc / prometheus-slurm-exporter

Export select slurm metrics to prometheus
Apache License 2.0

[cli-fallback] job failed to parse mem string 0 #33

Closed abhinavDhulipala closed 7 months ago

abhinavDhulipala commented 7 months ago

Apparently mem can be an integer, or at least 0. We should change the default behavior to increment the error metric but continue parsing the JSON line by line, instead of returning no data at the first error.
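A minimal sketch of the "count the error and keep going" behavior described above, not the exporter's actual code. The names jobRecord and parseJobLines, the JSON field names, and the plain integer counter are all hypothetical; a real exporter would presumably increment a Prometheus counter instead.

```go
package main

import (
	"bufio"
	"bytes"
	"encoding/json"
)

// jobRecord is a hypothetical, simplified shape for one line of job JSON.
type jobRecord struct {
	Account string `json:"account"`
	Mem     string `json:"mem"`
}

// parseJobLines decodes one JSON object per line. A line that fails to
// decode is counted and skipped, so one bad record does not discard
// the rest of the scrape.
func parseJobLines(data []byte) ([]jobRecord, int, error) {
	var jobs []jobRecord
	parseErrors := 0 // stand-in for an error metric
	scanner := bufio.NewScanner(bytes.NewReader(data))
	for scanner.Scan() {
		line := bytes.TrimSpace(scanner.Bytes())
		if len(line) == 0 {
			continue // ignore blank lines
		}
		var job jobRecord
		if err := json.Unmarshal(line, &job); err != nil {
			parseErrors++
			continue // keep going instead of returning early
		}
		jobs = append(jobs, job)
	}
	// scanner.Err reports read failures (for example an over-long line),
	// which are distinct from per-line decode errors.
	return jobs, parseErrors, scanner.Err()
}
```

With this shape, the caller can still export every job that parsed and separately report how many lines were dropped.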

Error log:

time=2023-11-15T08:24:13.056-08:00 level=ERROR msg="job failed to parse with \"mem string 0 doesn't match regex ^(?P<num>([0-9]*[.])?[0-9]+)(?P<memunit>G|M|T)>

KasperSkytte commented 7 months ago

I can confirm this error on the newest Slurm version, 23.02.

abhinavDhulipala commented 7 months ago

Got the bug fixed. We should probably still change the logic to report errors line by line, but this works for now.
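For illustration only, one way a mem parser could accept a bare value such as 0 is to make the unit suffix optional. This is a sketch, not the fix that was actually merged. The pattern below reuses only the visible part of the regex from the truncated log line; the optional unit, the name parseMemMB, and the choice to treat a missing unit as megabytes are assumptions.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// memRegex is an assumed variant of the logged pattern: the unit suffix
// is optional, so bare values such as "0" match.
var memRegex = regexp.MustCompile(`^(?P<num>([0-9]*[.])?[0-9]+)(?P<memunit>[GMT])?$`)

// parseMemMB converts a mem string such as "4G" or "0" to megabytes.
// Assumption: a value with no unit suffix is already in megabytes.
func parseMemMB(s string) (float64, error) {
	m := memRegex.FindStringSubmatch(s)
	if m == nil {
		return 0, fmt.Errorf("mem string %q doesn't match regex %s", s, memRegex)
	}
	num, err := strconv.ParseFloat(m[memRegex.SubexpIndex("num")], 64)
	if err != nil {
		return 0, err
	}
	// An unmatched optional group yields "", which falls through unchanged.
	switch m[memRegex.SubexpIndex("memunit")] {
	case "G":
		num *= 1024
	case "T":
		num *= 1024 * 1024
	}
	return num, nil
}
```

Returning an error rather than panicking lets the caller count the failure and move on to the next job.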