darshan-hpc / darshan

Darshan I/O characterization tool

logutils workaround for heatmap time skew #945

Closed carns closed 1 year ago

carns commented 1 year ago

This is the utility-side counterpart to #942.

This will "repair" logs on the fly that were generated under the conditions described in https://github.com/darshan-hpc/darshan/issues/941 so that the heatmap bin count stays uniform/rectangular despite slight timing skew across ranks. This condition is only possible for logs generated with Darshan 3.4.3 with shared record reduction explicitly disabled.
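As an illustration only, the kind of on-the-fly repair described above can be sketched as follows. This is not the code from this PR: the function name `normalize_heatmap_bins`, the dictionary layout (rank mapped to a list of per-bin values), and the choice to zero-pad shorter records rather than truncate longer ones are all assumptions made for the example.

```python
from typing import Dict, List


def normalize_heatmap_bins(records: Dict[int, List[int]]) -> Dict[int, List[int]]:
    """Return a copy of `records` in which every rank has the same bin count.

    Hypothetical helper: `records` maps an MPI rank to its per-bin values.
    Ranks whose records ended slightly early (timing skew) are padded with
    zero-valued bins so that the result is rectangular.
    """
    if not records:
        return {}
    # Use the longest record as the target width so no data is discarded.
    target = max(len(bins) for bins in records.values())
    return {
        rank: bins + [0] * (target - len(bins))
        for rank, bins in records.items()
    }


skewed = {0: [4, 2, 1], 1: [3, 5]}
repaired = normalize_heatmap_bins(skewed)
```

Here rank 1 is one bin short of rank 0, so the sketch appends a single zero bin to it; the input dictionary is left unmodified.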

@nafi3, are you able to confirm that you can now parse existing logs that have the mismatched heatmap bin problem? (Not the negative value problem; that is something different.)

Fixes #941

Nafi3 commented 1 year ago

@carns can confirm. I don't see the error anymore with the old logs. :)

carns commented 1 year ago

Just force-pushed an update that squashes the two commits and corrects the log message to reference the right versions.

shanedsnyder commented 1 year ago

LGTM, merging.