darshan-hpc / darshan

Darshan I/O characterization tool

ENH: make DXT-based heatmap generation opt-in for summary tool #932

Closed shanedsnyder closed 1 year ago

shanedsnyder commented 1 year ago

    Generates a Darshan Summary Report

    positional arguments:
      log_path              Specify path to darshan log.

    optional arguments:
      -h, --help            show this help message and exit
      --output OUTPUT       Specify output filename.
      --enable_dxt_heatmap  Enable DXT-based versions of I/O activity heatmaps.
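
For reference, here is a minimal sketch of an `argparse` parser that would produce help text like the above. This is not the summary tool's actual source: `build_parser` is a hypothetical name, and using `store_true` for `--enable_dxt_heatmap` is an assumption based on the flag taking no value in the help output.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the help text shown in this PR."""
    parser = argparse.ArgumentParser(
        description="Generates a Darshan Summary Report"
    )
    parser.add_argument(
        "log_path", type=str, help="Specify path to darshan log."
    )
    parser.add_argument(
        "--output", type=str, help="Specify output filename."
    )
    # Assumption: the flag is a simple boolean switch that defaults to off,
    # which makes DXT-based heatmaps opt-in.
    parser.add_argument(
        "--enable_dxt_heatmap",
        action="store_true",
        help="Enable DXT-based versions of I/O activity heatmaps.",
    )
    return parser
```

With a parser like this, `build_parser().parse_args(["some.darshan"])` leaves `enable_dxt_heatmap` as `False`, so DXT heatmaps are only generated when the flag is passed explicitly.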



- display warning message if DXT module available but not plotted
![image](https://github.com/darshan-hpc/darshan/assets/24571836/c185590a-cd59-4c42-9e39-e852a601d9f2)

- update logic for reading in log records for summary tool to only read in modules that are necessary (_generic_ modules and heatmap module by default, and only DXT modules if `--enable_dxt_heatmap` is specified)
  - this required a bug fix in the DXT heatmap plotting code so that it no longer assumes DXT records are loaded just because the DXT modules are listed in `report.modules` -- users can read in record data at module granularity, so a module being listed in the log file doesn't guarantee its records are available
  - some quick report generation timing info from my laptop:
    - `e3sm_io_heatmap_only`: 8.266s
    - `e3sm_io_heatmap_and_dxt` (with `--enable_dxt_heatmap`):  34.892s
    - `e3sm_io_heatmap_and_dxt` (default, no DXT heatmap): 14.361s
    - `e3sm_io_heatmap_and_dxt` (default, no DXT heatmap, with above change): 8.169s
    - so, default timing is way down (34.892s to 14.361s), and with the improvement from above it is on par with the `e3sm_io_heatmap_only` case (8.169s vs. 8.266s)

- update existing tests to account for DXT heatmap suppression, as well as add new tests checking expected values when `--enable_dxt_heatmap` is specified and also checking for the warning message from above when appropriate
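
The module selection and warning logic described above can be sketched roughly as follows. This is an illustration only, not the code in this PR: the helper names (`modules_to_load`, `needs_dxt_warning`, `dxt_records_loaded`) are hypothetical, the contents of `GENERIC_MODULES` are an assumed example set, and records are modeled as a plain dict rather than a real `DarshanReport`.

```python
# Assumption: an illustrative subset of "generic" modules, not the tool's real list.
GENERIC_MODULES = ("POSIX", "MPI-IO", "STDIO")
HEATMAP_MODULE = "HEATMAP"
DXT_MODULES = ("DXT_POSIX", "DXT_MPIIO")


def modules_to_load(available, enable_dxt_heatmap=False):
    """Return the modules (in log order) whose records should be read."""
    selected = []
    for mod in available:
        if mod in GENERIC_MODULES or mod == HEATMAP_MODULE:
            selected.append(mod)
        elif mod in DXT_MODULES and enable_dxt_heatmap:
            # DXT records are only read when the user opts in.
            selected.append(mod)
    return selected


def needs_dxt_warning(available, enable_dxt_heatmap=False):
    """True if the log contains DXT data that will not be plotted."""
    return (not enable_dxt_heatmap) and any(m in DXT_MODULES for m in available)


def dxt_records_loaded(loaded_records, mod):
    """Check that records were actually read, not just listed in the log."""
    return bool(loaded_records.get(mod))
```

The last helper reflects the bug fix mentioned above: plotting code should check for loaded records rather than relying on the module being listed in the log.
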

shanedsnyder commented 1 year ago

This occurred to me while looking at this (though I'm not addressing it here), but I'm not really sure there's much value in having 2 types of heatmaps side-by-side (i.e., DXT-based and traditional) for each module. That might be helpful for us as developers when sanity checking things, but for a user it's probably confusing trying to understand the differences between them and what the point is of displaying both. The differences certainly aren't conveyed in the existing report, so users would have to be pretty knowledgeable about Darshan modules to get it.

I think it'd look much cleaner and more organized to show just one heatmap per line, so that users can quickly scan I/O activity vertically across different APIs. We would default to heatmap module data only, but users could force the use of finer-grained DXT data for heatmap generation using the option introduced in this PR.