darshan-hpc / darshan

Darshan I/O characterization tool

ENH: more Pythonic datastructure for derived_metrics #924

Open shanedsnyder opened 1 year ago

shanedsnyder commented 1 year ago

Currently, the derived_metrics value returned by PyDarshan's accumulate_records() routine is just a cdata object that mirrors the C-level struct for derived metrics. Working with it is awkward: callers need to know the C enum values to index into its arrays, among other details. See: https://github.com/darshan-hpc/darshan/blob/05efd1dac6814ab94b7e5f4d442cdf4cf8fd82cd/darshan-util/pydarshan/darshan/lib/accum.py#L64

We should see if we can make the data more Pythonic so it's more straightforward to use for PyDarshan consumers. Tyler had some quick suggestions elsewhere:

It is true that I don't like how there is a struct that then contains an array which has to be manually indexed based on knowledge of the C code to pull out some counter values. Maybe sketch out a design in another issue if you plan to flatten to a custom class or maybe even dataclass is appropriate... though dunno where it sits on priority side.
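
To make the dataclass idea concrete, here is a rough sketch of what a flattened wrapper might look like. It is not a proposed final design: the class names (`FileCategory`, `CategoryCounters`, `DerivedMetrics`), the `from_cdata()` helper, the enum ordering, and the field names are all assumptions for illustration and would need to be checked against the actual C struct and enum. A `SimpleNamespace` stands in for the cdata object so the snippet runs without PyDarshan.

```python
from dataclasses import dataclass
from enum import IntEnum
from types import SimpleNamespace
from typing import Dict


class FileCategory(IntEnum):
    # Hypothetical names and ordering; the real values must be kept in
    # sync with the file-category enum in the C code.
    ALL = 0
    READ_ONLY = 1
    WRITE_ONLY = 2


@dataclass(frozen=True)
class CategoryCounters:
    # Field names are assumed here, not taken from the C header.
    count: int
    total_read_volume_bytes: int
    total_write_volume_bytes: int


@dataclass(frozen=True)
class DerivedMetrics:
    total_bytes: int
    agg_perf_by_slowest: float
    categories: Dict[FileCategory, CategoryCounters]

    @classmethod
    def from_cdata(cls, raw) -> "DerivedMetrics":
        """Copy values out of a struct-like object into plain Python types.

        `raw` only needs attribute access plus an indexable
        `category_counters` sequence, which is how a cdata struct behaves.
        """
        categories = {
            cat: CategoryCounters(
                count=int(raw.category_counters[int(cat)].count),
                total_read_volume_bytes=int(
                    raw.category_counters[int(cat)].total_read_volume_bytes
                ),
                total_write_volume_bytes=int(
                    raw.category_counters[int(cat)].total_write_volume_bytes
                ),
            )
            for cat in FileCategory
        }
        return cls(
            total_bytes=int(raw.total_bytes),
            agg_perf_by_slowest=float(raw.agg_perf_by_slowest),
            categories=categories,
        )


def _fake_counters(count, rd, wr):
    # Stand-in for one element of the C-level counters array.
    return SimpleNamespace(
        count=count, total_read_volume_bytes=rd, total_write_volume_bytes=wr
    )


# Stand-in for the cdata object returned alongside accumulate_records().
fake_raw = SimpleNamespace(
    total_bytes=3072,
    agg_perf_by_slowest=12.5,
    category_counters=[
        _fake_counters(3, 1024, 2048),  # ALL
        _fake_counters(1, 1024, 0),     # READ_ONLY
        _fake_counters(2, 0, 2048),     # WRITE_ONLY
    ],
)

metrics = DerivedMetrics.from_cdata(fake_raw)
print(metrics.categories[FileCategory.READ_ONLY].total_read_volume_bytes)
```

With something along these lines, consumers would look up counters by a named category rather than a bare integer index, and the enum knowledge would live in one place instead of at every call site.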