ukri-excalibur / excalibur-tests

Performance benchmarks and regression tests for the ExCALIBUR project
https://ukri-excalibur.github.io/excalibur-tests/
Apache License 2.0

Picking columns to export CSV #260

Closed kaanolgu closed 4 months ago

kaanolgu commented 7 months ago

Fixes #252

series: [["partition", "cascadelake"],["partition", "volta"]]


- Then it would print the following (after commenting out the line `self.plot_generic(config["title"], df[columns][mask], config["x_axis"], config["y_axis"], series_filters)`; thanks @pineapple-cat):

Selected dataframe:

|   | tags | Triad_value | Triad_unit | spack_spec | partition |
|---|------|-------------|------------|------------|-----------|
| 0 | acc  | 12963.337   | MBytes/sec | babelstream%gcc@13.1.0 +acc | cascadelake |
| 1 | acc  | 13386.104   | MBytes/sec | babelstream%gcc@9.2.0 +acc cuda_arch=70 | volta |
| 2 | cuda | 846640.634  | MBytes/sec | babelstream%gcc@9.2.0 +cuda cuda_arch=70 | volta |
| 3 | omp  | 159131.926  | MBytes/sec | babelstream%gcc@13.1.0 +omp | cascadelake |
| 4 | tbb  | 99740.216   | MBytes/sec | babelstream%gcc@13.1.0 +tbb partitioner=auto | cascadelake |
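For readers following along, the `df[columns][mask]` selection in the printed output can be reproduced with a few lines of pandas. This is only a minimal sketch: it assumes each `series` entry is a `[column, value]` pair and that the pairs are combined with OR. `build_series_mask` is a hypothetical helper, not a function from the post-processing script, and the data are the rows printed above.

```python
import pandas as pd

# Rows copied from the "Selected dataframe" output in this comment.
df = pd.DataFrame({
    "tags": ["acc", "acc", "cuda", "omp", "tbb"],
    "Triad_value": [12963.337, 13386.104, 846640.634, 159131.926, 99740.216],
    "Triad_unit": ["MBytes/sec"] * 5,
    "spack_spec": [
        "babelstream%gcc@13.1.0 +acc",
        "babelstream%gcc@9.2.0 +acc cuda_arch=70",
        "babelstream%gcc@9.2.0 +cuda cuda_arch=70",
        "babelstream%gcc@13.1.0 +omp",
        "babelstream%gcc@13.1.0 +tbb partitioner=auto",
    ],
    "partition": ["cascadelake", "volta", "volta", "cascadelake", "cascadelake"],
})


def build_series_mask(frame, series):
    """Hypothetical helper: True where a row matches any [column, value] pair."""
    mask = pd.Series(False, index=frame.index)
    for column, value in series:
        mask |= frame[column] == value
    return mask


# Same series setting as in the YAML snippet above.
series = [["partition", "cascadelake"], ["partition", "volta"]]
columns = ["tags", "Triad_value", "Triad_unit", "spack_spec", "partition"]

mask = build_series_mask(df, series)
selected = df.loc[mask, columns]  # same rows and columns as df[columns][mask]
print(selected)
```

Using `df.loc[mask, columns]` does the row and column selection in one step, which avoids the intermediate copy that chained indexing creates.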


Is there an alternative method for exporting multiple columns that I might be missing? I am open to any feedback or suggestions.

Thank you!
kaanolgu commented 7 months ago

https://github.com/ukri-excalibur/excalibur-tests/commit/438be0257c39cf81bb1260e2a881559c2c10df3c

I made a change based on the idea from @ilectra and @pineapple-cat. Instead of adding a new "*_axis" entry to the YAML file, a new list of columns to extract to CSV is used, and `df_csv_export` is generated from the original dataframe.
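Roughly, the idea looks like the sketch below. It is an illustration under assumptions, not the code in the linked commit: the config key `csv_export_columns` and the `job_id` column are made up for the example, and only `df_csv_export` is a name taken from this comment.

```python
import io

import pandas as pd

# Stand-in for the original dataframe; "job_id" is a made-up column that
# should not end up in the export.
df = pd.DataFrame({
    "job_id": [101, 102],
    "tags": ["omp", "cuda"],
    "Triad_value": [159131.926, 846640.634],
    "partition": ["cascadelake", "volta"],
})

# Assumed shape of the new YAML entry: a plain list of column names.
config = {"csv_export_columns": ["tags", "Triad_value", "partition"]}

# Fail early if the config names a column the dataframe does not have.
missing = [c for c in config["csv_export_columns"] if c not in df.columns]
if missing:
    raise KeyError(f"columns not found in dataframe: {missing}")

# Build the export dataframe from the original one, keeping only the list.
df_csv_export = df[config["csv_export_columns"]]

# Write to an in-memory buffer here; a file path would work the same way.
buffer = io.StringIO()
df_csv_export.to_csv(buffer, index=False)
csv_text = buffer.getvalue()
print(csv_text)
```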

One question: is it necessary to apply the user-specified types to all relevant columns for the CSV export as well?
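If the types do turn out to be needed, one possible way to apply them is `DataFrame.astype` with a mapping restricted to the exported columns. This is a sketch only: `column_types` and its contents are hypothetical stand-ins for whatever the YAML provides.

```python
import pandas as pd

# Export dataframe whose numeric column was read as strings.
df_csv_export = pd.DataFrame({
    "tags": ["acc", "omp"],
    "Triad_value": ["12963.337", "159131.926"],
})

# Hypothetical user-specified types; "x_not_exported" is deliberately absent
# from the export to show that unrelated entries are skipped.
column_types = {"tags": "str", "Triad_value": "float64", "x_not_exported": "int"}

# Only cast the columns that are actually present in the export.
applicable = {
    col: dtype for col, dtype in column_types.items()
    if col in df_csv_export.columns
}
typed = df_csv_export.astype(applicable)
print(typed.dtypes)
```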

ilectra commented 7 months ago

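I think that, instead of creating a whole new df to export to csv, it would make more sense if you

  • add only the extra columns that you need in the csv_export part of the yaml, not all the axis and series etc.
  • keep those columns in the filtered dataframe, as you go along the processing/filtering, treating them as extra axes.
  • export the filtered df to csv, just before calling the plotting script (you won't call it, in your case, but that's the place to do the printing)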

kaanolgu commented 7 months ago

> I think that, instead of creating a whole new df to export to csv, it would make more sense if you
>
>   • add only the extra columns that you need in the csv_export part of the yaml, not all the axis and series etc.
>   • keep those columns in the filtered dataframe, as you go along the processing/filtering, treating them as extra axes.
>   • export the filtered df to csv, just before calling the plotting script (you won't call it, in your case, but that's the place to do the printing)

That's a great idea! I will try to implement this in a new commit
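The suggestion quoted above could be sketched along these lines. This is a rough outline under assumptions, not the eventual implementation: `filter_and_export` and its parameters are hypothetical names, and the mask stands in for whatever filtering the script already does.

```python
import io

import pandas as pd


def filter_and_export(df, axis_columns, extra_columns, mask, csv_target):
    """Hypothetical sketch: carry extra columns through filtering, then export."""
    # Treat the extra columns like additional axes, dropping duplicates
    # while preserving order.
    keep = list(dict.fromkeys(axis_columns + extra_columns))
    filtered = df.loc[mask, keep]
    # Export at the point where the plotting call would otherwise happen.
    filtered.to_csv(csv_target, index=False)
    return filtered


df = pd.DataFrame({
    "tags": ["acc", "cuda", "omp"],
    "Triad_value": [13386.104, 846640.634, 159131.926],
    "spack_spec": [
        "babelstream%gcc@9.2.0 +acc cuda_arch=70",
        "babelstream%gcc@9.2.0 +cuda cuda_arch=70",
        "babelstream%gcc@13.1.0 +omp",
    ],
    "partition": ["volta", "volta", "cascadelake"],
})

out = io.StringIO()  # a file path could be passed instead
filtered = filter_and_export(
    df,
    axis_columns=["tags", "Triad_value"],
    extra_columns=["partition"],
    mask=df["partition"] == "volta",
    csv_target=out,
)
print(out.getvalue())
```

Compared with building a separate export dataframe, this keeps a single filtered dataframe, so the CSV always reflects exactly the rows that would have been plotted.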

ilectra commented 7 months ago

> > I think that, instead of creating a whole new df to export to csv, it would make more sense if you
> >
> >   • add only the extra columns that you need in the csv_export part of the yaml, not all the axis and series etc.
> >   • keep those columns in the filtered dataframe, as you go along the processing/filtering, treating them as extra axes.
> >   • export the filtered df to csv, just before calling the plotting script (you won't call it, in your case, but that's the place to do the printing)
>
> That's a great idea! I will try to implement this in a new commit

@kaanolgu you might want to wait for the refactoring PR to be merged. It will change things quite a bit, hopefully for the better!

ilectra commented 4 months ago

@pineapple-cat @kaanolgu I think I addressed all my review comments, please have a last look and merge if happy!