edisj / Comet


Code proofread #1

Open edisj opened 3 years ago

edisj commented 3 years ago

Hi @orbeckst,

I've created a separate Git repository for each cluster, since pushing to and pulling from 4 different sources was getting too messy. I've linked the code I used to benchmark on Comet and an example notebook showing how I took the averages.

If you get a chance, can you please scan over it? I've been over it many times and don't think there are any errors, but a second set of eyes would give me peace of mind. We can also go over it on Wednesday, so no worries either way.

Here's the actual benchmark function: https://github.com/edisj/Comet/blob/6662ba7a50d01018d58b3b979f3f45e48829e545/benchmarks/1-full_IO/scripts/full_IO_bench.py#L23-L122

I can't figure out how to link specific lines in the Jupyter notebook, so here's a link to the whole notebook: https://github.com/edisj/Comet/blob/main/benchmarks/example_analysis.ipynb

Can you please look at the functions reduce_to_means() and all_process_dataframe()?

reduce_to_means() loads the raw data arrays into the _dict dictionary, then goes through each repeat and takes the average across all ranks. It then takes the mean and standard deviation of those per-repeat averages across the repeats.
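To make the two-stage averaging easier to check, here is a minimal sketch of the idea using only the standard library. It is not the notebook's code: the function name reduce_to_means_sketch, the nested-list layout (timing name, then repeats, then per-rank times), and the "read" key are assumptions made for illustration.

```python
from statistics import mean, stdev

def reduce_to_means_sketch(raw):
    """Hypothetical stand-in for reduce_to_means().

    raw maps a timing name to a list of repeats; each repeat is a
    list of per-rank times (an assumed layout, not the real one).
    """
    summary = {}
    for name, repeats in raw.items():
        # Stage 1: collapse the ranks within each repeat to one number.
        per_repeat = [mean(rank_times) for rank_times in repeats]
        # Stage 2: mean and sample standard deviation across repeats.
        # stdev() needs at least two values, so fall back to 0.0.
        spread = stdev(per_repeat) if len(per_repeat) > 1 else 0.0
        summary[name] = (mean(per_repeat), spread)
    return summary

# Two repeats, two ranks each (made-up numbers).
example = {"read": [[1.0, 3.0], [5.0, 7.0]]}
result = reduce_to_means_sketch(example)
print(result["read"])
```

One detail that may be worth confirming in the notebook: statistics.stdev() and pandas' std() use the sample formula (ddof=1), while numpy.std() defaults to the population formula (ddof=0), so the reported error bars depend on which one is called.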

all_process_dataframe() initializes an (N_process x timings) matrix and fills in each row by calling reduce_to_means() to get the averaged times for the run with that number of processes.
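As a rough illustration of the row-by-row fill, the sketch below builds the table as plain nested lists. The names all_process_table_sketch and mean_timings, the runs layout, and the "open"/"read" columns are all hypothetical, chosen only to show the shape of the computation.

```python
from statistics import mean

def mean_timings(repeats_by_timing, columns):
    # Average over ranks within each repeat, then over repeats,
    # returning one value per requested column (assumed layout).
    return [mean(mean(rep) for rep in repeats_by_timing[col]) for col in columns]

def all_process_table_sketch(runs, columns):
    """Hypothetical stand-in for all_process_dataframe().

    runs maps a process count to {timing name: repeats of per-rank times}.
    Returns the sorted process counts and one row of means per count.
    """
    process_counts = sorted(runs)
    rows = [mean_timings(runs[n], columns) for n in process_counts]
    return process_counts, rows

# Made-up data for runs with 1 and 2 processes.
runs = {
    2: {"open": [[0.5, 1.5]], "read": [[4.0, 4.0]]},
    1: {"open": [[1.0]], "read": [[2.0], [6.0]]},
}
index, table = all_process_table_sketch(runs, ["open", "read"])
print(index, table)
```

If pandas is available, the result can be wrapped as pandas.DataFrame(table, index=index, columns=["open", "read"]), which gives the same rows-by-process-count layout described above.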

Once I have the data averaged in a nice table, I plot the timings by extracting the columns I'm interested in.

Thank you! Edis

orbeckst commented 3 years ago

README

full_IO_bench.py

notebook