GMLC-TDC / helics_benchmark_results

Repo containing helics_benchmark results and processing code
BSD 3-Clause "New" or "Revised" License

messageSendResults non-standard format #1

Closed trevorhardy closed 4 years ago

trevorhardy commented 4 years ago

All tests that include the core type in the benchmark name are formatted with "multiCore" before the first "/" delimiter:

BM_mgen_multiCore/inprocCore/4096/2/real_time

"messageSendResults" formats things slightly differently, which makes extracting the core type from the benchmark name unnecessarily tricky (especially for a programmer like me).

"BM_sendMessage/tcpMultiCore/1/64/iterations:1/real_time"

Can the formatting of "messageSendResults" be updated to conform to the other benchmarks? If so, I'll update the benchmark names for the existing files.

phlptp commented 4 years ago

I can make that change. You will probably see a few more results in the current format, though, before the update gets merged.

trevorhardy commented 4 years ago

Sounds good. I'll update the files in the repository now.

trevorhardy commented 4 years ago

To be clear, here are the changes I'm making:

"BM_sendMessage/singleCore/1/1/iterations:1/real_time" becomes "BM_sendMessage_singleFed/1/1/iterations:1/real_time",

and

"BM_sendMessage/tcpssMultiCore/64/1/iterations:1/real_time" becomes "BM_sendMessage_multiCore/tcpssCore/64/1/iterations:1/real_time"

nightlark commented 4 years ago

Oh I remember why I did it this way -- the BM_sendMessage part is the name of the function called, and there is a large amount of duplicate code in the other benchmarks that use two separate functions. I think it also seemed like it might be easier to work with the single function version for adapting it from a single machine benchmark to a multimachine benchmark.

phlptp commented 4 years ago

Would this be better? BM_sendMessage/singleCore/1/1/iterations:1/real_time becomes BM_sendMessage/singleFed/1/1/iterations:1/real_time,

and

BM_sendMessage/tcpssMultiCore/64/1/iterations:1/real_time becomes BM_sendMessage/multiCore/tcpssCore/64/1/iterations:1/real_time

With the single function call for all tests, I don't quite know how to rename them right now without the extra / after BM_sendMessage.

nightlark commented 4 years ago

The BM_*/[multi|single]Core/*Core format seems like it should be convenient for parsing, with / as a delimiter for easily getting the benchmark name and core setup.
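
As a rough illustration of why the "/"-delimited layout is convenient, here is a minimal Python sketch that splits a name in the proposed format into its parts. It follows the examples given earlier in this thread (multiCore followed by a core type, singleFed with no core type); the `BenchmarkName` type and `parse_name` function are hypothetical names made up for this example and are not part of the repository.

```python
from typing import List, NamedTuple, Optional


class BenchmarkName(NamedTuple):
    benchmark: str            # e.g. "sendMessage" (the "BM_" prefix is dropped)
    setup: str                # "multiCore" or "singleFed"
    core_type: Optional[str]  # e.g. "tcpssCore"; None when there is no core field
    rest: List[str]           # remaining fields, left uninterpreted


def parse_name(name: str) -> BenchmarkName:
    """Split a benchmark name of the proposed form on '/'."""
    fields = name.split("/")
    benchmark = fields[0][3:] if fields[0].startswith("BM_") else fields[0]
    setup = fields[1]
    if setup == "multiCore":
        # Assumption: multiCore names carry the core type in the next field.
        return BenchmarkName(benchmark, setup, fields[2], fields[3:])
    # Assumption: singleFed names have no core-type field.
    return BenchmarkName(benchmark, setup, None, fields[2:])


if __name__ == "__main__":
    print(parse_name("BM_sendMessage/multiCore/tcpssCore/64/1/iterations:1/real_time"))
```

A name with fewer fields than expected raises an IndexError here; a real script would probably want to report such names rather than crash.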

trevorhardy commented 4 years ago

That works for me.

If this is the new standard I'm fine with that BUT....

It will take more work to retroactively change the existing results files. I can do that but it will take time. Any way to work around this or do I just need to bite the bullet?

trevorhardy commented 4 years ago

And if we're going to monkey with the format, could we add in extra meta-data to make it clear what the numeric values in the name mean (/8/ becomes federates:8 or federates=8)?

If we make all these changes to the naming convention, is it faster to re-run the benchmarks or to have me write a script to update the benchmark names? I'm guessing the latter is four or so hours of work (which I'm happy to do); how long would it take to re-run the benchmarks?

phlptp commented 4 years ago

Most of that line is generated automatically by Google Benchmark, so we don't have a lot of control over how it gets generated. The first part is the name of the function call; the second part is the specific name of the test, which is arbitrary. The numbers are then attached by Google Benchmark, so we have no control over that. As far as time goes, I have been kicking off a benchmark run before leaving in the evening or running it overnight, since it usually takes a couple of hours and takes over the computer. So I don't want to just throw away the data we have.
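
Since the numeric fields cannot be labeled at the source, one option is to separate the pieces of the name after the fact. The Python sketch below is only an illustration under stated assumptions: purely numeric fields are treated as benchmark arguments, fields containing ":" as key:value options, and any other text as a label (before the first argument) or a flag (after it). The function name `split_fields` is hypothetical, and what each positional number means would still have to come from the benchmark source.

```python
def split_fields(name: str) -> dict:
    """Classify the '/'-separated fields of a benchmark name."""
    function, *fields = name.split("/")
    parsed = {"function": function, "labels": [], "args": [], "options": {}, "flags": []}
    for field in fields:
        if field.isdigit():
            # Positional numeric argument; its meaning is not encoded in the name.
            parsed["args"].append(int(field))
        elif ":" in field:
            key, value = field.split(":", 1)
            parsed["options"][key] = value
        elif parsed["args"] or parsed["options"]:
            # Text after the arguments, e.g. "real_time".
            parsed["flags"].append(field)
        else:
            # Text before the arguments, e.g. the test name.
            parsed["labels"].append(field)
    return parsed


if __name__ == "__main__":
    print(split_fields("BM_sendMessage/tcpMultiCore/1/64/iterations:1/real_time"))
```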

trevorhardy commented 4 years ago

Sure, I can understand that.

Assuming you, @phlptp, are on board with the new benchmark name convention (BM_*/[multi|single]Core/*Core), I'll get to work on a script to change the names of the benchmarks.

nightlark commented 4 years ago

Here's some sed stuff that might help with renaming. Running sed with -i will edit the input files in-place.

sendMessage benchmark format: sed -r -e 's~BM_([^\/]*)\/(.*)MultiCore~BM_\1\/multiCore\/\2Core~g' -e 's~BM_([^\/]*)\/singleCore~BM_\1\/singleFed~g'

Other benchmarks format: sed -r -e 's~BM_([^\/]*)_multiCore\/([^\/]*)~BM_\1\/multiCore\/\2~g' -e 's~BM_([^\/]*)_singleCore~BM_\1\/singleFed~g'

trevorhardy commented 4 years ago

Oh, OK. I'll give those a shot; thanks!

trevorhardy commented 4 years ago

I'm able to get the sed command to run but it's not making the substitution (I'm using GNU sed on my Mac). I'm going to keep messing around to see if I can get it to behave, but if not, I'll write my own script.

If it's working for you, @nightlark, and if you're up for it, feel free to batch this on your end and commit the changes.

nightlark commented 4 years ago

Of course macOS has sed with differences... https://unix.stackexchange.com/a/131940

I think the difference that stands out the most is that macOS sed uses -E instead of -r to enable extended regular expressions, which these commands need for the unescaped capture groups.

nightlark commented 4 years ago

macOS (/BSD) differences... it also needs -i to be followed by an explicit (here empty) backup extension.

sendMessage benchmark format: sed -i '' -E -e 's~BM_([^\/]*)\/(.*)MultiCore~BM_\1\/multiCore\/\2Core~g' -e 's~BM_([^\/]*)\/singleCore~BM_\1\/singleFed~g'

Other benchmarks format: sed -i '' -E -e 's~BM_([^\/]*)_multiCore\/([^\/]*)~BM_\1\/multiCore\/\2~g' -e 's~BM_([^\/]*)_singleCore~BM_\1\/singleFed~g'
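
For anyone who would rather sidestep the GNU/BSD sed differences, the same four substitutions can be written with Python's `re` module. This is a minimal sketch that mirrors the sed expressions above; it is not the conversion script that was eventually committed, and the names `SEND_MESSAGE_RULES`, `OTHER_RULES`, and `rename` are made up for this example. The text is processed one line at a time, as sed does, so that a pattern like `[^/]*` cannot match across line breaks.

```python
import re

# Rules for the sendMessage benchmark format.
SEND_MESSAGE_RULES = [
    (re.compile(r"BM_([^/]*)/(.*)MultiCore"), r"BM_\1/multiCore/\2Core"),
    (re.compile(r"BM_([^/]*)/singleCore"), r"BM_\1/singleFed"),
]

# Rules for the other benchmarks' format.
OTHER_RULES = [
    (re.compile(r"BM_([^/]*)_multiCore/([^/]*)"), r"BM_\1/multiCore/\2"),
    (re.compile(r"BM_([^/]*)_singleCore"), r"BM_\1/singleFed"),
]


def rename(text: str, rules) -> str:
    """Apply each rule in order to every line of text, like sed -e ... -e ..."""
    renamed = []
    for line in text.splitlines(keepends=True):
        for pattern, replacement in rules:
            line = pattern.sub(replacement, line)
        renamed.append(line)
    return "".join(renamed)


if __name__ == "__main__":
    print(rename("BM_sendMessage/tcpssMultiCore/64/1/iterations:1/real_time", SEND_MESSAGE_RULES))
    print(rename("BM_mgen_multiCore/inprocCore/4096/2/real_time", OTHER_RULES))
```

To rewrite a results file, one would read its contents, pass them through `rename` with the matching rule set, and write the result back, ideally after keeping a copy of the original.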

trevorhardy commented 4 years ago

Perfect! Both of those work for me now; thank you!

(From my other use of sed I knew there were differences and I thought I could avoid all of them by using GNU sed. Maybe that's not the case or maybe I messed something else up.)

trevorhardy commented 4 years ago

All done.

The conversion script ("v1_v2_bm_name_converter.py") is in the scripts folder. We can run this every time we get more results until the new naming convention kicks in.