boostorg / json

A C++11 library for parsing and serializing JSON to and from a DOM container in memory.
https://boost.org/libs/json
Boost Software License 1.0

benchmark logging #1026

Open sdarwin opened 1 month ago

sdarwin commented 1 month ago

Testing a new benchmark machine. "The (Hetzner) EX44 Dedicated server’s hidden gem is the Intel® Core™ i5-13500 processor."

Ubuntu 24.04. CPU year 2023.

After numerous optimizations, this is an example benchmark:

[Image: Screenshot from 2024-07-17 17-44-40]

There appear to be two unexplained disruptions, even though most services have been locked down. Nothing else should be using that processor, since it is isolated by cgroups. The newest Intel processors may treat the configured scaling frequencies only as a suggestion and occasionally override them.

The benchmark command is ./bench -i:b *.json.


The following idea may or may not help; it would be an experiment. Could the benchmark executable bench itself be enhanced to call external shell commands before and after each file (apache_builds, canada, citm_catalog, etc.)?

Log output to /tmp/json-benchmarks.log

Before processing a JSON file:

date +'%Y/%m/%d %H:%M:%S:%3N' >> /tmp/json-benchmarks.log
echo "Starting to process canada.json with clang" >> /tmp/json-benchmarks.log
lscpu -e >> /tmp/json-benchmarks.log
top -b -n 1 | head -n 20 >> /tmp/json-benchmarks.log

(run benchmark)

After processing a JSON file:

date +'%Y/%m/%d %H:%M:%S:%3N' >> /tmp/json-benchmarks.log
echo "Finished processing canada.json with clang" >>  /tmp/json-benchmarks.log
lscpu -e >> /tmp/json-benchmarks.log
top -b -n 1 | head -n 20 >> /tmp/json-benchmarks.log
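
The before/after commands above could be wrapped in a small helper so that every file gets the same pair of snapshots. This is only a sketch: the function names log_snapshot and bench_one and the LOG_FILE override are hypothetical, and it runs the logging from a wrapper script rather than from inside bench. The %3N milliseconds format assumes GNU date.

```shell
#!/bin/sh
# Hypothetical helper (not part of bench). LOG_FILE is an assumed override;
# the default is the log path proposed in the issue.
LOG_FILE="${LOG_FILE:-/tmp/json-benchmarks.log}"

# Append one diagnostic snapshot to the log.
# $1 = message, e.g. "Starting to process canada.json with clang"
log_snapshot() {
    {
        date +'%Y/%m/%d %H:%M:%S:%3N'
        echo "$1"
        # Diagnostics are best-effort: skip tools that are missing or fail.
        if command -v lscpu >/dev/null 2>&1; then
            lscpu -e 2>/dev/null || true
        fi
        if command -v top >/dev/null 2>&1; then
            top -b -n 1 2>/dev/null | head -n 20 || true
        fi
    } >> "$LOG_FILE"
}

# Run one benchmark file with a snapshot on either side.
# $1 = path to bench, $2 = JSON file, $3 = toolchain label (e.g. clang)
bench_one() {
    log_snapshot "Starting to process $2 with $3"
    bench_status=0
    "$1" -i:b "$2" || bench_status=$?
    log_snapshot "Finished processing $2 with $3"
    return "$bench_status"
}
```

A call such as bench_one ./bench canada.json clang would then produce both log entries around a single run.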

sdarwin commented 1 month ago

If it's simpler to implement, not all of the above commands are required.

Imagine another shell script running in the background on the server that, once per second, records the CPU frequency, top output (to view processes), and other diagnostics. Then bench doesn't need to run those commands itself. What matters is being able to correlate the start and stop times of the individual benchmarks with the diagnostics.

Then bench doesn't need to call any external scripts. (Well, if it could mention the current git commit, that might be interesting. But not critical at all.)
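
A background sampler along those lines might look like the sketch below. The function names, the SAMPLE_LOG path, and the "=== timestamp" record separator are all assumptions made for illustration. The per-core frequency is read from the Linux cpufreq sysfs files (values in kHz) where they exist, and the timestamp uses the same layout as the proposed bench output so the two logs can be lined up. The %3N format assumes GNU date.

```shell
#!/bin/sh
# Hypothetical once-per-second sampler; names and log path are assumptions.
SAMPLE_LOG="${SAMPLE_LOG:-/tmp/json-benchmarks-diagnostics.log}"

# Append one sample: a timestamp line, per-core frequencies, and top output.
sample_once() {
    {
        echo "=== $(date +'%Y-%m-%d-%H:%M:%S:%3N')"
        # Current frequency per core in kHz, if the cpufreq interface exists.
        for f in /sys/devices/system/cpu/cpu[0-9]*/cpufreq/scaling_cur_freq; do
            if [ -r "$f" ]; then
                echo "$f $(cat "$f")"
            fi
        done
        if command -v top >/dev/null 2>&1; then
            top -b -n 1 2>/dev/null | head -n 20 || true
        fi
    } >> "$SAMPLE_LOG"
}

# Sample once per second.
# $1 = number of samples to take (omit to run until killed)
sample_loop() {
    remaining="${1:-}"
    while :; do
        sample_once
        if [ -n "$remaining" ]; then
            remaining=$((remaining - 1))
            if [ "$remaining" -le 0 ]; then
                break
            fi
        fi
        sleep 1
    done
}
```

The loop would be started in the background (sample_loop &) before ./bench -i:b *.json and killed once the run finishes.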

On further consideration, the output could stay close to what bench already prints:

Parse mesh.json,clang x64/sse2,boost (pool),3200,5006,441
Parse mesh.json,clang x64/sse2,boost (pool),3339,5225,441
Parse mesh.json,clang x64/sse2,boost (pool),3339,5221,441
Parse mesh.json,clang x64/sse2,boost (pool),3339,5220,441
Parse mesh.json,clang x64/sse2,boost (pool),3339,5218,442

except with a timestamp and a Starting or Completed marker on each line:

2024-07-18-08:10:32:725 Starting Parse mesh.json,clang x64/sse2,boost (pool)
2024-07-18-08:10:33:822 Completed Parse mesh.json,clang x64/sse2,boost (pool),3200,5006,441
2024-07-18-08:10:34:321 Starting Parse mesh.json,clang x64/sse2,boost (pool)
2024-07-18-08:10:35:425 Completed Parse mesh.json,clang x64/sse2,boost (pool),3339,5225,441
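
With timestamps in this fixed-width layout, correlating runs with diagnostics reduces to plain string comparison, because the timestamps sort lexicographically. The sketch below assumes a hypothetical diagnostics log in which each sample starts with a line of the form "=== timestamp"; that format and the function names bench_windows and samples_between are illustrative and not something bench provides.

```shell
#!/bin/sh
# Print "start end label" for each Starting/Completed pair in a bench log
# that uses the timestamped format proposed above.
# $1 = bench log
bench_windows() {
    awk '
        $2 == "Starting"  { start = $1 }
        $2 == "Completed" {
            label = $0
            sub(/^[^ ]+ Completed /, "", label)
            print start, $1, label
        }
    ' "$1"
}

# Print the diagnostic samples whose timestamp lies within [lo, hi].
# Assumes each sample begins with a line "=== <timestamp>" (hypothetical).
# $1 = start timestamp, $2 = end timestamp, $3 = diagnostics log
samples_between() {
    awk -v lo="$1" -v hi="$2" '
        # Appending "" forces string (not numeric) comparison.
        /^=== / { keep = (($2 "") >= (lo "") && ($2 "") <= (hi "")) }
        keep
    ' "$3"
}
```

Feeding each start and end pair from bench_windows into samples_between would show what the machine was doing during a run that produced an outlier.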

grisumbras commented 1 month ago

Ok, I can do that.