brimdata / zed

A novel data lake based on super-structured data
https://zed.brimdata.io/
BSD 3-Clause "New" or "Revised" License

Re-baseline perf-compare with updated sample data and on t3.2xlarge #5111

Closed philrz closed 2 months ago

philrz commented 2 months ago

As mentioned in https://github.com/brimdata/zed-sample-data/pull/41, the zed-sample-data has been regenerated with Zeek v6.2.0. This PR adjusts the perf-compare automation to include a call to quiet(), which avoids a failure that would otherwise occur when the script tries to cut ts on the newly added loaded_scripts logs.
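For readers unfamiliar with why quiet() helps here, the following is a toy Python model of the behavior, not Zed code and not the perf-compare script itself. It assumes that loaded_scripts records carry no ts field and that a cut of a missing field yields a "missing" error which quiet() turns into a value that cut silently drops. The names MISSING, QUIET, field, and the cut helper are hypothetical stand-ins, and the sample records are made up for illustration.

```python
# Toy model (assumption): sentinels standing in for Zed's
# error("missing") and error("quiet") values.
MISSING = object()
QUIET = object()


def field(record, name):
    """Return the field value, or MISSING if the record lacks it."""
    return record.get(name, MISSING)


def quiet(value):
    """Turn a MISSING value into QUIET; pass anything else through."""
    return QUIET if value is MISSING else value


def cut(records, name, wrap=lambda v: v):
    """Project a single field from each record, dropping QUIET results."""
    out = []
    for rec in records:
        value = wrap(field(rec, name))
        if value is QUIET:
            continue  # quieted values are dropped rather than emitted
        out.append({name: value})
    return out


# Made-up sample records: one with ts, one without (like loaded_scripts).
records = [
    {"_path": "conn", "ts": "2018-03-24T17:15:21Z"},
    {"_path": "loaded_scripts", "name": "base/init-bare.zeek"},
]

plain = cut(records, "ts")               # second result holds MISSING
quieted = cut(records, "ts", wrap=quiet)  # second record is dropped
```

In this model the plain cut emits an error value for the record without ts, which is the kind of output a downstream step could choke on, while the quieted cut emits only the records that actually have the field.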

Because the data set has changed, re-running the numbers with current Zed was justified. Since we were re-baselining anyway, I also took the liberty of doing the new run on a t3.2xlarge instead of the prior t2.2xlarge (same specs, but different underlying hardware). Online consensus seems to be that AWS keeps nudging people toward the newer instance types so it can eventually sunset the older ones, so this helps save a few pennies and avoids a forced instance move at a less convenient time.

Between the new data and the different instance type, the perf numbers in the diff are not really apples-to-apples. That said, I did eyeball them on the assumption they should be close, and for the most part they are, though a bit slower in most cases. I'm not too worried, since Autoperf now does a better job of catching our perf regressions, whereas this perf-compare automation has become more about catching bugs related to reading and writing the many supported formats.