spiraldb / vortex

An extensible, state-of-the-art columnar file format
https://vortex.dev
Apache License 2.0

Fuzz statistics calculations #1307

Open lwwmanning opened 4 hours ago

lwwmanning commented 4 hours ago

We should fuzz statistics by computing all possible stats on an array via array.statistics.compute_all(all::<Stat>()) or some such, canonicalizing that array, and then calling compute_statistics on the canonicalized array, one stat at a time.

If a statistic is present in the first set, it should also be present in the second set, with an equal value.
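
A rough, self-contained sketch of that check, for illustration only. Nothing here is Vortex's real API: the Stat enum, the run-length RunArray type, and the compute_all, canonicalize, compute_statistic, and check_stats functions are all hypothetical stand-ins, and stat values are simplified to i64. A real fuzz target would build the array from fuzzer-provided input rather than by hand.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a statistic kind (not Vortex's real `Stat`).
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Stat {
    Min,
    Max,
    NullCount,
}

/// Hypothetical toy encoding: (value, run length) pairs, where `None` is null.
struct RunArray {
    runs: Vec<(Option<i64>, usize)>,
}

impl RunArray {
    /// Decode into the "canonical" form: one slot per logical element.
    fn canonicalize(&self) -> Vec<Option<i64>> {
        self.runs
            .iter()
            .flat_map(|&(value, len)| std::iter::repeat(value).take(len))
            .collect()
    }

    /// Encoding-aware stats, computed on the runs without decoding.
    /// Min and Max are absent when the array has no non-null values.
    fn compute_all(&self) -> BTreeMap<Stat, i64> {
        let mut stats = BTreeMap::new();
        let mut null_count = 0i64;
        for &(value, len) in &self.runs {
            if len == 0 {
                continue; // empty runs hold no logical elements
            }
            match value {
                None => null_count += len as i64,
                Some(v) => {
                    let min = stats.entry(Stat::Min).or_insert(v);
                    *min = (*min).min(v);
                    let max = stats.entry(Stat::Max).or_insert(v);
                    *max = (*max).max(v);
                }
            }
        }
        stats.insert(Stat::NullCount, null_count);
        stats
    }
}

/// Reference computation of a single stat on the canonical values.
fn compute_statistic(values: &[Option<i64>], stat: Stat) -> Option<i64> {
    match stat {
        Stat::Min => values.iter().flatten().copied().min(),
        Stat::Max => values.iter().flatten().copied().max(),
        Stat::NullCount => Some(values.iter().filter(|v| v.is_none()).count() as i64),
    }
}

/// The property: every stat in the first set must also be present,
/// with an equal value, when recomputed on the canonical array.
fn check_stats(
    encoded_stats: &BTreeMap<Stat, i64>,
    canonical: &[Option<i64>],
) -> Result<(), String> {
    for (&stat, &expected) in encoded_stats {
        match compute_statistic(canonical, stat) {
            Some(actual) if actual == expected => {}
            Some(actual) => {
                return Err(format!(
                    "{stat:?}: encoded array reported {expected}, canonical array gave {actual}"
                ))
            }
            None => return Err(format!("{stat:?}: missing on canonical array")),
        }
    }
    Ok(())
}
```

The check is deliberately one-directional: stats found only on the canonical array are not compared, which matches the "present in the first set" wording above.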

gatesn commented 3 hours ago

Doesn't that break when you hard-code canonicalize to copy over all stats?

lwwmanning commented 2 hours ago

Nah, because compute_statistics bypasses the cached values altogether.
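
To illustrate the distinction being drawn here, a minimal sketch follows, assuming a hypothetical CachedArray type that holds a stats cache next to its data. The type, its string stat keys, and both methods are made up for the example and are not Vortex's API. It shows only that a fresh computation reads the data and ignores whatever was copied into the cache.

```rust
use std::collections::HashMap;

/// Hypothetical canonical array that carries a stats cache next to its data.
struct CachedArray {
    values: Vec<i64>,
    cache: HashMap<&'static str, i64>,
}

impl CachedArray {
    /// Cached lookup: trusts whatever was copied in, for example by canonicalize.
    fn get_cached(&self, stat: &str) -> Option<i64> {
        self.cache.get(stat).copied()
    }

    /// Fresh computation: reads only `values` and never consults `cache`.
    fn compute_statistic(&self, stat: &str) -> Option<i64> {
        match stat {
            "min" => self.values.iter().copied().min(),
            "max" => self.values.iter().copied().max(),
            _ => None,
        }
    }
}
```

Because the fuzz check compares against the fresh computation, a wrong value carried over in the cache cannot mask a bug in the original stat.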