A PR to clean up various issues, mostly with files, causing problems in buildkite builds and tests (especially when run more than once). Specifically:
- Removes file size validation. This was not used in the benchmarks themselves, only when loading source data, so it should not affect benchmarks. If we're really worried about file integrity, let's test it some other way that causes fewer headaches.
- Changes the cache format to parquet so we can specify more options when writing and reading.
- Sets `coerce_timestamps="us"` when writing parquet to fix timestamp precision consistency issues.
- Reverts the timestamp precision change in asserts, because the coercion on write fixes the issue globally.
- Specifies schemas on read, because type inference fails when columns are all null (common in sample data subsets).
- Adds the `DRY_RUN` env var as in #115, because I'm sick of the mess tests make when they try to hit a conbench instance that doesn't exist.
Passes all tests (clear cache, then run 2x), and also passes when running this set of benchmarks via `conbench` with `ALL` or equivalent for source.
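
For context on `DRY_RUN`, a minimal sketch of how such a switch can gate publishing. This is an assumption about the general pattern, not the implementation from #115: the accepted values, `is_dry_run`, and `publish` are all hypothetical.

```python
import os


def is_dry_run(environ=os.environ):
    """Return True when DRY_RUN is set to a truthy value (accepted values are an assumption)."""
    return environ.get("DRY_RUN", "").strip().lower() in {"1", "true", "yes"}


def publish(result, post, environ=os.environ):
    """Send a result with `post`, unless DRY_RUN asks us to skip network calls."""
    if is_dry_run(environ):
        # Dry run: do not contact the (possibly nonexistent) server.
        return None
    return post(result)
```

With `DRY_RUN=1` set in the environment, `publish` returns without calling `post`, so local test runs never try to reach a server.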