We need a repeatable test suite using a medium-sized repository: large enough to have a moderate amount of complexity and multiple partitions (when implemented), yet small enough that we can include the whole thing in the main git repo. This should solve part of #13.
## Ideas

- create an initial repo, probably via random generators (e.g. the sequences example), with many commits (100?); generation should be seeded so runs are repeatable (see the first sketch after this list)
- check several things (second sketch after this list):
    - file names
    - hash sums of files (black-box test)
    - combined state sum of all data
    - partition sums
- using a saved copy, perform some operations, checking the result afterwards (third sketch after this list):
    - insertion
    - deletion
    - replacement
    - replacement causing movement to a new partition
    - a mixed commit combining several of the above ops
    - new snapshot creation
    - ...
- benchmarks?
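For repeatability, generation should be driven by a fixed seed. Below is a minimal sketch of deterministic element generation; it uses a hand-rolled xorshift64 PRNG so the output cannot drift when an external crate changes its algorithm, and the element layout is a placeholder rather than the real repo format:

```rust
/// Tiny xorshift64 PRNG: the same seed always yields the same sequence,
/// independent of any external crate, which keeps generated repos stable.
struct XorShift64(u64);

impl XorShift64 {
    fn next(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// Generate `n` (id, payload) elements deterministically from `seed`.
/// The payload shape is a placeholder for whatever the sequences
/// example actually produces.
fn gen_elements(seed: u64, n: usize) -> Vec<(u64, Vec<u8>)> {
    let mut rng = XorShift64(seed.max(1)); // xorshift must not start at 0
    (0..n)
        .map(|_| {
            let id = rng.next();
            let len = (rng.next() % 64) as usize;
            let payload = (0..len).map(|_| (rng.next() & 0xff) as u8).collect();
            (id, payload)
        })
        .collect()
}
```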
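The black-box checks can be phrased without knowing the on-disk format: hash each element (or file), then combine per-element sums into one state sum. The sketch below uses FNV-1a and an XOR combination, which makes the sum independent of element order; both are stand-ins for whatever checksums the repo format actually defines:

```rust
/// FNV-1a, as a stand-in for whatever checksum the repo format defines.
fn fnv1a(data: &[u8]) -> u64 {
    let mut h: u64 = 0xcbf2_9ce4_8422_2325;
    for &b in data {
        h ^= b as u64;
        h = h.wrapping_mul(0x0000_0100_0000_01b3);
    }
    h
}

/// Combined state sum over all elements. XOR makes the result
/// independent of iteration order.
fn state_sum<'a, I: IntoIterator<Item = (&'a u64, &'a Vec<u8>)>>(elts: I) -> u64 {
    elts.into_iter()
        .map(|(id, data)| fnv1a(&id.to_le_bytes()) ^ fnv1a(data))
        .fold(0, |acc, h| acc ^ h)
}
```

Because XOR is associative, the same `state_sum` can be computed per partition and the partition sums XORed together into a whole-repo sum.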
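The operation tests then reduce to: apply an op, recompute the sum, and check it against expectations. A sketch against a plain `HashMap` model, reusing `gen_elements` and `state_sum` from the sketches above (the real repo API is not assumed here):

```rust
use std::collections::HashMap;

#[test]
fn ops_change_state_sum_as_expected() {
    let mut repo: HashMap<u64, Vec<u8>> = gen_elements(42, 100).into_iter().collect();
    let before = state_sum(&repo);

    // Insertion: the sum must change, and deleting the same element
    // must restore it exactly (a property of the XOR combination).
    repo.insert(9999, b"new element".to_vec());
    assert_ne!(state_sum(&repo), before);
    repo.remove(&9999);
    assert_eq!(state_sum(&repo), before);

    // Replacement: overwrite an existing element's payload.
    let id = *repo.keys().next().unwrap();
    repo.insert(id, b"replacement".to_vec());
    assert_ne!(state_sum(&repo), before);
}
```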
## Requirements & difficulties
Many things could cause files to change. We need to verify that element data and metadata remain the same when the file format changes, when commits go into different log files, etc. One test will be whether the loaded partitions are identical (including all history); see the sketch below.
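One way to make "identical including all history" concrete is structural equality after a save/load round-trip. A sketch with hypothetical, simplified commit and partition types (the real types are not assumed; `save_and_reload` stands in for whatever write-then-read path is under test):

```rust
/// Hypothetical, simplified stand-ins for the real partition types.
#[derive(Debug, Clone, PartialEq)]
struct Commit {
    parent_sum: u64,
    state_sum: u64,
    ops: Vec<(u64, Option<Vec<u8>>)>, // id -> new data; None = deletion
}

#[derive(Debug, Clone, PartialEq)]
struct Partition {
    snapshots: Vec<Vec<(u64, Vec<u8>)>>,
    log: Vec<Commit>,
}

#[test]
fn round_trip_preserves_history() {
    let original = Partition {
        snapshots: vec![vec![(1, b"abc".to_vec())]],
        log: vec![Commit { parent_sum: 0, state_sum: 1, ops: vec![(1, None)] }],
    };
    // Placeholder for the real save-to-disk / load-back path.
    let save_and_reload = |p: &Partition| p.clone();
    let reloaded = save_and_reload(&original);
    // Deriving PartialEq on the full structure covers all history,
    // not just the latest state.
    assert_eq!(original, reloaded);
}
```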
We also need to handle different partitioning: compare repositories with and without partitioning, and check that different classification rules do not change the data (from the point of view of the whole repo); see the sketch below.
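With an XOR-combined sum as sketched under Ideas, partitioning invariance has a direct test: split the elements by a classification rule, sum each partition, XOR the partition sums, and require the result to equal the unpartitioned sum. The rules below are arbitrary placeholders:

```rust
use std::collections::HashMap;

#[test]
fn classification_does_not_change_repo_sum() {
    let repo: HashMap<u64, Vec<u8>> = gen_elements(7, 100).into_iter().collect();
    let unpartitioned = state_sum(&repo);

    // Two arbitrary classification rules (placeholders for real ones).
    let rules: [fn(u64) -> u64; 2] = [|id| id % 3, |id| id % 7];
    for rule in rules {
        let mut parts: HashMap<u64, HashMap<u64, Vec<u8>>> = HashMap::new();
        for (id, data) in &repo {
            parts.entry(rule(*id)).or_default().insert(*id, data.clone());
        }
        // The XOR of per-partition sums must equal the whole-repo sum,
        // whichever rule assigned elements to partitions.
        let combined = parts.values().map(|p| state_sum(p)).fold(0, |a, s| a ^ s);
        assert_eq!(combined, unpartitioned);
    }
}
```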
How is this going to work with Cargo? Ideally we want multiple test binaries sharing code and data files. Use a `cfg` to put shared code in the main library? And what about temporary data generated during testing: where does it go, and when is it cleaned up? A possible layout is sketched below.
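Cargo's conventions cover most of this: each file directly under `tests/` is compiled as its own test binary, while a `tests/common/` subdirectory holds a module they can all share without it being run as a test itself. Temporary data can go in an OS temp directory that removes itself. A sketch assuming the `tempfile` dev-dependency; the helper names and the `tests/data` path are illustrative:

```rust
// tests/common/mod.rs: shared by each test binary via `mod common;`.
// Cargo compiles files directly under tests/ as separate binaries, but
// skips subdirectories, so this module is shared rather than run.
use std::path::{Path, PathBuf};
use tempfile::TempDir;

/// Create a scratch directory for one test run. `TempDir` deletes the
/// directory when dropped, answering the cleanup question: temporary
/// data lives in the OS temp dir and vanishes when the test finishes.
pub fn scratch_dir() -> TempDir {
    tempfile::tempdir().expect("failed to create temp dir")
}

/// Illustrative helper: resolve a checked-in data file relative to the
/// crate root, so every test binary sees the same fixtures.
pub fn data_file(name: &str) -> PathBuf {
    Path::new(env!("CARGO_MANIFEST_DIR")).join("tests/data").join(name)
}
```

Note that `#[cfg(test)]` items in the library are not visible to integration test binaries, so shared helpers generally live either in a `tests/common` module like this or in the library behind a dedicated feature flag.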