Closed (grlee77 closed this 3 years ago)
Not sure if @constantinpape has any thoughts on the original design, but my vote would also be for no data files in the repo. (And if the previous files are too large in the history, stripping them)
A half-way point might be storing checksums.
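As a rough illustration of the checksum idea, here is a minimal sketch using only the Python standard library. The function names (`file_sha256`, `build_manifest`, `changed_files`) and the choice of SHA-256 are assumptions made for this example, not anything that exists in the repo. It records one digest per generated file so that a small manifest could be committed instead of the binary data.

```python
import hashlib
from pathlib import Path


def file_sha256(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of one file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def build_manifest(root):
    """Map every file under root (relative POSIX path) to its digest."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): file_sha256(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_files(root, manifest):
    """List paths that were added, removed, or modified relative to manifest."""
    current = build_manifest(root)
    return sorted(
        name
        for name in set(current) | set(manifest)
        if current.get(name) != manifest.get(name)
    )
```

The manifest could be dumped to JSON after generating the data and compared against on later runs. One caveat: this only helps if the writers produce byte-identical output across library versions, which may not hold for every compressor.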
Thanks for raising this @grlee77.
Initially, I did add all the data in order to just have some reference files for the zarr / n5 data format online. This repo then evolved and I agree that it makes more sense now to not store all the data any more.
However, I think it would still be useful to have some reference data here, which could be used by some (external) tools for static checks or similar.
So I would propose to only keep the data / add new data if it corresponds to a different spec version (with different compressors).
So, for now this would be `data.zr`, `data.n5`, `data.z3`. Instead of having them under `data` we could add a new folder `example` for this to avoid the issues @grlee77 reported.
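To make the "static checks" use case concrete, here is a minimal sketch of what an external tool might do with such reference data. It assumes the reference dataset is a Zarr v2 directory store, so each array has a `.zarray` JSON metadata file. The helper names `check_zarray` and `check_tree` are hypothetical, and the checks cover only a few basics of the v2 metadata layout.

```python
import json
from pathlib import Path

# Keys that the Zarr v2 spec lists for array metadata (.zarray).
REQUIRED_KEYS = {
    "zarr_format", "shape", "chunks", "dtype",
    "compressor", "fill_value", "order", "filters",
}


def check_zarray(path):
    """Return a list of human-readable problems found in one .zarray file."""
    meta = json.loads(Path(path).read_text())
    problems = []
    missing = REQUIRED_KEYS - meta.keys()
    if missing:
        problems.append(f"missing keys: {sorted(missing)}")
    if meta.get("zarr_format") != 2:
        problems.append(f"unexpected zarr_format: {meta.get('zarr_format')!r}")
    shape, chunks = meta.get("shape"), meta.get("chunks")
    if isinstance(shape, list) and isinstance(chunks, list):
        if len(shape) != len(chunks):
            problems.append("shape and chunks differ in rank")
    return problems


def check_tree(root):
    """Check every .zarray under root; map relative path to its problems."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): check_zarray(p)
        for p in sorted(root.rglob(".zarray"))
    }
```

Something like `check_tree("example/data.zr")` would then report per-array problems, where the `example` path is just the folder name proposed above.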
I wanted to raise a question of whether we should be storing any of the generated binary files in this repository? I did add them in #24 based on prior examples, but I see that the recent xtensor-zarr PR did not.
One annoyance is that if I run `make data` locally it will cause git to consider all of these files to be updated (although we could fix that with `.gitignore`). Another is that if I run the tests locally using the files stored in the repo (generating only the absent xtensor-zarr data via `make xtensor-zarr`), I see the following failure:
However, all tests pass if I regenerate the data with `make data` rather than using the ones stored in the repository.