LSSTDESC / DC2-production

Configuration, production, validation specifications and tools for the DC2 Data Set.
BSD 3-Clause "New" or "Revised" License

Unit testing and continuous integration for scripts/ #329

Open JulienPeloton opened 5 years ago

JulienPeloton commented 5 years ago

The modules and scripts sitting under scripts/ are not tested (in the sense of unit testing) and there is no continuous integration set up. I guess, based on the documentation, that some form of testing is run externally and manually, but it would be good to automate this.

The main difficulty here is probably defining an environment for tests (i.e. one that does not depend on files at NERSC). I would be happy to have a look at this, unless there is a good reason not to do it.
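To illustrate what "not dependent on files at NERSC" could look like, here is a minimal sketch, not a proposal for the final layout. It assumes a hypothetical environment variable `DC2_TEST_DATA` that points at a small test dataset; neither the variable nor the helper `find_test_data` exists in DC2-production. Tests that need the data are skipped, rather than failing, when the variable is unset or does not point at a directory.

```python
import os
import tempfile
import unittest
from pathlib import Path

# Hypothetical environment variable; not defined anywhere in DC2-production.
TEST_DATA_ENV = "DC2_TEST_DATA"


def find_test_data(env=None):
    """Return the test-data directory as a Path, or None if it is unavailable.

    `env` defaults to os.environ; passing a plain dict makes the lookup
    itself easy to test.
    """
    env = os.environ if env is None else env
    root = env.get(TEST_DATA_ENV)
    if root and Path(root).is_dir():
        return Path(root)
    return None


class TestWithExternalData(unittest.TestCase):
    """Example test case that skips itself when no test data is configured."""

    def setUp(self):
        self.data_dir = find_test_data()
        if self.data_dir is None:
            self.skipTest(f"{TEST_DATA_ENV} is not set to an existing directory")

    def test_directory_is_not_empty(self):
        # Placeholder check; real tests would run the scripts on the data.
        self.assertTrue(any(self.data_dir.iterdir()))
```

With this pattern the same test suite could run at NERSC, at CC-IN2P3, or on a laptop: only the value of the environment variable changes, and machines without the data report skipped tests instead of errors.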

wmwv commented 5 years ago

This is a very good idea that I haven't figured out how to do.

What's the minimum size dataset we would need to keep around to test these? Where should that test dataset live? Do we explicitly keep copies at NERSC and IN2P3 in pre-set paths?

What level of backwards compatibility should we strive for? I think I'd like to keep the flexibility to make breaking changes (e.g., I don't think we need to support the ability to regenerate everything for Run 1.1 from the latest master of DC2-production in the long term), but I agree that testing will mean that we only make intentionally breaking changes, not changes that break stuff for silly reasons.

JulienPeloton commented 5 years ago

Thanks @wmwv for the detailed input! I will write down an initial proposal based on what you list, and report here.

One more question though: in case we want to get a proper CI running (e.g. using Travis CI), we would need to isolate a small subset of the dataset that can be used from outside of NERSC and CCIN2P3. This subset could be put in git LFS, or similar, but naively it would have to be made public so that it can be accessed easily from within the CI. Is that possible (policy-wise)? Is that what we want? Or are NERSC and CCIN2P3 the only targets for tests?
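Wherever such a subset ends up living, it would help to pin its exact contents so that every site and the CI test against the same files. A minimal sketch, assuming the subset is a plain directory tree; the function names `build_manifest` and `find_mismatches` are hypothetical and not part of DC2-production:

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file under `root` (as a relative POSIX path) to its checksum."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def find_mismatches(root, manifest):
    """Return the sorted relative paths that are missing or whose checksum differs."""
    root = Path(root)
    bad = []
    for name, expected in manifest.items():
        path = root / name
        if not path.is_file() or sha256_of(path) != expected:
            bad.append(name)
    return sorted(bad)
```

The manifest itself is small enough to commit to the repository, so a copy of the subset at NERSC, at CCIN2P3, or downloaded inside a CI job could be checked against it before any test runs.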

yymao commented 5 years ago

We had explored the possibility of running CI tests on NERSC (see https://github.com/LSSTDESC/gcr-catalogs/issues/50) but had not achieved a fully working solution. Maybe we can revisit this?

JulienPeloton commented 5 years ago

Thanks @yymao, this would indeed be a perfect solution if it is possible. And once it is set up for one repo, it would be a great resource for all the others.

Apparently this is already possible for repos on GitLab thanks to their CI service (see https://www.nersc.gov/assets/Uploads/2017-02-06-Gitlab-CI.pdf). Let me investigate this possibility for a repo on GitHub.

katrinheitmann commented 2 years ago

Needs follow-up from CO WG.