DIAGNijmegen / nnUNet

Apache License 2.0

Adding basic dockerized testing environment system #3

Closed silvandeleemput closed 2 years ago

silvandeleemput commented 2 years ago

Relates to #2

I created a basic, lightweight, self-bootstrapping testing system for the nnUNet repository.

The pytests that I have added so far are:

  1. tests for prediction
  2. tests for downloading pretrained models
nlessmann commented 2 years ago

Looks good overall!

silvandeleemput commented 2 years ago
  • The images that you added for the tests, can we add those to a public repository?
  • I don't really get why you download the model both in the run_tests.sh script and in one of the tests (test_nnunet_download_pretrained). Wouldn't it be possible to combine this and download the files only once, and then only check if the correct files are there in the test?

It's just a test for the nnunet downloading mechanism, to ensure that the download mechanism works and the correct files are downloaded. Afterward, the files downloaded by the test are removed again by pytest.

nlessmann commented 2 years ago
  • The images that you added for the tests, can we add those to a public repository?

I am not sure I fully get what you mean by this question. We can of course push the docker image to the diag docker repo if you like. Or do you mean something more public like dockerhub?

No, I mean are these images that we can add here since this is a public repository? Or are these subject to some license that does not allow us to "publish" them here?

  • I don't really get why you download the model both in the run_tests.sh script and in one of the tests (test_nnunet_download_pretrained). Wouldn't it be possible to combine this and download the files only once, and then only check if the correct files are there in the test?

It's just a test for the nnunet downloading mechanism, to ensure that the download mechanism works and the correct files are downloaded. Afterward, the files downloaded by the test are removed again by pytest.

Ah okay, but couldn't you then just test whether the right files exist in the resources folder, since you already downloaded them into that folder using the same functions from nnunet?

silvandeleemput commented 2 years ago

No, I mean are these images that we can add here since this is a public repository? Or are these subject to some license that does not allow us to "publish" them here?

Good question. The two input images are, I believe, a subset of the Task04_Hippocampus dataset, which is distributed under the Creative Commons Attribution-ShareAlike 4.0 International Public License. The rest are derived images (segmentations), so those should be fine as well.

The Creative Commons Attribution-ShareAlike 4.0 International Public License states that it is okay to copy and redistribute the material, but it must be redistributed under the same license and with appropriate credit:

If supplied, you must provide the name of the creator and attribution parties, a
copyright notice, a license notice, a disclaimer notice, and a link to the material.
In 4.0, you must indicate if you modified the material and retain an indication of
previous modifications. 

So, I think it should be fine as long as we add the proper license and attribution.

Source license explanation: https://ceili.stanford.edu/License-Summary.pdf

Source data with license: https://drive.google.com/drive/folders/1HqEgzS8BV2c7xYNrZdEAnrHk7osJJ--2

silvandeleemput commented 2 years ago

Ah okay, but could you then not just test if the right files exist in the resources folder since you already downloaded them into that folder using the same functions from nnunet?

While that's true to some extent, it comes with a risk. Normally you set up your testing environment only once, and from then on all the tests should pass. Now suppose that, after developing for some time, you accidentally break the nnUNet download code. If the test only checks for file existence, it will still pass even though the download code is broken. In the current form of testing, the broken download code is detected by a failing test.

I agree that it seems a bit excessive, but I think it is the more careful testing approach.

Note that if performance is an issue (downloading adds some overhead), you can run only a subset of the pytests and exclude this particular test.
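One way to make excluding the slow test convenient is a custom pytest marker. The marker name `download` below is an illustrative choice, not something defined in this PR:

```python
import pytest

# With a marker on the slow test, a developer can skip it locally while CI
# still runs the full suite:
#
#   pytest -m "not download"   # everything except the download test
#   pytest                     # full suite, download test included
#
# Custom markers should be registered in pytest.ini (or setup.cfg) to
# avoid "unknown marker" warnings.

@pytest.mark.download
def test_nnunet_download_pretrained():
    """Exercises the real download mechanism (slow, needs network access)."""
    ...  # body as in this PR
```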