DIAGNijmegen / nnUNet

Apache License 2.0

Adding basic dockerized testing environment system #3

Closed silvandeleemput closed 2 years ago

silvandeleemput commented 2 years ago

Relates to #2

I created a basic, lightweight, self-bootstrapping testing system for the nnUNet repository.

The pytests that I have added so far are:

  1. tests for prediction
  2. tests for downloading pretrained models
nlessmann commented 2 years ago

Looks good overall!

silvandeleemput commented 2 years ago
  • The images that you added for the tests, can we add those to a public repository?
  • I don't really get why you download the model both in the run_tests.sh script and in one of the tests (test_nnunet_download_pretrained). Wouldn't it be possible to combine this and download the files only once, and then only check if the correct files are there in the test?

It's just a test for the nnunet downloading mechanism, to ensure that the download mechanism works and the correct files are downloaded. Afterward, the files downloaded by the test are removed again by pytest.

nlessmann commented 2 years ago
  • The images that you added for the tests, can we add those to a public repository?

I am not sure I fully get what you mean by this question. We can of course push the docker image to the diag docker repo if you like. Or do you mean something more public like dockerhub?

No, I mean are these images that we can add here since this is a public repository? Or are these subject to some license that does not allow us to "publish" them here?

  • I don't really get why you download the model both in the run_tests.sh script and in one of the tests (test_nnunet_download_pretrained). Wouldn't it be possible to combine this and download the files only once, and then only check if the correct files are there in the test?

It's just a test for the nnunet downloading mechanism, to ensure that the download mechanism works and the correct files are downloaded. Afterward, the files downloaded by the test are removed again by pytest.

Ah okay, but couldn't you then just test whether the right files exist in the resources folder, since you already downloaded them into that folder using the same functions from nnunet?

silvandeleemput commented 2 years ago

No, I mean are these images that we can add here since this is a public repository? Or are these subject to some license that does not allow us to "publish" them here?

Good question. The two input images are, I believe, a subset of the Task04_Hippocampus dataset, which is distributed under the Creative Commons Attribution-ShareAlike 4.0 International Public License. The rest are derived images (segmentations), so those should be fine as well.

The Creative Commons Attribution-ShareAlike 4.0 International Public License states that it is okay to copy and redistribute the material, but it must be redistributed under the same license and with appropriate credit:

If supplied, you must provide the name of the creator and attribution parties, a
copyright notice, a license notice, a disclaimer notice, and a link to the material.
In 4.0, you must indicate if you modified the material and retain an indication of
previous modifications. 

So, I think it should be fine as long as we add the proper license and attribution.

Source license explanation: https://ceili.stanford.edu/License-Summary.pdf

Source data with license: https://drive.google.com/drive/folders/1HqEgzS8BV2c7xYNrZdEAnrHk7osJJ--2

silvandeleemput commented 2 years ago

Ah okay, but could you then not just test if the right files exist in the resources folder since you already downloaded them into that folder using the same functions from nnunet?

While that's true to some extent, it comes with a risk. Normally you set up your testing environment only once, and from then on all the tests should pass. Now suppose that, after developing for some time, you accidentally break the nnUNet download code. If the test only checks for file existence, it will still pass even though the download code is broken. In the current form of testing, the broken download code is detected by a failing test.

I agree that it seems a bit excessive, but I think it is the more careful testing approach.

Note that if performance is an issue (downloading adds some overhead), you can run only a subset of the pytests and exclude this particular test.
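One way to make excluding the slow test convenient is a custom pytest marker. The marker name `download` below is an illustrative choice, not something defined in this PR:

```python
import pytest

# With a marker on the slow test, a developer can skip it locally while CI
# still runs the full suite:
#
#   pytest -m "not download"   # everything except the download test
#   pytest                     # full suite, download test included
#
# Custom markers should be registered in pytest.ini (or setup.cfg) to
# avoid "unknown marker" warnings.

@pytest.mark.download
def test_nnunet_download_pretrained():
    """Exercises the real download mechanism (slow, needs network access)."""
    ...  # body as in this PR
```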