spinalcordtoolbox / spinalcordtoolbox

Comprehensive and open-source library of analysis tools for MRI of the spinal cord.
https://spinalcordtoolbox.com
GNU Lesser General Public License v3.0

Add full-sized images to `sct_testing_data` (or combine with `sct_example_data`) #3471

Open joshuacwnewton opened 3 years ago

joshuacwnewton commented 3 years ago

Currently, SCT maintains two different datasets for testing: sct_testing_data and sct_example_data.

The problem is, sometimes during testing, we really do want access to the full-sized images (see for example https://github.com/spinalcordtoolbox/spinalcordtoolbox/pull/3468#discussion_r671696767), and sct_testing_data starts to feel unrepresentative of real-world data.


At that point, if we are including full-sized images in sct_testing_data, is there still a benefit to keeping the two datasets separate? I'm wondering if it would be easier to keep both types of data (raw, processed) in the same dataset, then use that dataset for both batch_processing.sh and our test suite.

If we do merge them, we would have to think about dataset structure. For example, maybe we could follow a BIDS-like approach and use a derivatives folder for the pre-processed images?
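To make the BIDS-like idea concrete, here is a minimal sketch of how a merged dataset could resolve paths, with raw images under the dataset root and pre-processed images under a `derivatives/<pipeline>/` folder. The function name `image_path`, the `anat` subfolder, and the example pipeline name `labels` are assumptions for illustration only, not an existing SCT layout or API.

```python
from pathlib import Path
from typing import Optional


def image_path(root: str, subject: str, filename: str,
               pipeline: Optional[str] = None) -> Path:
    """Return the path of an image in a BIDS-like dataset (hypothetical helper).

    Raw images live directly under the dataset root; pre-processed images
    live under derivatives/<pipeline>/ and mirror the same subject layout.
    """
    base = Path(root)
    if pipeline is not None:
        # Pre-processed data goes in a per-pipeline derivatives folder.
        base = base / "derivatives" / pipeline
    return base / subject / "anat" / filename


# Raw, full-sized image (used by batch processing in this sketch).
raw = image_path("sct_data", "sub-01", "sub-01_T2w.nii.gz")
# Pre-processed counterpart (used by the test suite in this sketch).
processed = image_path("sct_data", "sub-01", "sub-01_T2w_seg.nii.gz",
                       pipeline="labels")
```

With a layout like this, both batch_processing.sh and the test suite could point at the same dataset root and differ only in whether they read from `derivatives`.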

joshuacwnewton commented 3 years ago

One problem with this is that we're currently redownloading the test dataset on every pytest run. If we increase the size of sct_testing_data, then redownloading starts to become a problem.

But, this is an issue we've already wanted to address, see: https://github.com/spinalcordtoolbox/spinalcordtoolbox/issues/2959.
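One way to avoid the repeated download is to skip it when a complete copy is already on disk. The sketch below is only an illustration of that idea, not how SCT handles it: `ensure_dataset`, the `fetch` callable, and the `.download_complete` marker file are all hypothetical names.

```python
from pathlib import Path
from typing import Callable


def ensure_dataset(data_dir: Path, fetch: Callable[[Path], None],
                   marker: str = ".download_complete") -> bool:
    """Download the dataset only if no completed copy exists (hypothetical helper).

    Returns True if a download was performed, False if the cached copy was reused.
    """
    data_dir = Path(data_dir)
    marker_path = data_dir / marker
    if marker_path.exists():
        # A previous run finished downloading; reuse what is on disk.
        return False
    data_dir.mkdir(parents=True, exist_ok=True)
    fetch(data_dir)       # caller-supplied download routine (assumption)
    marker_path.touch()   # written only after fetch returns without raising
    return True
```

Writing the marker after the fetch completes means an interrupted download is retried on the next run rather than being mistaken for a valid cache. A session-scoped pytest fixture could call a helper like this once per run.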

joshuacwnewton commented 3 years ago

> One problem with this is that we're currently redownloading the test dataset on every pytest run. If we increase the size of sct_testing_data, then redownloading starts to become a problem.

This was fixed by #3480. :slightly_smiling_face: