facebookresearch / fastMRI

A large-scale dataset of both raw MRI measurements and clinical MRI images.
https://fastmri.org
MIT License

Subsample dataset by volume fix #122

Closed z-fabian closed 3 years ago

z-fabian commented 3 years ago

If sample_rate < 1.0, SliceDataset subsamples the training dataset by slices rather than by volumes. This is undesirable for two reasons:

  1. The function documentation says that sample_rate should control what fraction of the volumes is loaded.
  2. Subsampling by volumes makes more sense in practice. Randomly selected slices can produce very unbalanced datasets (possibly with many empty slices), whereas subsampling by volumes mimics the real-life case of having fewer MRI volumes for training.

This fix modifies SliceDataset to subsample by volumes when sample_rate < 1.0. If subsampling by slices is the desired behavior, then we should update the function's help text to reflect that.
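To illustrate the difference between the two strategies, here is a minimal sketch. It is not the SliceDataset code: the helper names `subsample_by_slice` and `subsample_by_volume`, the `seed` parameter, and the `(filename, slice_index)` example layout are assumptions made for illustration only.

```python
import random


def subsample_by_slice(examples, sample_rate, seed=0):
    """Keep a random fraction of individual slices (hypothetical helper)."""
    rng = random.Random(seed)
    shuffled = list(examples)
    rng.shuffle(shuffled)
    num_keep = round(len(shuffled) * sample_rate)
    return shuffled[:num_keep]


def subsample_by_volume(examples, sample_rate, seed=0):
    """Keep every slice of a random fraction of volumes (hypothetical helper)."""
    rng = random.Random(seed)
    # Each example is assumed to be a (filename, slice_index) pair,
    # so the filename identifies the volume.
    volumes = sorted({fname for fname, _ in examples})
    rng.shuffle(volumes)
    num_keep = round(len(volumes) * sample_rate)
    kept = set(volumes[:num_keep])
    return [ex for ex in examples if ex[0] in kept]


# Toy dataset: 10 volumes with 4 slices each.
examples = [(f"vol{v}.h5", s) for v in range(10) for s in range(4)]

by_slice = subsample_by_slice(examples, 0.5)
by_volume = subsample_by_volume(examples, 0.5)

# Both keep the same number of slices here, but only the volume-based
# version guarantees that each retained volume is complete.
volumes_in_by_volume = {fname for fname, _ in by_volume}
```

With slice-based subsampling, the retained slices are typically scattered across most of the volumes; with volume-based subsampling, they come from exactly half of the volumes, each kept whole.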

z-fabian commented 3 years ago

@mmuckley As you suggested, I added a new argument, volume_sample_rate. If it is not set, sample_rate behaves the same way as before. If it is set but sample_rate is not, we subsample by volumes. I changed the default values of both arguments to None, in which case 1.0 is used. This should be 100% backwards compatible.
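The argument handling described above could look roughly like the sketch below. The function name `resolve_sample_rates` is hypothetical, and raising an error when both arguments are set is an assumption; the comment above does not say how that case is handled.

```python
def resolve_sample_rates(sample_rate=None, volume_sample_rate=None):
    """Return (sample_rate, volume_sample_rate) with None replaced by 1.0.

    Hypothetical helper, not the actual SliceDataset code.
    """
    # Assumption: setting both rates at once is treated as ambiguous.
    if sample_rate is not None and volume_sample_rate is not None:
        raise ValueError(
            "Set either sample_rate (by slices) or volume_sample_rate "
            "(by volumes), not both."
        )
    # None means "use the full dataset", i.e. a rate of 1.0.
    if sample_rate is None:
        sample_rate = 1.0
    if volume_sample_rate is None:
        volume_sample_rate = 1.0
    return sample_rate, volume_sample_rate


defaults = resolve_sample_rates()
slices_only = resolve_sample_rates(sample_rate=0.5)
volumes_only = resolve_sample_rates(volume_sample_rate=0.25)
```

Callers that pass only sample_rate get the same slice-based behavior as before, which is what keeps the change backwards compatible.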