Improve data loading performance

Problem

Data loading during training was slow, with most of the time spent on I/O and augmentation.

Performance tweaks

Disabled MONAI's metadata tracking (see tutorial here)
Allowed decompressing the dataset ahead of time and caching the result on local temporary storage
- This creates a new zarr store in the local /tmp (or the Windows equivalent) if a dataset of the same name does not already exist there
- Chunks that are already initialized in an existing cache store are skipped (the cache must be deleted manually if the same store name now holds different data)
Stopped normalizing source channels (e.g. phase), where absolute intensity carries physical meaning (#221)
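
The skip-if-initialized behavior of the cache can be illustrated with a minimal, standard-library-only sketch. It is not the code in this PR: the function name `cache_chunks`, the one-file-per-chunk layout, and the `cache_root` argument are assumptions made for illustration, and a real zarr store would be written through the zarr API rather than as raw files.

```python
import tempfile
from pathlib import Path


def cache_chunks(store_name, chunks, cache_root=None):
    """Write chunks into a local cache directory, skipping existing ones.

    Hypothetical helper: `chunks` maps a chunk key to its decompressed bytes.
    Returns the list of keys that were actually written.
    """
    # Default to the platform's temporary directory (/tmp on Linux).
    root = Path(cache_root or tempfile.gettempdir()) / store_name
    root.mkdir(parents=True, exist_ok=True)  # no-op if the store already exists
    written = []
    for key, payload in chunks.items():
        target = root / key
        if target.exists():
            # Already initialized: the chunk is skipped even if the source
            # data changed, which is why stale caches need manual deletion.
            continue
        target.write_bytes(payload)
        written.append(key)
    return written
```

Calling the helper twice with the same store name writes only the chunks that were missing on the second call, and a chunk whose source data changed in between keeps its old cached contents.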

Behavior changes and fixes

Fixed the training/validation dataset split: it is now done at the FOV level (it used to happen at the sliding-window level)
Removed the duplicate architecture parameter in config files
Added a new data.caching parameter (defaults to false)
Added a new model.log_num_samples parameter (defaults to 8)
- Sample images are now logged for the first sample of the first model.log_num_samples batches
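
The difference between the two split levels in the list above can be sketched as follows. This is an illustration only, not the PR's implementation: `split_fovs`, `sliding_windows`, the FOV names, and the depth of 20 slices are hypothetical, chosen so that the window size matches the depth of 5 quoted in the results.

```python
import random


def split_fovs(fov_names, train_fraction=0.8, seed=0):
    """Shuffle FOV names and split them into train and validation lists."""
    shuffled = sorted(fov_names)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


def sliding_windows(fov_names, depth, window):
    """List (fov, z_start) pairs for every z-window in the given FOVs."""
    return [(fov, z) for fov in fov_names for z in range(depth - window + 1)]


# Hypothetical dataset: 300 FOVs, each 20 slices deep, 5-slice windows.
fovs = [f"fov_{i:03d}" for i in range(300)]
train_fovs, val_fovs = split_fovs(fovs)

# Windows are generated after the split, so every window of a given FOV
# ends up in exactly one of the two sets.
train_windows = sliding_windows(train_fovs, depth=20, window=5)
val_windows = sliding_windows(val_fovs, depth=20, window=5)
```

Splitting the pooled windows instead would let overlapping windows from one FOV land in both sets, so the validation data would share pixels with the training data.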

Result

After enabling caching, 64 data-loading workers can saturate an A100 GPU with $B \times C \times D \times W \times H = 32 \times 2 \times 5 \times 512 \times 512$ batches. Training on 300 FOVs (80/20 split) now takes 5 min/epoch.
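
For scale, the figures above can be turned into a few derived numbers. The element size is an assumption (float32, 4 bytes); the PR does not state the dtype.

```python
import math

batch_shape = (32, 2, 5, 512, 512)  # B, C, D, W, H from the text
elements = math.prod(batch_shape)   # 83,886,080 values per batch
bytes_per_element = 4               # assumption: float32
batch_mib = elements * bytes_per_element / 2**20  # 320.0 MiB per batch

n_fovs = 300
n_train = n_fovs * 80 // 100        # 240 training FOVs
n_val = n_fovs - n_train            # 60 validation FOVs

epochs_per_hour = 60 / 5            # 12.0 at 5 min/epoch
```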

Epoch/hour:

I have not investigated the impact of system RAM on file system caching performance. During the above test, a very large amount of RAM ($1536 \times 0.5 = 768$ GB; the decompressed dataset is 500 GB) was available for ZFS caching.
