DIAGNijmegen / pathology-whole-slide-data

A package for working with whole-slide data including a fast batch iterator that can be used to train deep learning models.
https://diagnijmegen.github.io/pathology-whole-slide-data/
Apache License 2.0

Extracting randomly rotated patches from the WSI. #22

Closed michelbotros closed 2 years ago

michelbotros commented 2 years ago

Hi Mart,

I am currently experimenting with data augmentations for training a segmentation model. These include rotating the patches that are retrieved with the batch iterator from this framework. After the rotation there is often a missing region that has to be filled, even though this context is available in the WSI. Would it be an idea to add a callback specifically for training that samples randomly rotated patches?

It could be implemented by extracting a slightly bigger patch (to get all the necessary context), rotating it by a random angle between -180 and 180 degrees, and then center cropping to the desired output size, for both the image and the mask.
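To make "slightly bigger" concrete, here is a minimal sketch of the geometry. It is not part of the library. The function names are made up for illustration, and 284 is only an example output size. For a square output of side s rotated by an angle θ, the source patch needs a side of at least s·(|cos θ| + |sin θ|), which peaks at s·√2 for 45 degrees.

```python
import math


def required_side(output_side: int, angle_deg: float) -> int:
    """Smallest square source side that still covers the center crop after rotation."""
    theta = math.radians(angle_deg)
    scale = abs(math.cos(theta)) + abs(math.sin(theta))
    # round() guards against floating-point noise before taking the ceiling
    return math.ceil(round(output_side * scale, 6))


def worst_case_side(output_side: int) -> int:
    """Source side that is sufficient for any rotation angle (45 degrees is the worst case)."""
    return math.ceil(output_side * math.sqrt(2))


# Hypothetical example: a 284-pixel output needs a 402-pixel source patch in the worst case.
print(worst_case_side(284))
```

Sampling at the worst-case size keeps the extraction size fixed, so every patch in a batch has the same shape before cropping.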

I think rotation augmentations are nice to have in digital pathology applications and might be best to implement during the patch extraction from the WSI.

P.S. I'm really curious what your experience is with applying data augmentations for the HookNet paper.

Best,

Michel

martvanrijthoven commented 2 years ago

Dear Michel,

In the batch iterator, you can use any albumentations augmentation. This augmentation callback was implemented by @thijsgelton via batch callbacks. Here is an example config file that shows how to configure callbacks, and the albumentations callback in particular: https://github.com/DIAGNijmegen/pathology-whole-slide-data/blob/main/tests/test_files/user_config.yml

You can also create a Batch or Sample callback to apply custom data augmentations. For example, you can subclass SampleCallback and use your own subclass. Once you add your callback to your config file, all patches and labels will be passed through it.

You can use random angle rotation in a custom SampleCallback and crop via the FitOutput sample callback. Please note that FitOutput is a sample callback, so the rotation should also be a sample callback and should be specified before the cropping callback. There is no batch cropping callback yet, but I will implement that soon.
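The ordering described above (rotate first, then crop) can be sketched as follows. This is an illustration under assumptions, not the library's code. The `SampleCallback` class here is a local stand-in whose `__call__(x_patch, y_patch)` interface is assumed, so check the real base class in wholeslidedata before subclassing it. `RandomRotate`, `CenterCrop`, `rotate_nearest` and `apply_callbacks` are hypothetical names, and `CenterCrop` only stands in for FitOutput. Nearest-neighbour sampling is used so that mask labels stay valid; for images, an interpolating rotation would probably look better.

```python
import numpy as np


class SampleCallback:
    """Local stand-in for the library's base class (interface is an assumption)."""

    def __call__(self, x_patch, y_patch):
        raise NotImplementedError


def rotate_nearest(patch, angle_deg):
    """Rotate a (H, W) or (H, W, C) array about its center with nearest-neighbour sampling."""
    height, width = patch.shape[:2]
    center_y, center_x = (height - 1) / 2.0, (width - 1) / 2.0
    theta = np.deg2rad(angle_deg)
    cos_t, sin_t = np.cos(theta), np.sin(theta)
    ys, xs = np.mgrid[0:height, 0:width]
    dy, dx = ys - center_y, xs - center_x
    # Inverse mapping: for every output pixel, look up the source pixel it comes from.
    src_x = np.rint(cos_t * dx + sin_t * dy + center_x).astype(int)
    src_y = np.rint(-sin_t * dx + cos_t * dy + center_y).astype(int)
    valid = (src_x >= 0) & (src_x < width) & (src_y >= 0) & (src_y < height)
    rotated = np.zeros_like(patch)  # pixels without a source stay zero
    rotated[valid] = patch[src_y[valid], src_x[valid]]
    return rotated


class RandomRotate(SampleCallback):
    """Rotates image and mask by the same random angle in [-180, 180)."""

    def __init__(self, seed=None):
        self._rng = np.random.default_rng(seed)

    def __call__(self, x_patch, y_patch):
        angle = self._rng.uniform(-180.0, 180.0)
        return rotate_nearest(x_patch, angle), rotate_nearest(y_patch, angle)


class CenterCrop(SampleCallback):
    """Crops image and mask to a centered square (stand-in for FitOutput)."""

    def __init__(self, size):
        self._size = size

    def __call__(self, x_patch, y_patch):
        return self._crop(x_patch), self._crop(y_patch)

    def _crop(self, patch):
        height, width = patch.shape[:2]
        top, left = (height - self._size) // 2, (width - self._size) // 2
        return patch[top:top + self._size, left:left + self._size]


def apply_callbacks(callbacks, x_patch, y_patch):
    """Apply sample callbacks in the order they are listed."""
    for callback in callbacks:
        x_patch, y_patch = callback(x_patch, y_patch)
    return x_patch, y_patch


# Rotation is listed before the crop, so the crop removes the empty corners.
image = np.ones((402, 402, 3), dtype=np.uint8)
mask = np.ones((402, 402), dtype=np.uint8)
x, y = apply_callbacks([RandomRotate(seed=0), CenterCrop(284)], image, mask)
print(x.shape, y.shape)
```

If the crop were listed first, the rotation would run on the small patch and the corners would again be filled with zeros, which is the problem the larger patch is meant to avoid.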

For training HookNet, we used spatial, color, noise, and stain augmentations. Because HookNet uses both low-resolution and high-resolution patches, you will have to be careful when applying distortive augmentations: the effect will be more severe in the low-resolution patch and can therefore misalign the patches.
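One way to read the misalignment remark is in physical units: the same pixel displacement covers more tissue in the low-resolution patch. The small sketch below illustrates this. The spacings (0.5 and 8.0 µm per pixel) and the 4-pixel shift are assumed values chosen for illustration, not settings taken from the HookNet paper.

```python
def physical_shift_um(shift_px: float, spacing_um_per_px: float) -> float:
    """Convert a pixel displacement into micrometers of tissue."""
    return shift_px * spacing_um_per_px


# Assumed example values, not taken from the HookNet paper.
HIGH_RES_SPACING = 0.5  # micrometers per pixel
LOW_RES_SPACING = 8.0   # micrometers per pixel
SHIFT_PX = 4.0          # displacement caused by a distortive augmentation

high = physical_shift_um(SHIFT_PX, HIGH_RES_SPACING)
low = physical_shift_um(SHIFT_PX, LOW_RES_SPACING)
print(f"high-res shift: {high} um, low-res shift: {low} um, ratio: {low / high}")
```

With these numbers the low-resolution patch moves 16 times further in tissue coordinates than the high-resolution patch, so the two patches no longer show the same location at their centers.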

I am working on documentation, but this is not done yet, so please let me know if anything is unclear.

Best wishes, Mart

michelbotros commented 2 years ago

Hi Mart,

Great, there are options already! I'll try those. I think I should manage with the information provided. Good to hear that you're working on documentation. I'll let you know if I have any further issues or questions.

Thanks,

Michel

yuling-luo commented 8 months ago

Hi Mart, I'm also in the process of applying augmentation techniques. The user_config you linked in your earlier reply can no longer be found. Could you share a new one? Thanks!

martvanrijthoven commented 8 months ago

Dear @yuling-luo,

You can add something like this to your user config:

batch_callbacks:
      - "*object": wholeslidedata.interoperability.albumentations.callbacks.AlbumentationsSegmentationBatchCallback
        augmentations:
          - RandomRotate90:
              p: 0.5
          - Flip:
              p: 0.5
          - RandomSizedCrop:
              p: 1
              min_max_height: [ 100, 200 ]
              height: 284
              width: 284
          - ElasticTransform:
              p: 0.5
              alpha: 45
              sigma: 6
              alpha_affine: 4
          - HueSaturationValue:
              hue_shift_limit: 0.2
              sat_shift_limit: 0.3
              val_shift_limit: 0.2
              p: 0.5
          - GridDistortion:
              p: 1.0
          - RandomBrightnessContrast:
              p: 0.4

Please note that you will need albumentations==1.2.1. Newer versions of albumentations won't work; I will have to update the callback to make it compatible with them.
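Since the pin is easy to miss, a small check before training can help. The sketch below uses only the Python standard library. The helper names are made up for illustration, and the exact-match rule simply mirrors the pin stated above.

```python
from importlib import metadata

REQUIRED_ALBUMENTATIONS = "1.2.1"  # version pin stated in this thread


def is_supported(installed_version: str, required: str = REQUIRED_ALBUMENTATIONS) -> bool:
    """Return True only when the installed version matches the pin exactly."""
    return installed_version == required


def check_albumentations() -> str:
    """Describe whether the installed albumentations matches the pinned version."""
    try:
        installed = metadata.version("albumentations")
    except metadata.PackageNotFoundError:
        return "albumentations is not installed"
    if is_supported(installed):
        return f"albumentations {installed} matches the pinned version"
    return f"albumentations {installed} found, but {REQUIRED_ALBUMENTATIONS} is required"


print(check_albumentations())
```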

Let me know if you have any trouble.