albumentations-team / albumentations

Fast and flexible image augmentation library. Paper about the library: https://www.mdpi.com/2078-2489/11/2/125
https://albumentations.ai
MIT License

Supported mask formats with Albumentations #1973

Open PRFina opened 1 week ago

PRFina commented 1 week ago

Your Question

From the documentation, both the API reference and the [user guide](https://albumentations.ai/docs/getting_started/mask_augmentation/) sections, it's not straightforward to understand which mask formats are supported and, more importantly, whether different mask formats can lead to different transformation outputs due to internal implementation details. Take, for example, a semantic segmentation task with 3 classes: A, B, and C, where each class has an associated mask Ma, Mb, Mc stored as a separate file. Besides RLE encoding and similar sparse formats, the most basic ways to encode a dense mask and augment a sample are:

Now, my questions are:

ternaus commented 1 week ago

Albumentations supports all 3. Performance is similar, and the results will be the same, because the same transforms are used under the hood. The only difference is the back-and-forth format conversion.
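To illustrate the format conversion point, here is a minimal NumPy-only sketch. The list of formats in the question did not survive in this thread, so the three used here are an assumption: a list of separate binary masks, one stacked multi-channel mask, and one single-channel index mask. The `hflip` helper is a hypothetical stand-in for a spatial transform, not Albumentations' implementation, and the masks are made-up toy data.

```python
import numpy as np

# Hypothetical 4x4 binary masks for classes A, B, and C (non-overlapping).
ma = np.zeros((4, 4), dtype=np.uint8)
mb = np.zeros((4, 4), dtype=np.uint8)
mc = np.zeros((4, 4), dtype=np.uint8)
ma[0, :2] = 1
mb[1:3, 2:] = 1
mc[3, 0] = 1

# Assumed format 1: a list of separate single-channel binary masks.
mask_list = [ma, mb, mc]

# Assumed format 2: one multi-channel mask of shape (H, W, num_classes).
mask_stack = np.stack(mask_list, axis=-1)

# Assumed format 3: one single-channel index mask
# (0 = background, 1 = A, 2 = B, 3 = C). Lossless only without class overlap.
mask_index = np.zeros((4, 4), dtype=np.uint8)
for class_id, m in enumerate(mask_list, start=1):
    mask_index[m == 1] = class_id


def hflip(mask):
    """Stand-in spatial transform: mirror the mask along the width axis."""
    return np.ascontiguousarray(mask[:, ::-1])


# Apply the same transform to each format.
flipped_list = [hflip(m) for m in mask_list]
flipped_stack = hflip(mask_stack)
flipped_index = hflip(mask_index)

# Convert formats 2 and 3 back to per-class binary masks and compare.
from_stack = [flipped_stack[..., c] for c in range(3)]
from_index = [(flipped_index == c).astype(np.uint8) for c in (1, 2, 3)]

same = all(
    np.array_equal(a, b) and np.array_equal(a, c)
    for a, b, c in zip(flipped_list, from_stack, from_index)
)
print(same)
```

In this sketch the three formats differ only in the packing and unpacking steps around the transform. It covers a pixel-permuting transform on non-overlapping masks only; it does not show what happens with interpolating transforms or overlapping classes.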

Thank you for the question, will update the docs.