albumentations-team / autoalbument

AutoML for image augmentation. AutoAlbument uses the Faster AutoAugment algorithm to find optimal augmentation policies. Documentation - https://albumentations.ai/docs/autoalbument/
MIT License

DistributedDataParallel ? #2

Open learningyan opened 3 years ago

learningyan commented 3 years ago

Hi! Does this codebase support DistributedDataParallel now? Also, when I run a search on my own dataset, the loss keeps increasing. I use the same config format as the provided examples; what's wrong?

creafz commented 3 years ago

Hey, @learningyan

AutoAlbument doesn't support DistributedDataParallel yet, but it is on my roadmap, and I plan to add it in the next few months.
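For anyone wondering what DDP support involves on the PyTorch side, here is a minimal sketch of the standard DistributedDataParallel setup. This is generic PyTorch code with a toy model and dataset as stand-ins, not AutoAlbument code:

```python
import os

import torch
import torch.distributed as dist
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.utils.data import DataLoader, DistributedSampler, TensorDataset


def main():
    # torchrun sets LOCAL_RANK for each spawned process.
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    # Toy model and dataset; in practice these would be the real
    # model and augmentation search dataset.
    model = DDP(nn.Linear(32, 10).cuda(local_rank), device_ids=[local_rank])
    dataset = TensorDataset(torch.randn(1024, 32), torch.randint(0, 10, (1024,)))

    # DistributedSampler gives each process a disjoint shard of the data.
    sampler = DistributedSampler(dataset)
    loader = DataLoader(dataset, batch_size=64, sampler=sampler)

    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
    loss_fn = nn.CrossEntropyLoss()
    for epoch in range(3):
        sampler.set_epoch(epoch)  # reshuffle shards each epoch
        for x, y in loader:
            x, y = x.cuda(local_rank), y.cuda(local_rank)
            optimizer.zero_grad()
            loss_fn(model(x), y).backward()  # gradients are all-reduced by DDP
            optimizer.step()

    dist.destroy_process_group()


if __name__ == "__main__":
    main()
```

Such a script is launched with `torchrun --nproc_per_node=<num_gpus> train_ddp.py` (the file name is just an example), one process per GPU.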

As for the loss: I am currently creating a benchmark for AutoAlbument on multiple datasets for classification and segmentation. When this benchmark is finished, I can share more intuition behind loss values and their meaning, but for now, here is my experience with loss based on running AutoAlbument on multiple datasets:

jwitos commented 3 years ago

@creafz thanks a lot for this writeup. I'd love to hear more intuition behind the losses and how to assess the quality of an AutoAlbument run, e.g. to understand at least initially whether the training was successful. In some of my initial experiments, d_loss is fairly stable, although the value range is massive (e.g. between 1e-8 and 1e+8). Meanwhile, a_loss either always decreases or always increases, reaching 1e+9 / 1e-9 values a few epochs into the training.

creafz commented 3 years ago

Hey, @jwitos

I am currently finishing AutoAlbument experiments on datasets such as CIFAR10, ImageNet, and Pascal VOC. I plan to add a description of those experiments and the observed loss values to the documentation.

Briefly speaking, I think that the only representative metric for the quality of AutoAlbument training is "Average Parameter change" (that is, how much the augmentation parameters changed at the end of an epoch compared to the beginning of that epoch). This metric should decrease and then plateau at some value. However, I think this metric depends heavily on the size of the dataset, and if the dataset is small, it can be very noisy.
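To make the idea concrete, here is a rough sketch of how such a metric can be computed. This is an illustration of the concept, not the exact AutoAlbument implementation; the `nn.Linear` policy is just a stand-in for the module that holds the augmentation parameters:

```python
import copy

import torch
import torch.nn as nn


def average_parameter_change(policy_before: nn.Module, policy_after: nn.Module) -> float:
    """Mean absolute change of the policy's parameters over one epoch."""
    changes = []
    for p_before, p_after in zip(policy_before.parameters(), policy_after.parameters()):
        changes.append((p_after.detach() - p_before.detach()).abs().mean())
    return torch.stack(changes).mean().item()


# Usage sketch: snapshot the policy at the start of the epoch,
# train for one epoch, then compare the two sets of parameters.
policy = nn.Linear(4, 4)        # stand-in for the augmentation policy
snapshot = copy.deepcopy(policy)
# ... one epoch of policy updates would happen here ...
print(average_parameter_change(snapshot, policy))
```

If this value keeps oscillating instead of plateauing, that matches the noisy behavior I described for small datasets.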

Here are, for example, the TensorBoard logs for one of my CIFAR10 experiments - https://tensorboard.dev/experiment/hpqoQQEATAy9XhpDbvKSKA/#scalars&_smoothingWeight=0. "Average Parameter change" decreases toward the end of training, while a_loss and d_loss increase.

creafz commented 3 years ago

@jwitos I have added TensorBoard logs for AutoAlbument configs from the examples directory. Hope that helps - https://albumentations.ai/docs/autoalbument/metrics/

A few pieces of advice I could give:

saigontrade88 commented 7 months ago

@creafz: If I want to extend the code base to support multi-GPU processing, where should I start? Also, could you re-upload the TensorBoard logs for the CIFAR10, ImageNet, and Pascal VOC experiments, since the TensorBoard.dev service has been shut down? Many thanks.