
Awesome-Mixup


Welcome to Awesome-Mixup, a carefully curated survey of Mixup algorithms implemented in the PyTorch library, aiming to meet the various needs of the research community. Mixup is a family of methods that focus on alleviating model overfitting and poor generalization. As a "data-centric" approach, Mixup can be applied to various training paradigms and data modalities.
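For newcomers, here is a minimal sketch of the vanilla input mixup from MixUp (ICLR'2018), on which most of the methods below build: a mixing ratio lambda is sampled from a Beta(alpha, alpha) distribution, and both the inputs and their one-hot labels are linearly interpolated with a randomly paired sample from the same batch. The helper name `mixup_batch` and the default `alpha=1.0` are illustrative choices, not code from any listed repository.

```python
import torch

def mixup_batch(x, y, alpha=1.0):
    """Vanilla MixUp: convexly combine a batch with a shuffled copy of itself.

    x: input batch of shape (B, ...); y: one-hot labels of shape (B, num_classes).
    Returns the mixed inputs, mixed labels, and the sampled mixing ratio lambda.
    """
    lam = torch.distributions.Beta(alpha, alpha).sample().item()  # lambda ~ Beta(alpha, alpha)
    index = torch.randperm(x.size(0))                             # random pairing within the batch
    mixed_x = lam * x + (1.0 - lam) * x[index]
    mixed_y = lam * y + (1.0 - lam) * y[index]
    return mixed_x, mixed_y, lam
```

In many PyTorch implementations the same effect is achieved by mixing the loss instead of the labels, i.e. computing `lam * criterion(pred, y_a) + (1 - lam) * criterion(pred, y_b)` on integer class targets rather than materializing mixed one-hot vectors.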

If this repository has been helpful to you, please consider giving it a ⭐️ to show your support. Your support helps us reach more researchers and contributes to the growth of this resource. Thank you!

Introduction

We summarize awesome mixup data augmentation methods for visual representation learning in various scenarios from 2018 to 2024.

The list of awesome mixup augmentation methods is summarized in chronological order and is continuously updated. The main branch is adapted from Awesome-Mixup in OpenMixup and Awesome-Mix, and we are working on a comprehensive survey of mixup augmentations. You can read our survey, A Survey on Mixup Augmentations and Beyond, for more detailed information.

Figure of Contents

You can directly view the figure of mixup augmentation methods that we summarize.

Table of Contents

    Sample Mixup Policies in SL
    1. Static Linear
    2. Feature-based
    3. Cutting-based
    4. K Samples Mixup
    5. Random Policies
    6. Style-based
    7. Saliency-based
    8. Attention-based
    9. Generating Samples
    Label Mixup Policies in SL
    1. Optimizing Calibration
    2. Area-based
    3. Loss Object
    4. Random Label Policies
    5. Optimizing Mixing Ratio
    6. Generating Label
    7. Attention Score
    8. Saliency Token
    Self-Supervised Learning
    1. Contrastive Learning
    2. Masked Image Modeling
    Semi-Supervised Learning
    1. Semi-Supervised Learning
    CV Downstream Tasks
    1. Regression
    2. Long-Tail Distribution
    3. Segmentation
    4. Object Detection
    Training Paradigms
    1. Federated Learning
    2. Adversarial Attack and Adversarial Training
    3. Domain Adaptation
    4. Knowledge Distillation
    5. Multi-Modal
    Beyond Vision
    1. NLP
    2. GNN
    3. 3D Point
    4. Other
  1. Analysis and Theorem
  2. Survey
  3. Benchmark
  4. Classification Results on Datasets
  5. Related Datasets Link
  6. Contribution
  7. License
  8. Acknowledgement
  9. Related Project

Sample Mixup Policies in SL

Static Linear

(back to top)

Feature-based

(back to top)

Cutting-based

(back to top)

K Samples Mixup

(back to top)

Random Policies

(back to top)

Style-based

(back to top)

Saliency-based

(back to top)

Attention-based

(back to top)

Generating Samples

(back to top)

Label Mixup Policies in SL

Optimizing Calibration

(back to top)

Area-based

(back to top)

Loss Object

(back to top)

Random Label Policies

(back to top)

Optimizing Mixing Ratio

(back to top)

Generating Label

(back to top)

Attention Score

(back to top)

Saliency Token

(back to top)

Self-Supervised Learning

Contrastive Learning

(back to top)

Masked Image Modeling

(back to top)

Semi-Supervised Learning

(back to top)

CV Downstream Tasks

Regression

(back to top)

Long-Tail Distribution

(back to top)

Segmentation

(back to top)

Object Detection

(back to top)

Other Applications

Training Paradigms

Federated Learning

(back to top)

Adversarial Attack and Adversarial Training

(back to top)

Domain Adaptation

(back to top)

Knowledge Distillation

(back to top)

Multi-Modal

(back to top)

Beyond Vision

NLP

(back to top)

GNN

(back to top)

3D Point

(back to top)

Other

(back to top)

Analysis and Theorem

(back to top)

Survey

Benchmark

(back to top)

Classification Results on Datasets

Classification results of mixup methods on general datasets: CIFAR10 / CIFAR100, Tiny-ImageNet, and ImageNet-1K. $(\cdot)$ denotes the number of training epochs, with backbones ResNet18 (R18), ResNet50 (R50), ResNeXt50 (RX50), PreActResNet18 (PreActR18), and Wide-ResNet28 (WRN28-10, WRN28-8).

| Method | Publish | CIFAR10 R18 | CIFAR100 R18 | CIFAR100 RX50 | CIFAR100 PreActR18 | CIFAR100 WRN28-10 | CIFAR100 WRN28-8 | Tiny-ImageNet R18 | Tiny-ImageNet RX50 | ImageNet-1K R18 | ImageNet-1K R50 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| MixUp | ICLR'2018 | 96.62(800) | 79.12(800) | 82.10(800) | 78.90(200) | 82.50(200) | 82.82(400) | 63.86(400) | 66.36(400) | 69.98(100) | 77.12(100) |
| CutMix | ICCV'2019 | 96.68(800) | 78.17(800) | 78.32(800) | 76.80(1200) | 83.40(200) | 84.45(400) | 65.53(400) | 66.47(400) | 68.95(100) | 77.17(100) |
| Manifold Mixup | ICML'2019 | 96.71(800) | 80.35(800) | 82.88(800) | 79.66(1200) | 81.96(1200) | 83.24(400) | 64.15(400) | 67.30(400) | 69.98(100) | 77.01(100) |
| FMix | arXiv'2020 | 96.18(800) | 79.69(800) | 79.02(800) | 79.85(200) | 82.03(200) | 84.21(400) | 63.47(400) | 65.08(400) | 69.96(100) | 77.19(100) |
| SmoothMix | CVPRW'2020 | 96.17(800) | 78.69(800) | 78.95(800) | - | - | 82.09(400) | - | - | - | 77.66(300) |
| GridMix | PR'2020 | 96.56(800) | 78.72(800) | 78.90(800) | - | - | 84.24(400) | 64.79(400) | - | - | - |
| ResizeMix | arXiv'2020 | 96.76(800) | 80.01(800) | 80.35(800) | - | 85.23(200) | 84.87(400) | 63.47(400) | 65.87(400) | 69.50(100) | 77.42(100) |
| SaliencyMix | ICLR'2021 | 96.20(800) | 79.12(800) | 78.77(800) | 80.31(300) | 83.44(200) | 84.35(400) | 64.60(400) | 66.55(400) | 69.16(100) | 77.14(100) |
| Attentive-CutMix | ICASSP'2020 | 96.63(800) | 78.91(800) | 80.54(800) | - | - | 84.34(400) | 64.01(400) | 66.84(400) | - | 77.46(100) |
| Saliency Grafting | AAAI'2022 | - | 80.83(800) | 83.10(800) | - | 84.68(300) | - | 64.84(600) | 67.83(400) | - | 77.65(100) |
| PuzzleMix | ICML'2020 | 97.10(800) | 81.13(800) | 82.85(800) | 80.38(1200) | 84.05(200) | 85.02(400) | 65.81(400) | 67.83(400) | 70.12(100) | 77.54(100) |
| Co-Mix | ICLR'2021 | 97.15(800) | 81.17(800) | 82.91(800) | 80.13(300) | - | 85.05(400) | 65.92(400) | 68.02(400) | - | 77.61(100) |
| SuperMix | CVPR'2021 | - | - | - | 79.07(2000) | 93.60(600) | - | - | - | - | 77.60(600) |
| RecursiveMix | NIPS'2022 | - | 81.36(200) | - | 80.58(2000) | - | - | - | - | - | 79.20(300) |
| AutoMix | ECCV'2022 | 97.34(800) | 82.04(800) | 83.64(800) | - | - | 85.18(400) | 67.33(400) | 70.72(400) | 70.50(100) | 77.91(100) |
| SAMix | arXiv'2021 | 97.50(800) | 82.30(800) | 84.42(800) | - | - | 85.50(400) | 68.89(400) | 72.18(400) | 70.83(100) | 78.06(100) |
| AlignMixup | CVPR'2022 | - | - | - | 81.71(2000) | - | - | - | - | - | 78.00(100) |
| MultiMix | NIPS'2023 | - | - | - | 81.82(2000) | - | - | - | - | - | 78.81(300) |
| GuidedMixup | AAAI'2023 | - | - | - | 81.20(300) | 84.02(200) | - | - | - | - | 77.53(100) |
| Catch-up Mix | AAAI'2023 | - | 82.10(400) | 83.56(400) | 82.24(2000) | - | - | 68.84(400) | - | - | 78.71(300) |
| LGCOAMix | TIP'2024 | - | 82.34(800) | 84.11(800) | - | - | - | 68.27(400) | 73.08(400) | - | - |
| AdAutoMix | ICLR'2024 | 97.55(800) | 82.32(800) | 84.42(800) | - | - | 85.32(400) | 69.19(400) | 72.89(400) | 70.86(100) | 78.04(100) |

Classification results of mixup methods on the ImageNet-1K dataset using ViT-based models: DeiT, Swin Transformer (Swin), Pyramid Vision Transformer (PVT), and ConvNeXt, trained for 300 epochs.

| Method | Publish | DeiT-Tiny | DeiT-Small | DeiT-Base | Swin-Tiny | PVT-Tiny | PVT-Small | ConvNeXt-Tiny |
|---|---|---|---|---|---|---|---|---|
| MixUp | ICLR'2018 | 74.69 | 77.72 | 78.98 | 81.01 | 75.24 | 78.69 | 80.88 |
| CutMix | ICCV'2019 | 74.23 | 80.13 | 81.61 | 81.23 | 75.53 | 79.64 | 81.57 |
| FMix | arXiv'2020 | 74.41 | 77.37 | - | 79.60 | 75.28 | 78.72 | 81.04 |
| ResizeMix | arXiv'2020 | 74.79 | 78.61 | 80.89 | 81.36 | 76.05 | 79.55 | 81.64 |
| SaliencyMix | ICLR'2021 | 74.17 | 79.88 | 80.72 | 81.37 | 75.71 | 79.69 | 81.33 |
| Attentive-CutMix | ICASSP'2020 | 74.07 | 80.32 | 82.42 | 81.29 | 74.98 | 79.84 | 81.14 |
| PuzzleMix | ICML'2020 | 73.85 | 80.45 | 81.63 | 81.47 | 75.48 | 79.70 | 81.48 |
| AutoMix | ECCV'2022 | 75.52 | 80.78 | 82.18 | 81.80 | 76.38 | 80.64 | 82.28 |
| SAMix | arXiv'2021 | 75.83 | 80.94 | 82.85 | 81.87 | 76.60 | 80.78 | 82.35 |
| TransMix | CVPR'2022 | 74.56 | 80.68 | 82.51 | 81.80 | 75.50 | 80.50 | - |
| TokenMix | ECCV'2022 | 75.31 | 80.80 | 82.90 | 81.60 | 75.60 | - | 73.97 |
| TL-Align | ICCV'2023 | 73.20 | 80.60 | 82.30 | 81.40 | 75.50 | 80.40 | - |
| SMMix | ICCV'2023 | 75.56 | 81.10 | 82.90 | 81.80 | 75.60 | 81.03 | - |
| MixPro | ICLR'2023 | 73.80 | 81.30 | 82.90 | 82.80 | 76.70 | 81.20 | - |
| LUMix | ICASSP'2024 | - | 80.60 | 80.20 | 81.70 | - | - | 82.50 |

(back to top)

Related Datasets Link

Summary of datasets used for mixup method tasks. Links to the dataset websites are provided.

| Dataset | Type | Label | Task | Total data number | Link |
|---|---|---|---|---|---|
| MNIST | Image | 10 | Classification | 70,000 | MNIST |
| Fashion-MNIST | Image | 10 | Classification | 70,000 | Fashion-MNIST |
| CIFAR10 | Image | 10 | Classification | 60,000 | CIFAR10 |
| CIFAR100 | Image | 100 | Classification | 60,000 | CIFAR100 |
| SVHN | Image | 10 | Classification | 630,420 | SVHN |
| GTSRB | Image | 43 | Classification | 51,839 | GTSRB |
| STL10 | Image | 10 | Classification | 113,000 | STL10 |
| Tiny-ImageNet | Image | 200 | Classification | 100,000 | Tiny-ImageNet |
| ImageNet-1K | Image | 1,000 | Classification | 1,431,167 | ImageNet-1K |
| CUB-200-2011 | Image | 200 | Classification, Object Detection | 11,788 | CUB-200-2011 |
| FGVC-Aircraft | Image | 102 | Classification | 10,200 | FGVC-Aircraft |
| StanfordCars | Image | 196 | Classification | 16,185 | StanfordCars |
| Oxford Flowers | Image | 102 | Classification | 8,189 | Oxford Flowers |
| Caltech101 | Image | 101 | Classification | 9,000 | Caltech101 |
| SOP | Image | 22,634 | Classification | 120,053 | SOP |
| Food-101 | Image | 101 | Classification | 101,000 | Food-101 |
| SUN397 | Image | 899 | Classification | 130,519 | SUN397 |
| iNaturalist | Image | 5,089 | Classification | 675,170 | iNaturalist |
| CIFAR-C | Image | 10,100 | Corruption Classification | 60,000 | CIFAR-C |
| CIFAR-LT | Image | 10,100 | Long-tail Classification | 60,000 | CIFAR-LT |
| ImageNet-1K-C | Image | 1,000 | Corruption Classification | 1,431,167 | ImageNet-1K-C |
| ImageNet-A | Image | 200 | Classification | 7,500 | ImageNet-A |
| Pascal VOC 102 | Image | 20 | Object Detection | 33,043 | Pascal VOC 102 |
| MS-COCO Detection | Image | 91 | Object Detection | 164,062 | MS-COCO Detection |
| DSprites | Image | 737,280*6 | Disentanglement | 737,280 | DSprites |
| Place205 | Image | 205 | Recognition | 2,500,000 | Place205 |
| Pascal Context | Image | 459 | Segmentation | 10,103 | Pascal Context |
| ADE20K | Image | 3,169 | Segmentation | 25,210 | ADE20K |
| Cityscapes | Image | 19 | Segmentation | 5,000 | Cityscapes |
| StreetHazards | Image | 12 | Segmentation | 7,656 | StreetHazards |
| PACS | Image | 7*4 | Domain Classification | 9,991 | PACS |
| BRACS | Medical Image | 7 | Classification | 4,539 | BRACS |
| BACH | Medical Image | 4 | Classification | 400 | BACH |
| CAME-Lyon16 | Medical Image | 2 | Anomaly Detection | 360 | CAME-Lyon16 |
| Chest X-Ray | Medical Image | 2 | Anomaly Detection | 5,856 | Chest X-Ray |
| BCCD | Medical Image | 4,888 | Object Detection | 364 | BCCD |
| TJU600 | Palm-Vein Image | 600 | Classification | 12,000 | TJU600 |
| VERA220 | Palm-Vein Image | 220 | Classification | 2,200 | VERA220 |
| CoNLL2003 | Text | 4 | Classification | 2,302 | CoNLL2003 |
| 20 Newsgroups | Text | 20 | OOD Detection | 20,000 | 20 Newsgroups |
| WOS | Text | 134 | OOD Detection | 46,985 | WOS |
| SST-2 | Text | 2 | Sentiment Understanding | 68,800 | SST-2 |
| Cora | Graph | 7 | Node Classification | 2,708 | Cora |
| Citeseer | Graph | 6 | Node Classification | 3,312 | Citeseer |
| PubMed | Graph | 3 | Node Classification | 19,717 | PubMed |
| BlogCatalog | Graph | 39 | Node Classification | 10,312 | BlogCatalog |
| Google Commands | Speech | 30 | Classification | 65,000 | Google Commands |
| VoxCeleb2 | Speech | 6,112 | Sound Classification | 1,000,000+ | VoxCeleb2 |
| VCTK | Speech | 110 | Enhancement | 44,000 | VCTK |
| ModelNet40 | 3D Point Cloud | 40 | Classification | 12,311 | ModelNet40 |
| ScanObjectNN | 3D Point Cloud | 15 | Classification | 15,000 | ScanObjectNN |
| ShapeNet | 3D Point Cloud | 16 | Recognition, Classification | 16,880 | ShapeNet |
| KITTI360 | 3D Point Cloud | 80,256 | Detection, Segmentation | 14,999 | KITTI360 |
| UCF101 | Video | 101 | Action Recognition | 13,320 | UCF101 |
| Kinetics400 | Video | 400 | Action Recognition | 260,000 | Kinetics400 |
| Airfoil | Tabular | - | Regression | 1,503 | Airfoil |
| NO2 | Tabular | - | Regression | 500 | NO2 |
| Exchange-Rate | Timeseries | - | Regression | 7,409 | Exchange-Rate |
| Electricity | Timeseries | - | Regression | 26,113 | Electricity |

(back to top)

Contribution

Feel free to send pull requests to add more links with the following Markdown format. Note that the abbreviation, the code link, and the figure link are optional attributes.

```markdown
* **TITLE**<br>
*AUTHOR*<br>
PUBLISH'YEAR [[Paper](link)] [[Code](link)]
   <details close>
   <summary>ABBREVIATION Framework</summary>
   <p align="center"><img width="90%" src="https://github.com/Westlake-AI/Awesome-Mixup/raw/main/link_to_image" /></p>
   </details>
```

Citation

If our work has contributed to your research, please consider citing it. Thank you! 🥰

```bibtex
@article{jin2024survey,
  title={A Survey on Mixup Augmentations and Beyond},
  author={Jin, Xin and Zhu, Hongyu and Li, Siyuan and Wang, Zedong and Liu, Zicheng and Yu, Chang and Qin, Huafeng and Li, Stan Z},
  journal={arXiv preprint arXiv:2409.05202},
  year={2024}
}
```

Current contributors include: Siyuan Li (@Lupin1998), Xin Jin (@JinXins), Zicheng Liu (@pone7), and Zedong Wang (@Jacky1128). We thank all contributors to Awesome-Mixup!

(back to top)

License

This project is released under the Apache 2.0 license.

Acknowledgement

This repository is built using the OpenMixup library and Awesome README repository.

Related Project