huggingface / pytorch-image-models

The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (ViT), MobileNetV4, MobileNet-V3 & V2, RegNet, DPN, CSPNet, Swin Transformer, MaxViT, CoAtNet, ConvNeXt, and more
https://huggingface.co/docs/timm
Apache License 2.0

[FEATURE] BEIT pre-training model #1600

Open lorenzbaraldi opened 1 year ago

lorenzbaraldi commented 1 year ago

Is your feature request related to a problem? Please describe.
There is no problem or bug.

Describe the solution you'd like
I would like a BEiT pre-training pipeline to be implemented, so that I can pre-train the architecture myself.

Describe alternatives you've considered
No

Additional context
No

rwightman commented 1 year ago

@lorenzbaraldi MIM (BEiT and MAE style) support is on the to-do list, but it requires some careful experimentation. Not sure what the timeline is...

lorenzbaraldi commented 1 year ago

@rwightman since I need to work on that for my research, do you think it would be useful to create new training functions (like "train_one_epoch" and "validate") that are specific to the pre-training phase?
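
For illustration only, a pre-training-specific counterpart to "train_one_epoch" might look roughly like the sketch below. It assumes a simplified masked-pixel reconstruction objective (in the spirit of SimMIM/MAE) rather than BEiT's discrete visual-token targets, and it assumes a loader that yields (images, labels) pairs whose labels are ignored. The names `patchify`, `random_patch_mask`, `pretrain_one_epoch`, `ToyMIMModel`, and the `model(images, mask)` interface are all hypothetical and are not part of timm.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def patchify(images, patch_size):
    """Split (B, C, H, W) images into flattened patches (B, N, P*P*C)."""
    b, c, h, w = images.shape
    gh, gw = h // patch_size, w // patch_size
    x = images.reshape(b, c, gh, patch_size, gw, patch_size)
    x = x.permute(0, 2, 4, 3, 5, 1)  # -> B, gh, gw, P, P, C
    return x.reshape(b, gh * gw, patch_size * patch_size * c)


def random_patch_mask(batch_size, num_patches, mask_ratio, device=None):
    """Boolean mask (B, N); True marks a patch hidden from the model."""
    num_masked = int(num_patches * mask_ratio)
    noise = torch.rand(batch_size, num_patches, device=device)
    # argsort twice turns the noise into per-row ranks 0..N-1, so each
    # row gets exactly `num_masked` True entries.
    ranks = noise.argsort(dim=1).argsort(dim=1)
    return ranks < num_masked


class ToyMIMModel(nn.Module):
    """Hypothetical stand-in model; it has no cross-patch mixing, so it
    only exercises the loop and is not a useful encoder."""

    def __init__(self, patch_size=16, in_chans=3, dim=32):
        super().__init__()
        self.patch_size = patch_size
        patch_dim = patch_size * patch_size * in_chans
        self.embed = nn.Linear(patch_dim, dim)
        self.mask_token = nn.Parameter(torch.zeros(dim))
        self.head = nn.Linear(dim, patch_dim)

    def forward(self, images, mask):
        tokens = self.embed(patchify(images, self.patch_size))
        # Replace the embeddings of masked patches with a learned token.
        tokens = torch.where(mask.unsqueeze(-1), self.mask_token, tokens)
        return self.head(tokens)  # (B, N, patch_dim)


def pretrain_one_epoch(model, loader, optimizer, patch_size=16,
                       mask_ratio=0.6, device="cpu"):
    """One epoch of masked-pixel reconstruction; returns the mean loss."""
    model.train()
    total_loss, num_batches = 0.0, 0
    for images, _ in loader:  # labels are unused during pre-training
        images = images.to(device)
        target = patchify(images, patch_size)
        mask = random_patch_mask(
            images.shape[0], target.shape[1], mask_ratio, device=images.device
        )
        pred = model(images, mask)
        # The loss is computed on the masked patches only.
        loss = F.mse_loss(pred[mask], target[mask])
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        total_loss += loss.item()
        num_batches += 1
    return total_loss / max(num_batches, 1)
```

The main differences from a supervised loop in this sketch are that the labels are discarded, a fresh mask is drawn per batch, and the loss is restricted to masked positions. A BEiT-style variant would swap the pixel targets for tokenizer indices and the MSE loss for cross-entropy.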