huggingface / pytorch-image-models

The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXt, EfficientNet, NFNet, Vision Transformer (ViT), MobileNetV4, MobileNet-V3 & V2, RegNet, DPN, CSPNet, Swin Transformer, MaxViT, CoAtNet, ConvNeXt, and more
https://huggingface.co/docs/timm
Apache License 2.0
30.88k stars 4.65k forks

VAE or VQ-VAE is needed #2056

Open amirshamaei opened 7 months ago

amirshamaei commented 7 months ago

Is your feature request related to a problem? Please describe.
Currently, the timm library lacks implementations for Variational Autoencoder (VAE) and Vector Quantized VAE (VQ-VAE) models. Users who want to use these autoencoder architectures may find it inconvenient to implement them from scratch or to integrate external implementations into their projects.

Describe the solution you'd like
I would like to request the addition of Variational Autoencoder (VAE) and Vector Quantized VAE (VQ-VAE) models to the timm library. This would involve creating modules for these autoencoder architectures that adhere to the existing timm standards for simplicity and compatibility.
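For concreteness, here is a minimal sketch of what such a module could look like, built on a timm backbone as the encoder. The class name, its arguments, and the toy decoder are hypothetical illustrations, not a proposed timm API:

```python
import torch
import torch.nn as nn
import timm

class TimmVAE(nn.Module):
    """Hypothetical VAE wrapper around a timm encoder (sketch only)."""

    def __init__(self, backbone: str = "resnet18", latent_dim: int = 128):
        super().__init__()
        # num_classes=0 makes the timm model return pooled features
        self.encoder = timm.create_model(backbone, pretrained=False, num_classes=0)
        feat_dim = self.encoder.num_features
        self.fc_mu = nn.Linear(feat_dim, latent_dim)
        self.fc_logvar = nn.Linear(feat_dim, latent_dim)
        # Toy decoder: latent vector -> flattened 3x224x224 image
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 512),
            nn.ReLU(),
            nn.Linear(512, 3 * 224 * 224),
        )

    def reparameterize(self, mu, logvar):
        # Sample z ~ N(mu, sigma^2) via the reparameterization trick
        std = torch.exp(0.5 * logvar)
        return mu + std * torch.randn_like(std)

    def forward(self, x):
        h = self.encoder(x)
        mu, logvar = self.fc_mu(h), self.fc_logvar(h)
        z = self.reparameterize(mu, logvar)
        recon = self.decoder(z).view(-1, 3, 224, 224)
        return recon, mu, logvar
```

Training would combine a reconstruction loss with the KL divergence of the approximate posterior against a unit Gaussian; a VQ-VAE variant would replace `reparameterize` with a nearest-neighbor lookup into a learned codebook plus the usual codebook and commitment losses.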

Describe alternatives you've considered
Users can currently implement VAE and VQ-VAE models from scratch or use external implementations from other libraries such as diffusers (see the usage sketch below). However, having native support for these models in the timm library would provide a more streamlined and integrated experience.
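As a concrete example of that alternative, the Hugging Face diffusers library already ships pretrained KL-regularized VAEs (`AutoencoderKL`) and VQ models (`VQModel`). A rough usage sketch; the checkpoint name below is one published example, and the latent shape assumes its default 8x spatial downsampling:

```python
import torch
from diffusers import AutoencoderKL

# Load a pretrained KL-regularized VAE from the Hub
vae = AutoencoderKL.from_pretrained("stabilityai/sd-vae-ft-mse")
vae.eval()

x = torch.randn(1, 3, 256, 256)  # dummy image batch
with torch.no_grad():
    # encode() returns a posterior distribution; draw a latent sample
    z = vae.encode(x).latent_dist.sample()  # expected shape: (1, 4, 32, 32)
    recon = vae.decode(z).sample            # expected shape: (1, 3, 256, 256)
```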

haideraheem commented 6 months ago

@amirshamaei I would love to contribute to this.