huggingface / pytorch-image-models

The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (ViT), MobileNetV4, MobileNet-V3 & V2, RegNet, DPN, CSPNet, Swin Transformer, MaxViT, CoAtNet, ConvNeXt, and more
https://huggingface.co/docs/timm
Apache License 2.0

Add GENet, a new GPU-Efficient Network #242

Closed yohann84L closed 3 years ago

yohann84L commented 4 years ago

GENet looks promising in the paper: better or similar results compared to EfficientNet/MobileNet, with faster inference time. PyTorch implementation: https://github.com/idstcv/GPU-Efficient-Networks It is composed of 3 networks:

Paper here: https://arxiv.org/abs/2006.14090

Sotabench: https://sotabench.com/user/EvgeniiZh/repos/Randl/GPU-Efficient-Networks For example, GENet_large achieves results similar to EfficientNet B1 (NoisyStudent) but is 1.9x faster.

Do you plan to implement it? I might give it a try myself if I get some time.

Tell me if I'm wrong, but it could also be a good backbone for EfficientDet?

Thanks again @rwightman for all the work you've done!

rwightman commented 4 years ago

@yohann84L I noticed this network and was interested, but the current ref impl is fairly unusual: https://github.com/idstcv/GPU-Efficient-Networks/blob/master/GENet/GENet_small.txt ... the models are fully defined by strings, with model block and factory code intertwined: https://github.com/idstcv/GPU-Efficient-Networks/blob/master/GENet/__init__.py

It'll be a bit more work than usual to adapt these models into a more familiar form that allows me to add support for feature extraction, etc. Open to help, but I will possibly tackle it sometime in the not too distant future.
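To illustrate the kind of adaptation described above, here is a minimal sketch of turning a string-defined network into structured block records that a conventional model factory could consume. The spec string, the block names, and the `BlockSpec` / `parse_spec` helpers are all hypothetical; they are not copied from the GENet repository or from timm.

```python
import re
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class BlockSpec:
    """One parsed block: a type name plus its integer arguments."""
    kind: str
    args: Tuple[int, ...]


# Hypothetical spec string, only loosely modeled on a string-defined
# network. The block names and argument meanings are assumptions.
SPEC = "ConvK3(3,32,2,1)ResK3K3(32,128,2,1)ResK1K3K1(128,192,2,2)"

# A block is a name followed by a parenthesized, comma-separated arg list.
_BLOCK_RE = re.compile(r"([A-Za-z0-9]+)\(([^)]*)\)")


def parse_spec(spec: str) -> List[BlockSpec]:
    """Split a spec string into BlockSpec records, rejecting stray text."""
    blocks = []
    pos = 0
    for match in _BLOCK_RE.finditer(spec):
        if match.start() != pos:
            raise ValueError(f"unparsed text at offset {pos}")
        kind = match.group(1)
        args = tuple(int(a) for a in match.group(2).split(","))
        blocks.append(BlockSpec(kind, args))
        pos = match.end()
    if pos != len(spec):
        raise ValueError(f"unparsed text at offset {pos}")
    return blocks


if __name__ == "__main__":
    for block in parse_spec(SPEC):
        print(block)
```

Once the definition is held as plain records like these, block construction can live in ordinary module code, which is what makes things such as per-stage feature extraction easier to support.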

ternaus commented 4 years ago

+1 on this request.

hal-314 commented 4 years ago

GENet seems promising. I'm curious to see a mix between GENet and TResnets. Both are GPU-Efficient networks and some of their improvements are orthogonal.

rwightman commented 3 years ago

@yohann84L @ternaus I got around to at least trying these models out today... the model GPU throughputs are impressive for the accuracy numbers, but the default weights have crap OOD scores, which makes their training pipeline a bit suspect. No details yet on how they were trained.

The TResNet-M here is similar top-1 @ 256, fast but not quite as fast, better OOD for included weights

I'll get to this eventually but their model defs are massive PITA to work through so it's not at the top of my list

rwightman commented 3 years ago

It's in #419 ... but FYI, these are essentially ResNet/RegNets but with different blocks per stage, and a bit of monkeying with DW grouping and expansion vs bottleneck for some blocks.
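As a rough sketch of what "different blocks per stage" with varying depthwise grouping and expansion vs. bottleneck could look like as configuration, the following uses a hypothetical `StageCfg` record. The field names and the channel and depth values are illustrative assumptions, not the definitions merged in #419.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class StageCfg:
    """Hypothetical per-stage description of a ResNet-like network."""
    block: str                 # block type, e.g. 'basic' or 'bottle'
    depth: int                 # number of blocks in the stage
    out_chs: int               # output channels of the stage
    stride: int = 2            # stride of the first block in the stage
    group_size: int = 0        # 0 = dense conv, 1 = depthwise conv
    bottle_ratio: float = 1.0  # < 1 bottleneck, > 1 expansion


def mid_channels(cfg: StageCfg) -> int:
    """Width of the block's middle conv, from the bottleneck/expansion ratio."""
    return int(round(cfg.out_chs * cfg.bottle_ratio))


def conv_groups(cfg: StageCfg) -> int:
    """Number of groups for the middle conv (1 means an ordinary dense conv)."""
    if cfg.group_size == 0:
        return 1
    return mid_channels(cfg) // cfg.group_size


# Illustrative stage list: values are assumptions chosen for the example.
STAGES: List[StageCfg] = [
    StageCfg("basic", depth=1, out_chs=128),
    StageCfg("bottle", depth=6, out_chs=640, bottle_ratio=0.25),
    StageCfg("bottle", depth=5, out_chs=640, group_size=1, bottle_ratio=3.0),
]

if __name__ == "__main__":
    for cfg in STAGES:
        print(cfg.block, mid_channels(cfg), conv_groups(cfg))
```

Here the second stage narrows its middle conv (a bottleneck), while the third widens it and uses one channel per group (a depthwise conv with expansion), so a single block implementation can cover both cases by changing only the stage settings.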

yohann84L commented 3 years ago

Thanks a lot @rwightman for all this work!

rwightman commented 3 years ago

Merged