-
# Describe the feature
**Motivation**
There is no implementation of SwinV2 for semantic segmentation.
**Related resources**
The original implementation covers only image classification.
**Addi…
rznas updated
2 years ago
-
### Model description
**The corresponding paper has been accepted by International Journal of Computer Vision (IJCV).**
We present a novel masked image modeling (MIM) approach, context autoencoder…
-
Hello, I have read your paper "Contrastive Learning Rivals Masked Image Modeling in Fine-tuning via Feature Distillation" and find it interesting. I noticed that you said in the abstract:
> The code w…
-
Currently I'm trying to adapt the [tutorial code](https://github.com/NielsRogge/Transformers-Tutorials/blob/master/LayoutLMv3/Fine_tune_LayoutLMv3_on_FUNSD_(HuggingFace_Trainer).ipynb) for LayoutLMv3 …
-
Hello, regarding the CLIP knowledge distillation paper, i.e., A Unified View of Masked Image Modeling:
when the teacher is CLIP vit-large/14 for 196's input resolution, and the student is vit-base/16 for 224's i…
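
For reference, the two resolutions named in the question yield the same patch grid, which is presumably why they are paired. A minimal sketch of the arithmetic, assuming non-overlapping square patches and ignoring the class token (`patch_grid` is a hypothetical helper for illustration, not taken from the paper's code):

```python
def patch_grid(resolution: int, patch_size: int) -> int:
    """Patches per side for non-overlapping square patches (assumption)."""
    if resolution % patch_size != 0:
        raise ValueError("resolution must be divisible by patch_size")
    return resolution // patch_size


# Teacher: ViT-Large/14 at input resolution 196 (values from the question).
teacher_grid = patch_grid(196, 14)
# Student: ViT-Base/16 at input resolution 224 (values from the question).
student_grid = patch_grid(224, 16)

# Patch-token counts, excluding any class token.
teacher_tokens = teacher_grid ** 2
student_tokens = student_grid ** 2

print(teacher_grid, student_grid)      # 14 14
print(teacher_tokens, student_tokens)  # 196 196
```

Under these assumptions both models produce a 14 x 14 grid, so teacher and student patch tokens can be matched one to one without interpolating features.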
-
1. PARE: Part Attention Regressor for 3D Human Body Estimation (2021)
img --> volumetric features (before the global average pooling) --> part branch: estimates attention weights + feature branch: performs S…
-
Masked pixels arise from cosmic ray hits, artifact removal, saturation ...
The question we ask here is: do we prefer to have masked pixels interpolated upstream or do we fix them by modeling in sca…
-
# Vision Transformer Adapter for Dense Predictions
Info.
- ICLR 2023 spotlight
- https://github.com/czczup/ViT-Adapter
- https://arxiv.org/abs/2205.08534
### Summary
- plain ViT
- whi…
-
## Selfie: Self-supervised Pretraining for Image Embedding
- [https://arxiv.org/abs/1906.02940](https://arxiv.org/abs/1906.02940)
- Google Brain team
- Generalizes Masked Language Modeling (MLM) …
-