-
The 'Fine-tune the Vision Transformer on CIFAR-10 with PyTorch Lightning' tutorial notebook uses a learning rate of lr=5e-5; it should be lowered to at most 2e-5, and probably closer to 1e-5. The …
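For concreteness, the change is just the value passed when building the optimizer; a minimal sketch with plain PyTorch (the `Linear` stand-in and the choice of AdamW are placeholders, the tutorial builds a full ViT):

```python
import torch

# Hypothetical stand-in for the tutorial's ViT; only the optimizer setup matters here.
model = torch.nn.Linear(768, 10)

# Proposed change: drop the learning rate from 5e-5 to 1e-5 (2e-5 at most).
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

print(optimizer.param_groups[0]["lr"])  # 1e-05
```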
-
### 🐛 Describe the bug
Hello, I've been trying to export the dinov2 vision transformer model to ONNX format and have been getting an error:
```
torch.onnx.errors.UnsupportedOperatorError: Exporti…
```
-
I receive this error when I run this bash command: `!bash LWM/scripts/run_sample_video.sh`. I have followed all the directions listed in the repo.
```
/usr/local/lib/python3.10/dist-packages/hug…
```
-
```
  File "/root/.cache/torch/hub/facebookresearch_dino_main/hubconf.py", line 17, in <module>
    import vision_transformer as vits
  File "/root/.cache/torch/hub/facebookresearch_dino_main/vision_transformer…
```
-
### 🚀 The feature, motivation and pitch
I.e., instead of this:
https://github.com/vllm-project/vllm/blob/main/vllm/entrypoints/openai/serving_chat.py#L138-L140
allow multiple images.
The idea is …
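For reference, the OpenAI chat-completions request format already expresses multiple images as separate `image_url` parts inside one message's `content` list; a sketch of such a request body (the model name and image URLs are placeholders):

```python
# Hypothetical request body; model name and image URLs are placeholders.
request = {
    "model": "some-vision-model",
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Compare these two images."},
                {"type": "image_url", "image_url": {"url": "https://example.com/a.png"}},
                {"type": "image_url", "image_url": {"url": "https://example.com/b.png"}},
            ],
        }
    ],
}

# Counting the image parts the server would need to accept:
n_images = sum(part["type"] == "image_url" for part in request["messages"][0]["content"])
print(n_images)  # → 2
```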
-
Hello,
I'm trying to use this method on a vision transformer model (model = torchvision.models.vit_b_16(); the first several layers are shown in the image below). I read the document, and I think I need to write and use…
-
Hello, by combining the code with your paper, I have the following questions (about vit_csra):
In the code, the class token is not used as input to the last CSRA module, so why set the class tok…
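For reference, my understanding of the CSRA head as a numpy sketch; it only pools the spatial (patch) score maps, which is why I expected the class token not to matter here (`T` is the temperature, `lam` the residual weight; variable names are mine, not the repo's):

```python
import numpy as np

def csra_head(feat, weight, T=1.0, lam=0.1):
    """Minimal class-specific residual attention head (sketch).
    feat:   (d, N) spatial patch features -- no class token involved.
    weight: (C, d) per-class classifier weights.
    """
    scores = weight @ feat                        # (C, N) class-specific score maps
    base = scores.mean(axis=1)                    # global-average-pooling logit
    a = np.exp(T * scores)
    a = a / a.sum(axis=1, keepdims=True)          # spatial softmax per class
    att = (a * scores).sum(axis=1)                # attention-pooled logit
    return base + lam * att                       # residual combination

# Toy usage: 8-dim features over 49 patches, 5 classes.
logits = csra_head(np.random.rand(8, 49), np.random.rand(5, 8))
print(logits.shape)  # (5,)
```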
-
Hi,
Not an actual issue, just wanted to share that I implemented your technique for Vision Transformers.
https://github.com/jacobgil/vit-explain
This includes some tweaks to get this to work for im…
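For anyone curious, the core attention-rollout recipe underlying the repo can be sketched in a few lines of numpy (head averaging plus the residual identity term; the repo adds further tweaks on top of this):

```python
import numpy as np

def attention_rollout(attentions):
    """Attention rollout: fuse per-layer attention maps into one map.
    attentions: list of (heads, tokens, tokens) arrays, one per layer.
    Returns a (tokens, tokens) map of accumulated attention.
    """
    tokens = attentions[0].shape[-1]
    rollout = np.eye(tokens)
    for layer_att in attentions:
        att = layer_att.mean(axis=0)                  # average over heads
        att = att + np.eye(tokens)                    # account for residual connections
        att = att / att.sum(axis=-1, keepdims=True)   # re-normalize rows
        rollout = att @ rollout                       # compose with earlier layers
    return rollout

# Toy usage: 4 layers, 3 heads, 5 tokens.
atts = [np.random.rand(3, 5, 5) for _ in range(4)]
fused = attention_rollout(atts)
print(fused.shape)  # (5, 5)
```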
-
## 🚀 Feature
Please consider adding an RoI head for Vision Transformers, which could be used for action detection with Vision Transformer backbones.
## Motivation
Performance of MViT on the AVA dataset is …
-
Hi, I am working on using vision transformers (not only the vanilla ViT but also other models) on the UMDAA2 dataset, which has an image resolution of 128×128. Would it be better to transform the im…