-
Hi Hugo,
Thanks for sharing your amazing work. I was trying to train ViT-B/16 from scratch on ImageNet-1k using the hyperparameters reported in your DeiT paper. I'm pretty sure I'm missing something…
-
Hi
I am using your code to train ViT from scratch. I do not understand why you use BCE as the loss function for the upstream task and CE as the loss function for the downstream task. Is this the origin…
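The distinction the question turns on can be illustrated numerically. Below is a minimal sketch in plain NumPy (not the repository's code): CE couples all classes through a softmax and penalizes only the true class, while BCE applies an independent sigmoid per class and scores every class against a one-hot target.

```python
import numpy as np

logits = np.array([2.0, -1.0, 0.5])   # raw scores for 3 classes
target = 1                            # index of the true class

# Cross-entropy: softmax couples the classes, then -log p[target]
probs = np.exp(logits) / np.exp(logits).sum()
ce = -np.log(probs[target])

# Binary cross-entropy (multi-label style): each class gets an
# independent sigmoid; the target becomes a one-hot vector.
onehot = np.eye(len(logits))[target]
sig = 1.0 / (1.0 + np.exp(-logits))
bce = -(onehot * np.log(sig) + (1 - onehot) * np.log(1 - sig)).mean()

print(float(ce), float(bce))
```

With BCE, pushing down the logits of the wrong classes directly lowers the loss even when the true-class logit is fixed; with CE the same effect happens only indirectly through the softmax normalizer.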
-
Could you please share the hyperparameters that yield the best results for models trained from scratch on Resisc45?
Also you state in the paper
> We perform a thorough search for a good trainin…
Riksi updated 3 years ago
-
Just curious - did you try training the same model architecture end-to-end from scratch (i.e. not distilling from VITS), and if so, are there any audio comparison samples available?
-
Hey! First of all, thanks for your contribution! I have looked at multiple ViT implementations and yours seems like the most straightforward, well-organized and simple to use.
I'd like to use your…
-
## Reference
- 2021-01 **[T2T-ViT]** Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet [[Paper](https://arxiv.org/abs/2101.11986)] [[Code](https://github.com/yitu-opensource/…
-
Code snippet that is causing the error:
```
def compile_torch_to_mhlo(model, data):
    print('Compile torch program to mhlo test\n------\n')
    import torch_mlir
    module = torch_mlir.comp…
```
-
### Question
I want to try changing `liuhaotian/llava-v1.5-13b` to use a different image tower instead of `clip-vit-large-patch14`.
1. After changing the vision tower, is it necessary to pretrain …
-
I am using ViT to train on ImageNet-1k from scratch. SOTA accuracy is about 70% to 80%, but I can only reach 30%. I don't know why it doesn't work. I use the following configuration.
``` python
…
```
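One frequent cause of from-scratch ViT training stalling far below SOTA is a missing learning-rate warmup and decay schedule (alongside the strong augmentation and regularization the DeiT recipe relies on). A minimal sketch of linear warmup followed by cosine decay, with illustrative values that are assumptions, not the poster's actual configuration:

```python
import math

def lr_at_step(step, base_lr=1e-3, warmup_steps=500,
               total_steps=10000, min_lr=1e-5):
    """Linear warmup to base_lr, then cosine decay down to min_lr."""
    if step < warmup_steps:
        # ramp linearly from ~0 up to base_lr
        return base_lr * (step + 1) / warmup_steps
    # progress through the cosine phase, in [0, 1]
    t = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return min_lr + 0.5 * (base_lr - min_lr) * (1 + math.cos(math.pi * t))

print(lr_at_step(0))       # tiny LR at the start of warmup
print(lr_at_step(499))     # reaches base_lr at the end of warmup
print(lr_at_step(10000))   # decays to min_lr by the end of training
```

Without warmup, the attention layers often diverge or collapse in the first few hundred steps, which can leave the run stuck at low accuracy for the rest of training.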
-
Hi! Do I understand correctly that you were using pretrained models? If you were, I don't see where they are loaded in your code.