-
```
Traceback (most recent call last):
  File "/opt/tiger/internlm-xcomposer/finetune/finetune.py", line 311, in <module>
    train()
  File "/opt/tiger/internlm-xcomposer/finetune/finetune.py", line 242, in t…
```
-
Hey amir, nice job with your work. You just forgot the requirements.txt in the source; I'm having a problem here that may be caused by the installed versions of the requirements.
The error:
while loading with…
-
### Description
The [transformer-based image classification model](https://arxiv.org/abs/2010.11929) (ViT) is becoming popular. It would be nice to include it in this repo.
### Expected behavior with the…
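As a rough illustration of what the requested model looks like in use, here is a minimal sketch that leans on timm's reference ViT implementation as a stand-in; the model name and dummy input are my assumptions, not this repo's API.
```python
# Sketch only: timm's ViT-B/16 as a stand-in for the requested model.
import timm
import torch

model = timm.create_model("vit_base_patch16_224", pretrained=True)
model.eval()

x = torch.randn(1, 3, 224, 224)  # dummy RGB batch at ViT-B/16 resolution
with torch.no_grad():
    logits = model(x)  # (1, 1000) ImageNet class logits
print(logits.argmax(dim=-1))
```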
-
### Links
- Paper: https://arxiv.org/abs/2107.04589
- GitHub: -
### One-line summary
- Applies ViT to a GAN and achieves performance comparable to CNN-based GANs.
### Why I chose it
- I found it interesting as the first paper to apply ViT, rather than a CNN, to GANs, so I read it.
- Also…
-
A paper studying why ViT works so well.
[paper](https://arxiv.org/abs/2202.06709)
The commonly assumed reason MSA (multi-head self-attention) helps:
```
Which part of MSA is good for the model?
==> long-range dependency
Does MSA behave like a conv?
==> MSA is a general…
```
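To make the terms concrete, here is a minimal MSA sketch in plain PyTorch (my own illustration, not the paper's code); the N×N attention matrix is what gives every token access to every other token, i.e. the long-range dependency above.
```python
# Minimal multi-head self-attention (MSA); illustrative, not the paper's code.
import torch
import torch.nn as nn

class MSA(nn.Module):
    def __init__(self, dim: int, num_heads: int):
        super().__init__()
        assert dim % num_heads == 0
        self.num_heads = num_heads
        self.head_dim = dim // num_heads
        self.qkv = nn.Linear(dim, dim * 3)
        self.proj = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        B, N, D = x.shape  # (batch, tokens, dim)
        qkv = self.qkv(x).reshape(B, N, 3, self.num_heads, self.head_dim)
        q, k, v = qkv.permute(2, 0, 3, 1, 4)  # each (B, heads, N, head_dim)
        # attn is (B, heads, N, N): every token attends to every other
        # token, which is the "long-range dependency" referred to above.
        attn = (q @ k.transpose(-2, -1)) / self.head_dim ** 0.5
        attn = attn.softmax(dim=-1)
        out = (attn @ v).transpose(1, 2).reshape(B, N, D)
        return self.proj(out)

tokens = torch.randn(2, 197, 768)  # ViT-B: 196 patches + [CLS]
print(MSA(768, 12)(tokens).shape)  # torch.Size([2, 197, 768])
```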
-
https://github.com/kijai/ComfyUI-moondream/blob/b97ad4718821d7cee5eacce139c94c9de51268b8/moondream/vision_encoder.py#L37
The new revision might have a different shape: `2304`.
```
Traceback (most re…
```
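If the failure is the hardcoded width at the linked line, one hypothetical workaround is to size the layer from the checkpoint rather than from a constant. All names below (file and key names) are illustrative, not moondream's actual ones:
```python
# Hypothetical pattern: infer the projection width from the checkpoint so a
# revision shipping 2304-wide weights still loads. Key/file names are made up.
import torch

state_dict = torch.load("vision_encoder.pt", map_location="cpu")
weight = state_dict["projection.weight"]  # assumed key name
out_features, in_features = weight.shape  # e.g. in_features == 2304
proj = torch.nn.Linear(in_features, out_features)
proj.load_state_dict({"weight": weight, "bias": state_dict["projection.bias"]})
```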
-
Hi all,
Thank you for the great contribution. In _B-cos Alignment for Inherently Interpretable CNNs and Vision Transformers_, TABLE 1 shows the classification accuracy of B-cos models with MaxOut a…
-
### The model to consider.
The llava-next-video project has already been released, and the test results are quite good. Are there any plans to support this project?
`https://github.com/LLaVA-VL/LLaV…
-
### Model description
Dear huggingface team,
The FAIR team published an improved version of DINOv2, [Vision Transformers Need Registers](https://arxiv.org/abs/2309.16588). The models and checkpoi…
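If the checkpoints get ported, loading would presumably follow the standard transformers pattern; the hub ID below is hypothetical, shown only to sketch the expected usage:
```python
# Sketch with a hypothetical hub ID, using the standard transformers pattern.
from PIL import Image
import torch
from transformers import AutoImageProcessor, AutoModel

model_id = "facebook/dinov2-with-registers-base"  # hypothetical ID
processor = AutoImageProcessor.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id)

image = Image.new("RGB", (224, 224))  # placeholder image
inputs = processor(images=image, return_tensors="pt")
with torch.no_grad():
    features = model(**inputs).last_hidden_state  # (1, tokens, hidden)
print(features.shape)
```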
-
- https://arxiv.org/abs/2104.14294
- 2021
This paper questions whether self-supervised learning gives Vision Transformers (ViT) distinctive new properties compared to convolutional networks (convnets).
Beyond the fact that self-supervised methods work particularly well when adapted to this architecture, the authors make the following observations:
First…