-
Hi,
I am encountering an issue when running inference on the Llama-3-VILA1.5-8B model. The error message I receive is:
```RuntimeError: FlashAttention only supports Ampere GPUs or newer.```
I…
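For context, FlashAttention-2 requires an NVIDIA GPU with compute capability 8.0 or newer (Ampere onwards), so cards such as the V100 (7.0) or T4 (7.5) trigger this error. Below is a minimal, hypothetical helper (not part of VILA) that encodes that check; on unsupported GPUs, recent Hugging Face `transformers` versions typically let you fall back by passing `attn_implementation="eager"` to `from_pretrained`.

```python
# Hypothetical helper: decide whether FlashAttention can run on a given GPU.
# FlashAttention requires compute capability >= 8.0 (Ampere or newer).

def supports_flash_attention(compute_capability):
    """compute_capability: (major, minor) tuple, e.g. the return value of
    torch.cuda.get_device_capability()."""
    major, _minor = compute_capability
    return major >= 8

# Examples:
#   T4   -> (7, 5) -> False (must fall back to a non-flash attention path)
#   A100 -> (8, 0) -> True
print(supports_flash_attention((7, 5)))  # prints "False"
print(supports_flash_attention((8, 0)))  # prints "True"
```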
-
**code:**
query = 'What does the picture show?'
image_paths = ['/home/downloads/test.jpg']
huatuogpt_vision_model_path = "/home/llm_models/HuatuoGPT-Vision-7B"
from cli import HuatuoChatbot
b…
-
### 🚀 The feature
Implement the CrossViT model for fine-grained classification
### Motivation, pitch
CrossViT integrates multi-scale feature representations, enabling it to efficiently process images o…
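For context, the core of CrossViT's multi-scale fusion is a cross-attention step in which the CLS token of one branch attends to the patch tokens of the other branch. Below is a minimal single-head NumPy sketch of that idea (simplified: it omits the linear projections and residual connections used in the actual paper):

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(cls_token, patch_tokens):
    """CLS token of one branch attends to patch tokens of the other branch.

    cls_token:    (d,)   summary token of branch A
    patch_tokens: (n, d) patch tokens of branch B (a different patch scale)
    Returns a fused (d,) vector: a convex combination of branch B's patches.
    """
    d = cls_token.shape[-1]
    scores = patch_tokens @ cls_token / np.sqrt(d)  # (n,) similarity scores
    weights = softmax(scores, axis=-1)              # (n,) attention weights
    return weights @ patch_tokens                   # (d,) fused token

# Toy example: a 4-dim CLS token fusing information from 3 patch tokens.
rng = np.random.default_rng(0)
fused = cross_attention(rng.standard_normal(4), rng.standard_normal((3, 4)))
print(fused.shape)  # prints "(4,)"
```

In the full model this exchange happens in both directions (small-patch CLS attends to large-patch tokens and vice versa), which is what lets the two scales share information cheaply.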
-
When I run
`bash scripts/video/demo/video_demo.sh ${the path of LLaVA-NeXT-Video-7B-DPO} vicuna_v1 32 2 True ${the path of video}`
I get the error
```
Can't set vocab_size with value 32000 for …
-
features = self.dino_block.forward_features(x.to("cuda"))['x_norm_patchtokens']
File "/root/.cache/torch/hub/facebookresearch_dinov2_main/dinov2/models/vision_transformer.py", line 258, in forward_…
-
Hi @rosinality, hope you are doing well!
I really like your repo, especially the dataloader and augmentation parts for image classification.
I am not primarily working in the vision field, but still I have …
-
Hi~, I have recently been trying to use the llava_onevision model, following the onevision tutorial, which seems pretty straightforward. I ran the program exactly as in the tutorial, with the 0.5b_si model. However, a …
-
In train.py the argument
```python
parser.add_argument('--tuning-mode', default=None, type=str,
                    help='Method of fine-tuning (default: None)')
```
is later passed to `create_mo…
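For reference, here is a minimal, self-contained sketch of how such a flag behaves when parsed with `argparse` (the flag name is taken from the snippet above; the `'lora'` value is just an illustrative example):

```python
import argparse

parser = argparse.ArgumentParser()
# Note: hyphenated flags are exposed with underscores (args.tuning_mode).
parser.add_argument('--tuning-mode', default=None, type=str,
                    help='Method of fine-tuning (default: None)')

args = parser.parse_args(['--tuning-mode', 'lora'])
print(args.tuning_mode)  # prints "lora"

# When the flag is omitted, the attribute is None, so any downstream
# consumer of this value must handle the None case explicitly.
default_args = parser.parse_args([])
print(default_args.tuning_mode)  # prints "None"
```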
-
I encountered an issue when trying to use Vision Transformer-based models such as _vit_base_, _vit_swin_large_, etc. in the PatchCore implementation. I tried to execute this in the Kaggle Notebook enviro…
-
I had a question regarding LoRA support for image classification and segmentation. I understand that LoRA support is available for both as specified in the following tutorials:
https://github.com/hug…