-
Do you have any plans to support multimodal LLMs, such as MiniGPT-4/MiniGPT v2 (https://github.com/Vision-CAIR/MiniGPT-4/) and LLaVA (https://github.com/haotian-liu/LLaVA/)? That would be a significan…
-
Hello,
Could you provide the vision transformer backbone used for the model?
I am using DINO's vision_transformer.py code for a ViT-giant (https://github.com/facebookresearch/dino/blob/main/visi…
-
As the title says, thanks.
-
Hello,
I am currently working on a project that involves applying the KAdaptation technique, as detailed in the paper "Parameter-efficient Model Adaptation for Vision Transformers", to v…
-
### System Info
Hello TensorRT-LLM team! 👋 I'm facing an issue where the inference output does not contain the expected "Singapore" text. Below are the details of my setup and steps to reproduce the …
-
Hi! مرحبا! السلام عليكم! (Hello! Peace be upon you!)
Let's bring the documentation to all the Arabic-speaking community 🌏 (currently 0 out of 267 complete)
Would you want to translate? Please follow the 🤗 [TRANSLATING guid…
-
Would it be possible to add functionality for **Grad-CAM** or **attention map** similar to those used in DINO?
Thank you!
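For context, DINO's visualizations are built from the self-attention of the [CLS] token; a closely related technique that could back such a feature is attention rollout, which chains per-layer attention maps. Below is a minimal NumPy sketch of that idea — the function name, shapes, and the toy data are illustrative assumptions, not part of any existing repo:

```python
import numpy as np

def attention_rollout(attentions):
    """Chain per-layer attention maps (averaged over heads), adding the
    identity to account for residual connections, as in attention rollout.

    attentions: list of arrays, each of shape (num_heads, tokens, tokens)
    returns: (tokens, tokens) rollout matrix
    """
    tokens = attentions[0].shape[-1]
    rollout = np.eye(tokens)
    for attn in attentions:
        a = attn.mean(axis=0)                  # average over heads
        a = a + np.eye(tokens)                 # model the residual path
        a = a / a.sum(axis=-1, keepdims=True)  # re-normalize rows
        rollout = a @ rollout
    return rollout

# Toy example: 2 layers, 3 heads, 5 tokens (1 [CLS] + 4 patch tokens)
rng = np.random.default_rng(0)
layers = [rng.random((3, 5, 5)) for _ in range(2)]
cls_map = attention_rollout(layers)[0, 1:]  # [CLS] attention over patches
```

The `cls_map` vector would then be reshaped to the patch grid and upsampled to image size to obtain a heatmap, in the spirit of DINO's attention figures.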
-
ModuleNotFoundError: No module named 'models.vision_transformer'
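This error usually means the interpreter cannot see the repository's `models` package on its import path. A hedged workaround sketch (the assumption here is that the script is run from the cloned repo root, which contains a `models/` directory with `vision_transformer.py` inside) is to put that root on `sys.path`:

```python
import os
import sys

# Assumption: the current working directory is the cloned repository root,
# which contains a `models/` package holding vision_transformer.py.
repo_root = os.path.abspath(".")
if repo_root not in sys.path:
    sys.path.insert(0, repo_root)

# After this, `from models import vision_transformer` should resolve,
# provided models/__init__.py exists in the repo.
```

Alternatively, running the entry-point script from the repo root (rather than from a subdirectory) typically avoids the problem entirely.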
-
### Model description
[jinaai/jina-clip-v1](https://huggingface.co/jinaai/jina-clip-v1/tree/main/onnx)
### Prerequisites
- [X] The model is supported in Transformers (i.e., listed [here](https://hu…
-
### Model description
I know the transformers library has not included object tracking models in the past, but this one can either plug into any object detection model or be an end-to-end open world …