-
Is the vision_tower used in the code "clip-vit-large-patch14-336"? I can't load the vision_tower when I use the "Gamma-MoD-llava-hr-7b-0.34" checkpoint you provided. Even if I download "clip-vit-large-patch14-336…
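One workaround worth trying (just a sketch, not the authors' procedure) is to download the CLIP tower ahead of time and point the checkpoint's config at the local copy; the `mm_vision_tower` key and the local paths below are assumptions based on LLaVA-style configs, so check the actual config.json first:
```python
# Sketch: pre-fetch openai/clip-vit-large-patch14-336 and rewrite the (assumed)
# vision-tower entry in the checkpoint's config.json to the local directory.
import json
from pathlib import Path

from huggingface_hub import snapshot_download

clip_dir = snapshot_download("openai/clip-vit-large-patch14-336")

ckpt_dir = Path("./Gamma-MoD-llava-hr-7b-0.34")   # hypothetical local checkpoint path
config_path = ckpt_dir / "config.json"

config = json.loads(config_path.read_text())
config["mm_vision_tower"] = clip_dir              # assumed key name; verify it in your config.json
config_path.write_text(json.dumps(config, indent=2))
```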
-
# 🌟 New model addition
We recently proposed OFA, a unified model for multimodal pretraining, which achieves multiple SoTAs on downstream tasks, including image captioning, text-to-image generation, r…
-
### Question
## Motivation
I am trying to replace clip with chinese-clip.
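For reference, transformers already ships Chinese-CLIP classes, so a first attempt can look like the sketch below (the `OFA-Sys/chinese-clip-vit-base-patch16` checkpoint and the example image/captions are placeholders, not the actual setup in this issue):
```python
# Sketch: image-text matching with Chinese-CLIP through the transformers API.
import torch
from PIL import Image
from transformers import ChineseCLIPModel, ChineseCLIPProcessor

ckpt = "OFA-Sys/chinese-clip-vit-base-patch16"   # example checkpoint
model = ChineseCLIPModel.from_pretrained(ckpt)
processor = ChineseCLIPProcessor.from_pretrained(ckpt)

image = Image.open("example.jpg")                # any local image
texts = ["一只猫", "一只狗"]                      # candidate Chinese captions

inputs = processor(text=texts, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    outputs = model(**inputs)

# Higher probability means a better image-text match.
print(outputs.logits_per_image.softmax(dim=-1))
```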
## Environment
```bash
$ uname -a
Linux localhost.localdomain 3.10.0-1160.80.1.el7.x86_64 #1 SMP Tue Nov 8 15:48:59 UTC…
-
### Describe the issue
Issue:
When trying to load `liuhaotian/llava-v1.6-mistral-7b` or `liuhaotian/llava-v1.6-34b` into my container:
```
MODEL_PATH = "liuhaotian/llava-v1.6-mistral-7b"
US…
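# A minimal sketch, assuming the liuhaotian/LLaVA package is installed in the
# container: llava-v1.6 checkpoints are typically loaded through the repo's own
# builder rather than plain Auto classes. Names below come from that repo.
from llava.mm_utils import get_model_name_from_path
from llava.model.builder import load_pretrained_model

tokenizer, model, image_processor, context_len = load_pretrained_model(
    model_path=MODEL_PATH,
    model_base=None,
    model_name=get_model_name_from_path(MODEL_PATH),
)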
-
### Description
Deep Learning Module(s)
### Purpose
Allow using 80% of deep learning modelling via an intuitive and beautiful GUI, just like JASP's ML module.
### Use-case
Someone wh…
-
### Model description
Hi! I'm the author of ["Prismatic VLMs"](https://github.com/TRI-ML/prismatic-vlms), our upcoming ICML paper that introduces and ablates design choices of visually-conditioned …
-
Why were the following actions taken? Is there anything special about cc12m that I missed?
https://github.com/OFA-Sys/OFA/blob/a36b91ce86ff105ac8d9e513aa88f42b85e33479/data/pretrain_data/unify_dataset.…
-
Is there a quantized version available somewhere?
Code:
```
import torch
import requests
from PIL import Image
from transformers import Blip2Processor, Blip2ForConditionalGeneration
processor = Blip2…
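# A minimal sketch, assuming there is no official quantized checkpoint: the fp16
# model can be loaded in 8-bit via bitsandbytes. The "Salesforce/blip2-opt-2.7b"
# id is an example; this needs the bitsandbytes and accelerate packages installed.
processor = Blip2Processor.from_pretrained("Salesforce/blip2-opt-2.7b")
model = Blip2ForConditionalGeneration.from_pretrained(
    "Salesforce/blip2-opt-2.7b",
    load_in_8bit=True,
    device_map="auto",
    torch_dtype=torch.float16,
)

image = Image.open("example.jpg").convert("RGB")  # any local image

inputs = processor(images=image, return_tensors="pt").to(model.device, torch.float16)
out = model.generate(**inputs, max_new_tokens=30)
print(processor.batch_decode(out, skip_special_tokens=True)[0].strip())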
-
Hey Amir, nice job with your work. You just forgot the requirements.txt in the source; I'm having a problem here that may be caused by the installed requirement versions.
The error:
while loading with…
-
## Paper link
https://arxiv.org/abs/2103.00020
## Publication date (yyyy/mm/dd)
2021/01/05
## Overview
The paper on CLIP (Contrastive Language-Image Pre-training), which was also used for reranking in OpenAI's DALL·E.
From text on the Web, special a…
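As a quick reminder of the core training objective (my own minimal sketch in PyTorch, following the pseudocode in the paper rather than any official implementation): each batch of N image-text pairs is scored with a cosine-similarity matrix, and a symmetric cross-entropy pushes the N matching pairs above the mismatched ones.
```python
# Sketch of CLIP's symmetric contrastive loss for one batch of N image-text pairs.
import torch
import torch.nn.functional as F

def clip_loss(image_features: torch.Tensor,
              text_features: torch.Tensor,
              temperature: float = 0.07) -> torch.Tensor:
    # L2-normalize so the dot product becomes a cosine similarity.
    image_features = F.normalize(image_features, dim=-1)
    text_features = F.normalize(text_features, dim=-1)

    # [N, N] similarity matrix, scaled by the temperature.
    logits = image_features @ text_features.t() / temperature

    # The matching text for image i sits in column i.
    labels = torch.arange(logits.size(0), device=logits.device)

    # Symmetric cross-entropy over image->text and text->image directions.
    loss_i = F.cross_entropy(logits, labels)
    loss_t = F.cross_entropy(logits.t(), labels)
    return (loss_i + loss_t) / 2

# Toy usage with random features (batch of 8, embedding dim 512).
print(clip_loss(torch.randn(8, 512), torch.randn(8, 512)).item())
```
In the paper the temperature is a learned logit scale; it is fixed here for brevity.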