-
Hi, team,
Thanks so much for this great work! I am currently trying to add more trained VL models to this repo, and I have a question about the Dataset Type used in evaluation. In `textvqa.py`, the…
-
## Title: Generalized Out-of-Distribution Detection and Beyond in the Vision-Language Model Era: A Survey
## Link: https://arxiv.org/abs/2407.21794
## Abstract:
Detecting out-of-distribution (OOD) samples is crucial for ensuring the safety of machine learning systems, and it has shaped the field of OOD detection. Meanwhile, anomaly detection (AD), novelty detection (ND), open-set recognition (OSR), and out…
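As a quick illustration of the kind of score such OOD detectors rely on, here is a minimal sketch of the classic maximum-softmax-probability baseline (a generic example, not a method taken from this survey; the logits are made up):

```python
import torch
import torch.nn.functional as F

def msp_ood_score(logits: torch.Tensor) -> torch.Tensor:
    """Maximum softmax probability: lower values suggest the input is more
    likely out-of-distribution for the classifier that produced the logits."""
    return F.softmax(logits, dim=-1).max(dim=-1).values

# Hypothetical logits for two inputs: one confident, one uncertain.
logits = torch.tensor([[8.0, 0.1, 0.2], [1.0, 0.9, 1.1]])
print(msp_ood_score(logits))  # the second (less confident) input scores lower
```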
-
### Motivation
Since v0.4.2, `torchvision` has been introduced into the LMDeploy runtime because it is used in VLMs.
https://github.com/InternLM/lmdeploy/blob/54b7230b4ca08b37b85e5f6e1960e2445dca52…
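For context, a common way to handle this kind of dependency (not necessarily what LMDeploy does) is to import `torchvision` lazily, so text-only deployments never need it; a minimal sketch, with the transform parameters chosen arbitrarily:

```python
from typing import Any

def load_image_transform() -> Any:
    """Return a torchvision-based image transform, or raise a clear error
    if torchvision is missing (e.g. in a text-only deployment)."""
    try:
        # Imported only when VLM preprocessing is actually requested.
        import torchvision.transforms as T
    except ImportError as e:
        raise ImportError(
            "torchvision is required for vision-language models; "
            "install it with `pip install torchvision`."
        ) from e
    return T.Compose([T.Resize(336), T.CenterCrop(336), T.ToTensor()])
```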
-
Hi,
Thank you for your contributions. Your idea for a vision-language pathology model is quite interesting.
However, I noticed that there is another paper presented at CVPR 2024 on a slide-level…
-
File "/root/.local/lib/python3.12/site-packages/vlmeval/dataset/image_mcq.py", line 181, in evaluate
answer_map = {i: c for i, c in zip(meta['index'], meta['answer'])}
…
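For context, that line builds an index→answer lookup from the dataset's meta table; a minimal standalone sketch of the same construction (the column names come from the traceback, the sample data is hypothetical):

```python
import pandas as pd

# Hypothetical meta table mirroring the columns referenced in the traceback.
meta = pd.DataFrame({
    "index": [0, 1, 2],
    "question": ["...", "...", "..."],
    "answer": ["A", "C", "B"],
})

# Same construction as the line in the traceback: map each index to its answer.
answer_map = {i: c for i, c in zip(meta["index"], meta["answer"])}
print(answer_map)  # {0: 'A', 1: 'C', 2: 'B'}

# Note: if the meta TSV lacks an 'answer' column (e.g. a test split without
# ground truth), meta['answer'] raises a KeyError at this point.
```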
-
Is there a plan to release a normalized 4-bit version?
-
Hi,
Thanks for the great work! For my current project, I would like to use the sample-wise evaluation results of the VLMs from the experiments you have conducted.
If you can provide me with the sample…
-
```python
import torch

def normalize(images):
    # CLIP per-channel RGB mean/std; assumes images is a float NCHW tensor.
    mean = torch.tensor([0.48145466, 0.4578275, 0.40821073]).cuda()
    std = torch.tensor([0.26862954, 0.26130258, 0.27577711]).cuda()
    images = (images - mean[None, :, None, None]) / std[None, :, None, None]
    return images
```
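A quick usage check (the batch and resolution here are assumptions, matching the NCHW layout used above):

```python
images = torch.rand(4, 3, 336, 336).cuda()  # hypothetical batch of RGB images in [0, 1]
out = normalize(images)
print(out.shape)  # torch.Size([4, 3, 336, 336])
```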
-
**As a** Data Scientist,
**I want to** document the application of LabelFix and its comparison with Vision Language Models (VLMs) for error detection in classification tasks,
**So that** I can validat…
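For the documentation, a minimal sketch of the generic idea behind model-based label-error detection (an illustration under assumed data and model choices, not LabelFix's or the VLM pipeline's actual implementation):

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict

X, y = load_digits(return_X_y=True)
y_noisy = y.copy()
y_noisy[:20] = (y_noisy[:20] + 1) % 10  # inject some label errors for illustration

# Out-of-fold probabilities, so each sample is scored by a model that never saw it.
proba = cross_val_predict(
    LogisticRegression(max_iter=2000), X, y_noisy, cv=5, method="predict_proba"
)
given_label_conf = proba[np.arange(len(y_noisy)), y_noisy]

# Flag the samples whose annotated label gets the lowest predicted probability.
suspects = np.argsort(given_label_conf)[:20]
print("Flagged indices:", suspects[:10])
print("Flagged samples that are true injected errors:", int(np.sum(suspects < 20)))
```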
-
I found that some VLMs are too sensitive to the prompt. For example, when I use **mlx-community/llava-1.5-7b-4bit**:
The image is:
![image](https://github.com/Blaizzy/mlx-vlm/assets/72635723/1ab52f9b-085a-47…