-
# 💻 cs
## 📚 mask (total: 9)
### 📃 Deep Pneumonia: Attention-Based Contrastive Learning for Class-Imbalanced Pneumonia Lesion Recognition in Chest X-rays
- **Authors:** Xinxu Wei, Haohan Bai, Xianshi …
-
If I want to work with multimodal LLMs that take in a set of embeddings from vision/audio encoders, what is the proper way to feed them into an LLM running on exllamav2?
Can I just add a custo…
-
Thank you for releasing the code and providing an in-depth analysis in the paper.
I have the following two questions about reproducing the attack code on the `blip2` model in `LAVIS_tool`.
1.…
-
cur_image_features = image_features[cur_image_idx]
~~~~~~~~~~~~~~^^^^^^^^^^^^^^^
IndexError: index 1 is out of bounds for dimension 0 with size 1
Is there any reason …
-
Dear authors,
First, congratulations on your great work, which we think is a valuable resource for evaluating hallucination in VLMs. We have implemented HallusionBench in [VLMEvalKit](https://githu…
-
### Is there an existing issue / discussion for this?
- [X] I have searched the existing issues / discussions
### Is this question answered in the FAQ? | Is there an existing…
-
Hey there, I am interested in running VQAScore with another VLM, CogVLM (see [here](https://huggingface.co/THUDM/cogvlm-chat-hf)). I was looking at the guidelines on how to adapt to another VQA model …
-
### Your current environment
H100 40GB
### Model Input Dumps
_No response_
### 🐛 Describe the bug
```
docker run -d --restart=always \
--runtime=nvidia \
--gpus '"device=MIG-2ea01c20-8…
-
I'm using the `LanguageModel` class to wrap a vision-language model LLaVA, and during the execution of
```python
with tracer.invoke(inputs):
```
[`nnsight/contexts/Invoker.py#L55`](https://github.…
-
Hello, thank you for your great work!
We are currently exploring the use of RADIO as a vision encoder for vision-language models. In our specific setup, we employ [SigClip](https://huggingfac…