-
Thank you very much for making your code publicly available! I have a question:
What is llava_med_in_text_60k_ckpt2_delta.zip a checkpoint of?
Also, PMC-Articles is too large to download. Can y…
-
- https://arxiv.org/abs/2104.12763
- 2021
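Multimodal reasoning systems use a pre-trained object detector to extract regions of interest from images.
However, this crucial module is typically used as a black box and is trained independently of the downstream task, on a fixed vocabulary of objects and attributes.
As a result, in such systems, free…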
e4exp updated
3 years ago
-
Following up on https://github.com/pytorch/fairseq/issues/759#issuecomment-589498214, it would be great if Faster-RCNN could be used directly, so we could input images instead of pre-computed features…
-
I've been trying to get an encode to work with FFMPEG, either through a capture card or from a file.
I haven't figured out what's wrong with my settings, or whether the plugin just doesn't play nice yet.
Almost immedi…
-
Hello.
I tried the vision fine-tuning script for the `glm-4v-9b` model. The command I used was `python3 finetune_demo/finetune_vision.py ./data THUDM/glm-4v-9b ./finetune_demo/configs/lora.yaml…
-
How to get answers to the questions in the full graph of DriveLM-CARLA?
-
How to get **novel_vqa_val_known_question.json** and **novel_vqa_val_known_annotation.json**?
-
"When I use LLAVA to generate the corresponding captions, the speed is very slow, taking about one minute to complete the vqa_LLVA and vqa_LLVA_more_face_detail descriptions for a single image."
-
```
The configure line used in current ubuntu and debian packages of ffmpeg-damnvid
makes the binary unredistributable:
./configure --enable-memalign-hack --enable-libxvid --enable-libx264
--enable…
```
-
# Baseline (updn, v2)
I ran the following command:
```shell
CUDA_VISIBLE_DEVICES=0 python main.py --dataset v2 --mode updn --debias none --output v2_updn --seed 0
```
and got the following log:
`…