vqa Search Results - Githubissues

1000+ results
for vqa

microsoft/LLaVA-Med #25

confusion about the provided model download

Thank you very much for making your code publicly available! But I have a question: what is llava_med_in_text_60k_ckpt2_delta.zip the checkpoint of? Also, the PMC-Articles dataset is too large to download. Can y…

WindMarx updated 3 weeks ago
13

e4exp/paper_manager_abstract #427

MDETR -- Modulated Detection for End-to-End Multi-Modal Unde…

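- https://arxiv.org/abs/2104.12763 - 2021 Multi-modal reasoning systems use a pre-trained object detector to extract regions of interest from the image. However, this crucial module is typically used as a black box and is trained independently of the downstream task, on a fixed vocabulary of objects and attributes. Therefore, in such systems, free…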

e4exp updated 3 years ago
3

krasserm/fairseq-image-captioning #9

Use Faster-RCNN directly

Following up on https://github.com/pytorch/fairseq/issues/759#issuecomment-589498214, it would be great if Faster-RCNN could be used directly, so we could input images instead of pre-computed features…

adrelino updated 4 years ago
7

OpenVisualCloud/SVT-VP9 #115

Encoding skipping frames, incorrect PTS/DTS [FFMPEG]

I've been trying to get an encode to work on FFMPEG, either through a capture card or a file. I haven't figured out what's wrong with my settings or whether the plugin just doesn't play nice yet. Almost immedi…

bentotom updated 4 years ago
20

THUDM/GLM-4 #281

CUDA running out of memory for a very small dataset (7 sampl…

Hello. I tried the vision fine-tuning script for the `glm-4v-9b` model. The command I used was `python3 finetune_demo/finetune_vision.py ./data THUDM/glm-4v-9b ./finetune_demo/configs/lora.yaml…

theharshithh updated 2 months ago
21

OpenDriveLab/DriveLM #21

Can you provide more information about DriveLM-CARLA?

How can I get the answers to the questions in the full graph of DriveLM-CARLA?

hutchinsonian updated 2 months ago
3

szzexpoi/POEM #5

About dataset

How can I get **novel_vqa_val_known_question.json** and **novel_vqa_val_known_annotation.json**?

ProudZhangXiaolong updated 4 months ago
13

JackAILab/ConsistentID #42

vqa_LLVA and vqa_LLVA_more_face_detail

When I use LLaVA to generate the corresponding captions, it is very slow: producing the vqa_LLVA and vqa_LLVA_more_face_detail descriptions for a single image takes about one minute.

gaoyixuan111 updated 4 months ago
3

gordenbrown51/damnvid #101

current configure line makes ffmpeg unredistributable

The configure line used in the current Ubuntu and Debian packages of ffmpeg-damnvid makes the binary unredistributable: ./configure --enable-memalign-hack --enable-libxvid --enable-libx264 --enable…

GoogleCodeExporter updated 9 years ago
5

ItemZheng/KDDAug #4

Why does KDDAug make baseline worse

Baseline (updn, v2): I ran the following command: `CUDA_VISIBLE_DEVICES=0 python main.py --dataset v2 --mode updn --debias none --output v2_updn --seed 0` and got the following log: `…

zhongshsh updated 1 year ago
4
