vision-language-pretraining Search Results

197 results for vision-language-pretraining


mertyg/vision-language-models-are-bows #16

Requirements (e.g. torch versions)

Hi, I tried running the XVLM model (via `get_model(model_name="xvlm-coco", device="cuda", root_dir="./tmp/")` in the notebook). I get the error ``` CUDA error: CUBLAS_STATUS_EXECUTION_FAILED whe…

DianeBouchacourt updated 1 year ago
4
alibaba/AliceMind #73

Fairness of SOTA comparison in mPLUG-2

Hi! I have read the mPLUG-2 paper; it's really a great vision-language foundation model with a fantastic design. **However, I have some doubts about the fairness of the SOTA comparison:** Ac…

Andy1621 updated 1 year ago
2
LinWeizheDragon/Retrieval-Augmented-Visual-Question-Answering #7

About image features

Hello! I am wondering whether you have processed image features for this task before. Do you know how the model performs with image features? Thank you very much!

yao-jz updated 1 year ago
10
amazon-science/semimtr-text-recognition #12

semimtr_finetune without language model use

Is it possible to fine-tune semimtr without using a trained language model? If so, please let me know how. Thank you.

WongVi updated 1 year ago
2
AtsukiOsanai/cv_survey #97

Language Matters: A Weakly Supervised Vision-Language Pre-tr…

# Language Matters: A Weakly Supervised Vision-Language Pre-training Approach for Scene Text Detection and Spotting ## Information - Authors: Chuhui Xue+ - Organization: Nanyang Technological Uni…

AtsukiOsanai updated 1 year ago
1
OFA-Sys/OFA #303

How to control data samples in a batch

Hi! Thank you for your amazing work. I have a question about how to control the number of data samples in a batch. According to your paper, you said that We mix all the pretraining data within…

taokz updated 1 year ago
1
salesforce/ULIP #16

There may be a bug in data/dataset_3d.py

I think there is a bug in `data/dataset_3d.py` when tokenizing the prompts in `data/templates.json`. The specific point is shown below, quoted from the author's implementation. htt…

auniquesun updated 1 year ago
5
Significant-Gravitas/AutoGPT #346

How about letting AutoGPT access a virtual machine like Virtu…

### Duplicates - [X] I have searched the existing issues ### Summary 💡 1. Attach to a VirtualBox instance, give AI a default OS like ubuntu 2. if AI decide to use computer: enter "screenshot-mouse…

artheru updated 1 year ago
26
NorbertZheng/read-papers #51

Sik-Ho Tsang | Review: Vision Transformer (ViT).

Sik-Ho Tsang. [Review: Vision Transformer (ViT)](https://sh-tsang.medium.com/review-vision-transformer-vit-406568603de0). Dosovitskiy A, Beyer L, Kolesnikov A, et al. [An image is worth 16x16 words: …

NorbertZheng updated 1 year ago
11
OFA-Sys/OFA #360

KeyError: 'ema' during inference on VQA

Hi, I pretrained OFA-tiny on a private tsv file containing only VQA samples (or a tsv file containing only captions). For example, `1 000002b66c9c498e what is the danger for an object in the given ima…

jun297 updated 1 year ago
1
