-
To save GPU memory, I want to load the multilingual model in 4-bit mode; the code is as follows.
```python
import torch
from transformers import AutoTokenizer
from mplug_owl.modeling_mplug_owl impo…
```
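For context on what 4-bit loading buys you: libraries such as bitsandbytes (used by `load_in_4bit=True` in transformers) store each weight in 4 bits plus a shared scale, roughly an 8x reduction versus fp32. Below is a minimal, dependency-free sketch of simple absmax int4 quantization — an illustration of the idea only, not the NF4 scheme bitsandbytes actually uses:

```python
def quantize_absmax_int4(weights):
    """Quantize floats to signed 4-bit ints in [-7, 7] with one per-tensor
    scale, packing two 4-bit values into each byte."""
    scale = max(abs(w) for w in weights) / 7 or 1.0
    q = [max(-7, min(7, round(w / scale))) for w in weights]
    packed = bytearray()
    for i in range(0, len(q), 2):
        lo = q[i] & 0xF
        hi = (q[i + 1] & 0xF) if i + 1 < len(q) else 0
        packed.append(lo | (hi << 4))
    return bytes(packed), scale

def dequantize_absmax_int4(packed, scale, n):
    def sign4(v):  # reinterpret a 4-bit pattern as a signed value
        return v - 16 if v >= 8 else v
    out = []
    for b in packed:
        out.append(sign4(b & 0xF) * scale)
        out.append(sign4(b >> 4) * scale)
    return out[:n]

weights = [0.12, -0.5, 0.33, 0.01, -0.27, 0.44]
packed, scale = quantize_absmax_int4(weights)
restored = dequantize_absmax_int4(packed, scale, len(weights))
# six fp32 weights (24 bytes) now fit in 3 bytes, at the cost of rounding error
print(len(packed), [round(r, 3) for r in restored])
```

Real 4-bit loaders quantize per block (e.g. 64 weights per scale) rather than per tensor, which keeps the rounding error much smaller.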
-
### System Info
- CPU: x86_64
- CPU mem: 64GB
- GPU: V100 SXM2 16GB and Tesla T4 15GB (the issue occurs on both)
- Libraries:
  - TensorRT-LLM commit: https://github.com/NVIDIA/TensorRT-LLM/tree/3c46c2794e7f6df48…
-
### Model description
Align Before Fuse (ALBEF) is a vision-language (VL) model that showed competitive results in numerous VL tasks such as image-text retrieval, visual question answering, visual …
-
The runtime environment is an A100.
The installed PyTorch version is 1.12 and the CUDA version is 11.2.
The command being run is python demo/demo_lazy.py --config-file configs/LVISCOCOCOCOSTUFF_O365_OID_VGR_SA1B_REFCOCO_GQA_PhraseCut_Flickr30k/ape_deta/ape_deta_vitl_eva02_clip_…
-
Hello, I am trying to find the training code, but it seems only inference code is provided.
Could you please point me to the training code?
-
Hi,
When you compute the FLOPs in Table 6 for baseline models such as ViLBERT, do you also include the FLOPs of the feature-extraction models?
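For context, the distinction matters because backbone feature extraction can cost as much as the transformer itself. A rough back-of-the-envelope sketch using the standard matmul count (2·m·k·n FLOPs); all shapes below are illustrative, not ViLBERT's actual configuration:

```python
def linear_flops(rows, in_dim, out_dim):
    # One matrix multiply: 2 * m * k * n multiply-accumulate FLOPs
    return 2 * rows * in_dim * out_dim

def conv2d_flops(out_h, out_w, in_ch, out_ch, k):
    # Each output element costs 2 * in_ch * k * k FLOPs
    return 2 * out_h * out_w * in_ch * out_ch * k * k

# Illustrative numbers only: a 768-dim feed-forward block over 100 tokens
tokens, d, d_ff = 100, 768, 3072
ffn = linear_flops(tokens, d, d_ff) + linear_flops(tokens, d_ff, d)

# versus a single 3x3 conv layer of a CNN backbone at 112x112, 64->64 channels
backbone_layer = conv2d_flops(112, 112, 64, 64, 3)

print(f"FFN block: {ffn / 1e9:.2f} GFLOPs")
print(f"One backbone conv layer: {backbone_layer / 1e9:.2f} GFLOPs")
```

Even this single hypothetical conv layer is on the same order as a transformer feed-forward block, so including or excluding the feature extractor can change reported totals substantially.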
-
Hi!
Let's bring the documentation to all the Spanish-speaking community 🌐
Who would want to translate? Please follow the 🤗 [TRANSLATING guide](https://github.com/huggingface/transformers/blob/m…
-
## Title: From Pixels to Prose: Advancing Multimodal Language Models for Remote Sensing
## Link: https://arxiv.org/abs/2411.05826
## Abstract:
Remote sensing has evolved from simple image acquisition into complex systems capable of integrating and processing visual and textual data. This review surveys the development of multimodal language models (MLLMs) in remote sensing…
-
I would like to request support for converting the BLIP-2 model to ONNX.
I have tried converting the model using the torch.onnx.export method, but there are issues because the input to the forward me…
-
### Model description
# Escaping the Big Data Paradigm with Compact Transformers
Abstract:
> With the rise of Transformers as the standard for language processing, and their advancements in …