-
Hi,
I have worked out how to use the different data loaders to compare images and videos separately, but I cannot work out a way to do both simultaneously. It seems that with ModalityType.VISION you can only load and t…
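For anyone hitting the same wall, a minimal sketch (assuming only the public ImageBind API, not an answer from the maintainers): since the image and video loaders both feed the same `ModalityType.VISION` key, they cannot share a single inputs dict; one workaround is to embed them in two separate forward passes and compare in the shared embedding space.

```python
import torch
from imagebind import data
from imagebind.models import imagebind_model
from imagebind.models.imagebind_model import ModalityType

device = "cuda" if torch.cuda.is_available() else "cpu"
model = imagebind_model.imagebind_huge(pretrained=True).eval().to(device)

image_paths = ["image1.jpg"]  # hypothetical example paths
video_paths = ["video1.mp4"]

with torch.no_grad():
    # Two passes: both loaders target ModalityType.VISION, so a single
    # inputs dict would let one overwrite the other.
    img_emb = model({ModalityType.VISION: data.load_and_transform_vision_data(image_paths, device)})[ModalityType.VISION]
    vid_emb = model({ModalityType.VISION: data.load_and_transform_video_data(video_paths, device)})[ModalityType.VISION]

# Cosine similarity between every image and every video in the joint space.
sims = torch.nn.functional.cosine_similarity(
    img_emb.unsqueeze(1), vid_emb.unsqueeze(0), dim=-1
)
print(sims)
```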
-
[sound-spaces](https://github.com/facebookresearch/sound-spaces)
[Project: RLR-Audio-Propagation](https://github.com/facebookresearch/rlr-audio-propagation)
[Audio Sensor](https://github.com/f…
-
## Problem statement
1. The way CLIP variants learn the image-text relationship is inefficient, at both training and inference time, for learning the relation between each text token and each image patch -> let's look for a method that enables finer-level alignment (see the sketch after this list).
2. Weaknesses of prior work that uses attention between image patches and text tokens …
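A minimal sketch of what finer-level alignment could mean here (my assumption, in the spirit of FILIP's token-wise max similarity, not code from the paper): for each text token, take its maximum similarity over all image patches, then average over tokens.

```python
import torch

def token_patch_similarity(text_tokens: torch.Tensor, image_patches: torch.Tensor) -> torch.Tensor:
    """Fine-grained image-text score.

    text_tokens:   (B, Nt, D) L2-normalized token embeddings
    image_patches: (B, Np, D) L2-normalized patch embeddings
    """
    # Pairwise token-patch cosine similarities: (B, Nt, Np).
    sim = torch.einsum("btd,bpd->btp", text_tokens, image_patches)
    # Each token keeps its best-matching patch; average over tokens -> (B,).
    return sim.max(dim=-1).values.mean(dim=-1)
```

Unlike a single global embedding dot product, this scores every token against every patch, which is exactly where the training/inference efficiency concern in point 1 comes from.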
-
Hello!
Thank you for your great work; I have the following question:
I have my own Elyza7B checkpoint that I want to finetune on a VQA task. If I follow the LLaVA training scheme closely, I think we n…
-
# 🌟 New model addition
We recently proposed OFA, a unified model for multimodal pretraining, which achieves multiple SoTAs on downstream tasks, including image captioning, text-to-image generation, r…
-
### Question
I have two questions.
1. I followed the instructions in scripts/v1.5 to pre-train and fine-tune the model. After pre-training I get mm_projector.bin, and after fine-tuning I get adap…
-
# prompt
Calibrate Before Use: Improving Few-Shot Performance of Language Models (https://arxiv.org/abs/2102.09690)
The Power of Scale for Parameter-Efficient Prompt Tuning (https://arxiv.org/abs/2104.08691)
Do Prompt-Based Models Really Underst…
-
## Innovation happens disproportionately at the periphery of scientific networks
* paper: Innovations are disproportionately likely in the periphery of a scientific network [[paper link](https://link.springer.com/article/10.1007%2Fs12064-021-00359-1)]
* Chinese-language write-up: …
-
Hi, could you please tell me how DeepRT+ does the calibration using a certain ratio of the test-group peptides after pretraining on the big data?
Thanks.
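Not the authors, but a hedged sketch of what such a calibration could look like (an assumption on my part, not DeepRT+'s documented procedure): fit a linear map from predicted to observed retention times on a small ratio of the test-group peptides, then apply it to all predictions.

```python
import numpy as np

def calibrate_rt(pred_rt: np.ndarray, obs_rt: np.ndarray, ratio: float = 0.1, seed: int = 0) -> np.ndarray:
    """Linearly calibrate predicted retention times (hypothetical helper).

    pred_rt: model predictions for all test-group peptides
    obs_rt:  observed retention times (only the sampled subset is used)
    ratio:   fraction of peptides used to fit the calibration
    """
    rng = np.random.default_rng(seed)
    n_cal = max(2, int(ratio * len(pred_rt)))  # need >= 2 points to fit a line
    idx = rng.choice(len(pred_rt), size=n_cal, replace=False)
    slope, intercept = np.polyfit(pred_rt[idx], obs_rt[idx], deg=1)
    return slope * pred_rt + intercept  # calibrated predictions
```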
-
We need to convert keras.io examples to work with Keras 3.
This involves two stages:
## Stage 1: tf.keras backwards compatibility check
Keras 3 is intended as a drop-in replacement for tf.ker…
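A minimal sketch of what the Stage 1 check amounts to (assuming only the public Keras 3 API): the example should run unchanged once `from tensorflow import keras` is swapped for plain `import keras` with the TensorFlow backend selected.

```python
import os
os.environ["KERAS_BACKEND"] = "tensorflow"  # must be set before importing keras

import numpy as np
import keras  # was: from tensorflow import keras

# Toy stand-in for a keras.io example: build, compile, and fit a tiny model.
model = keras.Sequential([
    keras.Input(shape=(4,)),
    keras.layers.Dense(8, activation="relu"),
    keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")
model.fit(np.random.rand(32, 4), np.random.rand(32, 1), epochs=1, verbose=0)
print("ran under Keras", keras.__version__)
```

If an example fails this check, it needs fixes before moving on to the second stage.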