-
Hi,
I have worked out how to use the different data loaders to compare images and videos separately, but I cannot work out a way to do both simultaneously. It seems that with ModalityType.VISION you can only load and t…
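For anyone hitting the same wall, a minimal sketch (assuming only the public ImageBind API, not an answer from the maintainers): since the image and video loaders both feed the same `ModalityType.VISION` key, they cannot share a single inputs dict; one workaround is to embed them in two separate forward passes and compare in the shared embedding space.

```python
import torch
from imagebind import data
from imagebind.models import imagebind_model
from imagebind.models.imagebind_model import ModalityType

device = "cuda" if torch.cuda.is_available() else "cpu"
model = imagebind_model.imagebind_huge(pretrained=True).eval().to(device)

image_paths = ["image1.jpg"]  # hypothetical example paths
video_paths = ["video1.mp4"]

with torch.no_grad():
    # Two passes: both loaders target ModalityType.VISION, so a single
    # inputs dict would let one overwrite the other.
    img_emb = model({ModalityType.VISION: data.load_and_transform_vision_data(image_paths, device)})[ModalityType.VISION]
    vid_emb = model({ModalityType.VISION: data.load_and_transform_video_data(video_paths, device)})[ModalityType.VISION]

# Cosine similarity between every image and every video in the joint space.
sims = torch.nn.functional.cosine_similarity(
    img_emb.unsqueeze(1), vid_emb.unsqueeze(0), dim=-1
)
print(sims)
```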
-
[sound-spaces](https://github.com/facebookresearch/sound-spaces)
[Project: RLR-Audio-Propagation](https://github.com/facebookresearch/rlr-audio-propagation)
[Audio Sensor](https://github.com/f…
-
## Problem statement
1. The way CLIP variants learn the image-text relationship is inefficient, at both training and inference time, for learning the relation between each text token and each image patch -> let's look for a method that enables finer-level alignment (see the sketch after this list).
2. Weaknesses of prior work that uses attention between image patches and text tokens …
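A minimal sketch of what finer-level alignment could mean here (my assumption, in the spirit of FILIP's token-wise max similarity, not code from the paper): for each text token, take its maximum similarity over all image patches, then average over tokens.

```python
import torch

def token_patch_similarity(text_tokens: torch.Tensor, image_patches: torch.Tensor) -> torch.Tensor:
    """Fine-grained image-text score.

    text_tokens:   (B, Nt, D) L2-normalized token embeddings
    image_patches: (B, Np, D) L2-normalized patch embeddings
    """
    # Pairwise token-patch cosine similarities: (B, Nt, Np).
    sim = torch.einsum("btd,bpd->btp", text_tokens, image_patches)
    # Each token keeps its best-matching patch; average over tokens -> (B,).
    return sim.max(dim=-1).values.mean(dim=-1)
```

Unlike a single global embedding dot product, this scores every token against every patch, which is exactly where the training/inference efficiency concern in point 1 comes from.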
-
Hello!
Thank you for your great work; I have the following question:
I have my own Elyza7B checkpoint that I want to finetune on a VQA task. If I follow the LLaVA training scheme closely, I think we n…
-
# 🌟 New model addition
We recently proposed OFA, a unified model for multimodal pretraining, which achieves multiple SoTAs on downstream tasks, including image captioning, text-to-image generation, r…
-
### Question
I have two questions.
1. I followed the instructions in scripts/v1.5 to pre-train and fine-tune the model. After pre-training I get mm_projector.bin, and after fine-tuning I get adap…
-
# prompt
Calibrate Before Use: Improving Few-Shot Performance of Language Models (https://arxiv.org/abs/2102.09690)
The Power of Scale for Parameter-Efficient Prompt Tuning (https://arxiv.org/abs/2104.08691)
Do Prompt-Based Models Really Underst…
-
## Innovation happens disproportionately at the periphery of scientific networks
* paper: Innovations are disproportionately likely in the periphery of a scientific network [[paper link](https://link.springer.com/article/10.1007%2Fs12064-021-00359-1)]
* Chinese-language write-up: …
-
Hi, could you please tell me how DeepRT+ does the calibration using a certain ratio of the test-group peptides after pretraining on the big data?
Thanks.
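Not the authors, but a hedged sketch of what such a calibration could look like (an assumption on my part, not DeepRT+'s documented procedure): fit a linear map from predicted to observed retention times on a small ratio of the test-group peptides, then apply it to all predictions.

```python
import numpy as np

def calibrate_rt(pred_rt: np.ndarray, obs_rt: np.ndarray, ratio: float = 0.1, seed: int = 0) -> np.ndarray:
    """Linearly calibrate predicted retention times (hypothetical helper).

    pred_rt: model predictions for all test-group peptides
    obs_rt:  observed retention times (only the sampled subset is used)
    ratio:   fraction of peptides used to fit the calibration
    """
    rng = np.random.default_rng(seed)
    n_cal = max(2, int(ratio * len(pred_rt)))  # need >= 2 points to fit a line
    idx = rng.choice(len(pred_rt), size=n_cal, replace=False)
    slope, intercept = np.polyfit(pred_rt[idx], obs_rt[idx], deg=1)
    return slope * pred_rt + intercept  # calibrated predictions
```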
-
We need to convert keras.io examples to work with Keras 3.
This involves two stages:
## Stage 1: tf.keras backwards compatibility check
Keras 3 is intended as a drop-in replacement for tf.ker…
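A minimal sketch of what the Stage 1 check amounts to (assuming only the public Keras 3 API): the example should run unchanged once `from tensorflow import keras` is swapped for plain `import keras` with the TensorFlow backend selected.

```python
import os
os.environ["KERAS_BACKEND"] = "tensorflow"  # must be set before importing keras

import numpy as np
import keras  # was: from tensorflow import keras

# Toy stand-in for a keras.io example: build, compile, and fit a tiny model.
model = keras.Sequential([
    keras.Input(shape=(4,)),
    keras.layers.Dense(8, activation="relu"),
    keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")
model.fit(np.random.rand(32, 4), np.random.rand(32, 1), epochs=1, verbose=0)
print("ran under Keras", keras.__version__)
```

If an example fails this check, it needs fixes before moving on to the second stage.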