jackroos / VL-BERT

Code for ICLR 2020 paper "VL-BERT: Pre-training of Generic Visual-Linguistic Representations".
MIT License

Can the VL-BERT model be applied to metric learning? #62

Closed: zhoulukuan closed this issue 3 years ago

zhoulukuan commented 4 years ago

I want to apply VL-BERT to a multi-modal problem (visual-linguistic data). Since the data comes as pairs and it is difficult to assign the samples to categories, I think metric learning may be more suitable than classification. How can VL-BERT be applied in this situation?

jackroos commented 3 years ago

What do you mean by "apply VL-BERT to metric learning"? Do you mean fine-tuning pre-trained VL-BERT on some VL metric learning task, or pre-training VL-BERT using metric learning methods?

zhoulukuan commented 3 years ago

Fine-tuning pre-trained VL-BERT on some VL metric learning task. Because my dataset is relatively small, I think the pre-trained model can provide some useful information. But I am not sure whether it can be applied to a metric learning task.

jackroos commented 3 years ago

Yes. VL-BERT can be applied to any task whose input is an image-text pair.
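
For readers wondering what a metric-learning objective over image-text pairs might look like, here is a minimal sketch in plain Python. It is not taken from the VL-BERT repository. It assumes some scoring head (hypothetical, not shown) maps the model's joint representation of an (image, text) pair to a scalar matching score, and it uses a margin ranking (hinge) loss as one possible choice of objective. The function name `margin_ranking_loss`, the margin value, and the example scores are all illustrative assumptions.

```python
from typing import Sequence


def margin_ranking_loss(
    pos_scores: Sequence[float],
    neg_scores: Sequence[float],
    margin: float = 0.2,
) -> float:
    """Mean hinge loss max(0, margin - (pos - neg)) over aligned score pairs.

    pos_scores[i] is the score of a matched (image, text) pair and
    neg_scores[i] is the score of a mismatched pair built from the same anchor.
    """
    if len(pos_scores) != len(neg_scores):
        raise ValueError("pos_scores and neg_scores must have the same length")
    if not pos_scores:
        raise ValueError("at least one pair of scores is required")
    # A pair contributes zero loss once the matched score beats the
    # mismatched score by at least `margin`.
    losses = [max(0.0, margin - (p - n)) for p, n in zip(pos_scores, neg_scores)]
    return sum(losses) / len(losses)


# Hypothetical scores standing in for the output of a scoring head
# (an assumption for illustration, not an actual VL-BERT API).
pos = [0.9, 0.4]  # matched (image, text) pairs
neg = [0.1, 0.5]  # mismatched pairs for the same images
loss = margin_ranking_loss(pos, neg, margin=0.2)
```

In an actual fine-tuning setup the same idea would be expressed with differentiable tensors so that gradients flow back into the pre-trained model; mismatched pairs are commonly formed by pairing each image with the text of another sample in the batch.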