-
Notice that the results in the paper 'Deep Cross-Modal Projection Learning for Image-Text Matching' are {top-1 = 49.37%, top-10 = 79.27%},
while the results in this project are {top-1 = 42.999%, top-10 = 6…
-
I came across an article titled [Learning Placeholders for Open-Set Recognition](http://openaccess.thecvf.com/content/CVPR2021/html/Zhou_Learning_Placeholders_for_Open-Set_Recognition_CVPR_2021_paper.…
-
# 🌟 New model addition
We recently proposed OFA, a unified model for multimodal pretraining, which achieves multiple SoTAs on downstream tasks, including image captioning, text-to-image generation, r…
-
Hi,
Thank you for your great work.
While trying to apply this method to my custom data, I found this line in ucf101.py.
Why do you multiply by 10 here?
Thank you
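For anyone else hitting this before the authors reply: one common reason for a constant multiplier in video-dataset code is to repeat the sample list so that a single epoch iterates the (small) dataset several times, which oversamples it and amortizes per-epoch overhead. Below is a minimal sketch of that idiom with hypothetical names; the actual line in ucf101.py may multiply for a different reason (e.g. ten temporal crops per clip at test time).
```python
from torch.utils.data import Dataset

class VideoClipDataset(Dataset):
    """Hypothetical illustration of the repeat-by-10 idiom;
    not necessarily what the real ucf101.py is doing."""

    def __init__(self, clip_paths, repeat=10):
        # Repeating the sample list makes one epoch cover the
        # small dataset `repeat` times, oversampling it and
        # reducing per-epoch overhead (re-shuffling, dataloader
        # worker restarts).
        self.clip_paths = list(clip_paths) * repeat

    def __len__(self):
        return len(self.clip_paths)

    def __getitem__(self, idx):
        # Real code would load and decode the clip here.
        return self.clip_paths[idx]
```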
-
- Code: https://github.com/wtnthu/FaCoR
- Paper: https://arxiv.org/pdf/2304.04546.pdf
It caught my attention because it is among the best results in the comparison made in Table 6 of #50.
-
Yes, the approach presented in the ConsistentID paper could potentially be rearchitected to find better solutions. Here are a few ideas for improving the architecture and methodology:
**Inte…
-
---------- Forwarded message ---------
From: **Google Scholar Alerts**
Date: Sun, Mar 10,…
-
I use Python 3.6.4, PyTorch 1.0.0, torchvision 0.2.1, and SciPy 1.2.1.
The results in the paper 'Deep Cross-Modal Projection Learning for Image-Text Matching' on CUHK-PEDES are {top-1 = 49.37%, top-10 = 79.2…
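For reference, top-1/top-10 here are standard retrieval accuracies: the fraction of text queries whose correct image appears within the first k ranked gallery items. Below is a minimal sketch of that computation, simplified to one ground-truth match per query; the actual CUHK-PEDES protocol matches at identity level, so any same-identity gallery image counts as a hit.
```python
import numpy as np

def topk_accuracy(similarity, gt_index, ks=(1, 10)):
    """similarity[i, j]: score of query i vs. gallery item j;
    gt_index[i]: index of the correct gallery item for query i."""
    # Sort gallery indices by descending score for each query.
    order = np.argsort(-similarity, axis=1)
    # Rank (0-based position) of the ground truth in each ranking.
    ranks = np.array([int(np.where(order[i] == gt_index[i])[0][0])
                      for i in range(len(gt_index))])
    # A top-k hit means the ground truth is among the first k results.
    return {f"top-{k}": float((ranks < k).mean()) for k in ks}

# Toy usage: 5 text queries scored against 20 gallery images.
rng = np.random.default_rng(0)
sim = rng.standard_normal((5, 20))
gt = np.array([0, 3, 7, 11, 19])
print(topk_accuracy(sim, gt))
```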
-
[Confirmed]
1. Use an English-language shopping website.
2. Exclude the QA part; add it later if needed. (That is, add it during development if it turns out to be necessary, but since the immediate need is ambiguous, let's not write it into the proposal.)
3. Train using the LoRRA model as the base, and if that model's QA ability on the image itself is insufficient, run it alongside other models, mixing…