-
@dingjiansw101 Hi Jian, thanks for your great work! I am wondering whether you happened to test your trained COCO-Stuff model directly on the ADE20K dataset, because in concurrent works such as [1][2], t…
-
I thought ImageNet-21k is a superset of ImageNet-1k, as written in the ViT paper.
If ImageNet-21k is allowed for pre-training, I assume the evaluation on ImageNet-1k cannot be considered zero-sho…
-
1. Flickr30k and MSCOCO retrieval
2. `section 4.4` Transfer Learning Results
Thanks a lot.
-
### 🚀 The feature, motivation and pitch
Recently, the Maximal Update Parametrization ([muP, arXiv 2203.03466](https://arxiv.org/abs/2203.03466)) has become prevalent in large model training becaus…
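
For readers unfamiliar with muP, below is a minimal sketch of the usual setup using the `mup` package released with the paper (https://github.com/microsoft/mup). The API names follow that repo and are an assumption on my part; this is not this project's code.

```python
# Hypothetical sketch of a muP setup with the `mup` package
# (names follow https://github.com/microsoft/mup, not this project).
import torch.nn as nn
from mup import MuReadout, set_base_shapes, MuAdam

def make_model(width):
    # The output layer must be a MuReadout so it gets muP's 1/width scaling.
    return nn.Sequential(
        nn.Linear(784, width),
        nn.ReLU(),
        MuReadout(width, 10),
    )

model = make_model(width=1024)   # target (wide) model to train
base  = make_model(width=64)     # base model that fixes the "base shapes"
delta = make_model(width=128)    # delta model used to infer which dims scale
set_base_shapes(model, base, delta=delta)

# MuAdam rescales per-parameter learning rates according to muP, so
# hyperparameters tuned on the small base model transfer to the wide model.
optimizer = MuAdam(model.parameters(), lr=1e-3)
```

With this wiring, one can tune the learning rate on the narrow base model and reuse it unchanged at larger widths, which is the main practical payoff of muP.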
-
- meaning of `num_transfer_steps` in https://github.com/hyunjimoon/24_transpo/blob/f47120b11d764bf07b0340f22358f55cfe058041/CP3/analysis/utils.py#L95
one policy update per episode (in Q-learning, the Q matrix…
-
![image](https://user-images.githubusercontent.com/46675408/151471976-80ecc306-5480-4f02-9672-848ca1aec80e.png)
[article](https://openai.com/blog/clip/), [paper](https://arxiv.org/abs/2103.00020), [c…
-
## In a nutshell
Proposes "instruction tuning", which combines the prompting idea used for zero-shot inference with models such as GPT-3 and the pretrain-finetune paradigm. Instruction tuning is a training method that includes a description of the task inside the input text, the intent being that the model learns how to solve the problem from the task description. As a result, zero-shot accuracy is impr…
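
To make "a task description inside the input text" concrete, here is an illustrative template in the spirit of the paper's setup; the wording and function below are hypothetical, not taken from the paper.

```python
# Hypothetical instruction-style template: the task description is folded
# directly into the model's input text (not the paper's exact template).
def to_instruction_example(premise: str, hypothesis: str) -> str:
    return (
        "Read the premise and decide whether the hypothesis follows from it.\n\n"
        f"Premise: {premise}\n"
        f"Hypothesis: {hypothesis}\n\n"
        "Answer with yes, no, or maybe."
    )

print(to_instruction_example(
    "The dog is sleeping on the porch.",
    "An animal is resting.",
))
```

Training on many tasks phrased this way is what lets the model generalize to an unseen task from its instruction alone.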
-
I'd like to ask: did you consider using a learned point-cloud global feature (e.g., from PointNet) with fully connected layers for fine-tuning, and then comparing it against the text features in CLIP? Or did you go directly with the 2D depth-map projection approach because the features learned by CLIP's image encoder and text encoder are aligned? If you did try the former, was it because the results were poor?
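
For clarity, here is a rough sketch of the alternative being asked about: a small fine-tuned head mapping a PointNet-style global feature into CLIP's embedding space, scored against the text features by cosine similarity. All names and dimensions below are hypothetical.

```python
# Hypothetical sketch: fine-tune an MLP head on a frozen point-cloud global
# feature and compare it with CLIP text features via cosine similarity.
import torch
import torch.nn as nn
import torch.nn.functional as F

class PointFeatureHead(nn.Module):
    def __init__(self, point_dim=1024, clip_dim=512):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(point_dim, 512),
            nn.ReLU(),
            nn.Linear(512, clip_dim),
        )

    def forward(self, global_feat):
        # Project the point-cloud global feature into CLIP's embedding space.
        return F.normalize(self.mlp(global_feat), dim=-1)

head = PointFeatureHead()
point_feat = torch.randn(8, 1024)                        # from a frozen PointNet
text_feats = F.normalize(torch.randn(40, 512), dim=-1)   # CLIP text features

logits = head(point_feat) @ text_feats.t()   # cosine-similarity logits
pred = logits.argmax(dim=-1)                 # predicted class per point cloud
```

Whether such a head can match the projection route likely hinges on how much data is available to re-align the point features with CLIP's text space, which I suspect is the crux of the question.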
-
May I ask whether this implementation of the model has been experimented with on Mel spectrograms? I used a
Transformer model with only convolutional positional encoding added at the beginning and got discontinuou…
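
For reference, a minimal sketch of what "convolutional positional encoding at the beginning" could look like on Mel-spectrogram frames, in the style of wav2vec 2.0's grouped convolution; this is my reading of the setup, not the questioner's actual code.

```python
# Hypothetical sketch: convolutional positional encoding over Mel frames,
# applied before a standard Transformer encoder.
import torch
import torch.nn as nn

class ConvPositionalEncoding(nn.Module):
    def __init__(self, dim=80, kernel_size=31):
        super().__init__()
        # Grouped conv over time injects relative position information.
        self.conv = nn.Conv1d(dim, dim, kernel_size,
                              padding=kernel_size // 2, groups=16)
        self.act = nn.GELU()

    def forward(self, x):                      # x: (batch, frames, n_mels)
        pos = self.act(self.conv(x.transpose(1, 2))).transpose(1, 2)
        return x + pos                         # add positional signal

mel = torch.randn(4, 200, 80)                  # (batch, frames, Mel bins)
layer = nn.TransformerEncoderLayer(d_model=80, nhead=8, batch_first=True)
encoder = nn.TransformerEncoder(layer, num_layers=2)
out = encoder(ConvPositionalEncoding()(mel))   # (4, 200, 80)
```

If the outputs look discontinuous, the convolution's receptive field (kernel size and number of layers) is one knob worth checking, since it bounds how far positional information propagates.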
-
Building multilingual models (zero-shot, transfer learning, etc.) takes time.
So, in the meantime, as stated in #2, we could machine-translate FAQs from English into other languages and add them t…
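
As one concrete way to do this, here is a sketch using an off-the-shelf MarianMT model from Hugging Face; the model choice and workflow are assumptions on my part, not something specified in #2.

```python
# Hypothetical sketch: machine-translate English FAQ entries with an
# off-the-shelf MarianMT checkpoint (model choice is an assumption).
from transformers import MarianMTModel, MarianTokenizer

model_name = "Helsinki-NLP/opus-mt-en-de"      # English -> German
tokenizer = MarianTokenizer.from_pretrained(model_name)
model = MarianMTModel.from_pretrained(model_name)

faqs_en = ["How do I reset my password?", "Where can I report a bug?"]
batch = tokenizer(faqs_en, return_tensors="pt", padding=True)
outputs = model.generate(**batch)
faqs_de = tokenizer.batch_decode(outputs, skip_special_tokens=True)
print(faqs_de)
```

The same loop can be repeated per target language by swapping in the corresponding `opus-mt-en-*` checkpoint, ideally with a human review pass over the output before publishing.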