-
Hi, I am looking for the ConvNeXt-V2-H model fine-tuned with 22k supervision but without the subsequent 1k supervised fine-tuning. I want to fine-tune it on ADE20K to reproduce the result in Table 7 of the paper.
-
@shwoo93 @s9xie hi, did you make a comparison of the effect of image reconstruction between the pretrained V1 and V2 models? And how do they compare with an MAE-pretrained ViT?
-
How can I use GraphCL for fully unsupervised graph clustering?
So far, every method I've found for graph clustering is actually for node clustering, or is not a fully unsupervised learning metho…
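A hedged sketch of one common recipe (not from the GraphCL repo itself): pretrain a GraphCL encoder contrastively, read out one embedding per graph, then cluster those embeddings with k-means. The `graph_embeddings` below are illustrative toy vectors, and a tiny dependency-free k-means is inlined so the sketch is self-contained.

```python
def kmeans(points, k, iters=10):
    """Naive k-means over tuples of floats; returns one cluster label per point."""
    # naive init: spread initial centers evenly across the data
    centers = [points[i * len(points) // k] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # assign each point to the nearest center (squared Euclidean distance)
        labels = [
            min(range(k), key=lambda c: sum((p - q) ** 2 for p, q in zip(pt, centers[c])))
            for pt in points
        ]
        # recompute each center as the mean of its assigned points
        for c in range(k):
            members = [pt for pt, l in zip(points, labels) if l == c]
            if members:
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels

# toy stand-ins for GraphCL graph-level embeddings: two well-separated groups
graph_embeddings = [(0.0, 0.1), (0.1, 0.0), (5.0, 5.1), (5.1, 5.0)]
labels = kmeans(graph_embeddings, k=2)
```

In practice one would replace the toy embeddings with the encoder's pooled graph representations and use a library k-means; the point is only that no labels enter the pipeline at any stage.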
-
Dear authors,
First, congratulations on the paper's acceptance, and thank you for making the model weights available. I read your paper carefully, but I still have questions I hope you …
-
**What's the issue, what's expected?**:
Error when using MS-AMP to do LLM SFT.
MS-AMP DeepSpeed config (each `opt_level` tried separately):
```json
"msamp": {
  "enabled": true,
  "opt_level": "O1|O2|O3",
  "use_te": false
}
```
…
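For context, a minimal sketch of how the `msamp` block nests inside a full DeepSpeed config; the surrounding keys are standard DeepSpeed options, but the values here are illustrative assumptions, not taken from this report:

```json
{
  "train_batch_size": 32,
  "bf16": { "enabled": false },
  "zero_optimization": { "stage": 2 },
  "msamp": {
    "enabled": true,
    "opt_level": "O2",
    "use_te": false
  }
}
```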
-
I followed the README of DeepSpeed-Chat.
training/step1_supervised_finetuning/training_scripts/single_node/run_1.3b.sh
training/step2_reward_model_finetuning/training_scripts/single_node/run_3…
-
**Describe the bug**
I am getting the following error while attempting to run DeepSpeed-Chat step 3 with the actor model CarperAI/openai_summarize_tldr_sft (GPT-J 6B) and critic model CarperAI/openai…
-
We are following the concerns being raised about this study publicly on this forum (#23, #20, #21), on PubPeer (https://pubpeer.com/publications/C8CFF9DB8F11A586CBF9BD53402001), and privately. Mo…
-
## Paper links
- [arXiv](https://arxiv.org/abs/2304.07193)
- [github](https://github.com/facebookresearch/dinov2)
## Publication date (yyyy/mm/dd)
2023/04/14
## Summary
### Research Question
A concise statement of the question the research aims to answer…
-
### Feature request
Allow passing a 2D attention mask in `model.forward`.
### Motivation
With this feature, it would be much easier to avoid cross-context contamination during pretraining and super…
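A hedged sketch of the requested semantics (illustrative, not the actual `transformers` API): given per-token segment ids marking which packed document each token belongs to, the 2D mask lets token `i` attend to token `j` only when both tokens are in the same document and `j` is not in the future (causal LM case). This is what prevents cross-context contamination when multiple documents share one sequence.

```python
def block_causal_mask(segment_ids):
    """Build an n x n 0/1 mask: attend within the same segment, causally."""
    n = len(segment_ids)
    return [
        [1 if segment_ids[j] == segment_ids[i] and j <= i else 0
         for j in range(n)]
        for i in range(n)
    ]

# two documents packed into one row: [doc0, doc0, doc0, doc1, doc1]
mask = block_causal_mask([0, 0, 0, 1, 1])
```

With the 1D mask available today this block-diagonal structure cannot be expressed, which is why the feature request asks for a full 2D mask in `model.forward`.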