tensor size mismatch problem

salesforce / DNNC-few-shot-intent

Discriminative Nearest Neighbor Few-Shot Intent Detection by Transferring Natural Language Inference

MIT License


tensor size mismatch problem #9

Open wangjade54241 opened 12 months ago

wangjade54241 commented 12 months ago

I ran into the tensor size mismatch problem above when running pretrain_dnnc.py. The code seems to apply label smoothing to the target labels. The shape of label_distribution.unsqueeze(0) is torch.Size([1, 2]), but the shape of target_distribution is torch.Size([32, 68, 768]). Could you please check the code?
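For context, label smoothing mixes each one-hot target with a distribution over the classes, so the resulting target has one row per example and one column per class. The sketch below is a generic pure-Python illustration of that idea, not the repository's implementation; the function name `smooth_targets` and the `epsilon` value are hypothetical.

```python
def smooth_targets(labels, label_distribution, epsilon=0.1):
    """Return one smoothed target row per example.

    labels: list of integer class ids (length = batch size)
    label_distribution: distribution over classes (length = num_labels)
    epsilon: smoothing weight (hypothetical value, not taken from the repository)
    """
    num_labels = len(label_distribution)
    targets = []
    for y in labels:
        if not 0 <= y < num_labels:
            raise ValueError(f"label {y} is outside [0, {num_labels})")
        one_hot = [1.0 if k == y else 0.0 for k in range(num_labels)]
        # Mix the one-hot row with the class distribution.
        row = [(1.0 - epsilon) * o + epsilon * p
               for o, p in zip(one_hot, label_distribution)]
        targets.append(row)
    return targets


# Three examples, two classes: the result has 3 rows of 2 values each.
targets = smooth_targets([0, 1, 1], [0.5, 0.5], epsilon=0.1)
print(len(targets), len(targets[0]))
```

With two classes, every target row has length 2, so whatever it is combined with must also end in a dimension of size 2 (or 1).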

jianguoz commented 12 months ago

Hi @wangjade54241, thanks for your interest in the work. I don't run into this issue when running the code.

Are you following the instructions in https://github.com/salesforce/DNNC-few-shot-intent#21-train-and-evaluate-dnnc rather than running pretrain_dnnc.py directly? That is:

python pretrain_dnnc.py \
--train_file_path ./data/nli/all_nli.train.txt \
--dev_file_path ./data/nli/all_nli.dev.txt \
--do_lower_case \
--model_dir_path ./roberta_nli/

With this command, the size of the target distribution is torch.Size([32, 2]) and the size of label_distribution.unsqueeze(0) is torch.Size([1, 2]).

wangjade54241 commented 12 months ago

I got the same error when running:

python pretrain_dnnc.py \
--train_file_path ./data/nli/all_nli.train.txt \
--dev_file_path ./data/nli/all_nli.dev.txt \
--do_lower_case \
--model_dir_path ./roberta_nli/

Could this be related to my using a roberta-base model that I downloaded from Hugging Face for offline use? I could not connect to Hugging Face online.

wangjade54241 commented 12 months ago

I finally found the cause of the error: the code I used to load the model offline did not pass num_labels=self.num_labels, which changed the model's output dimension. The code now runs normally. Thank you for your attention.
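The shapes reported in this thread are consistent with that diagnosis. A shape of [32, 68, 768] looks like batch size × sequence length × hidden size (768 is the hidden size of roberta-base), that is, encoder hidden states rather than two-class logits of shape [32, 2]; this reading is an assumption based only on the numbers above. The sketch below re-implements the NumPy/PyTorch broadcasting rule in pure Python to show why [1, 2] combines with [32, 2] but not with [32, 68, 768]. The helper name `broadcast_shape` is hypothetical.

```python
def broadcast_shape(a, b):
    """Return the broadcast shape of shapes a and b, or raise ValueError.

    Rule (as in NumPy/PyTorch): align the shapes from the right; two sizes
    are compatible if they are equal or if one of them is 1.
    """
    result = []
    for i in range(1, max(len(a), len(b)) + 1):
        da = a[-i] if i <= len(a) else 1  # missing leading dims count as 1
        db = b[-i] if i <= len(b) else 1
        if da != db and da != 1 and db != 1:
            raise ValueError(f"size mismatch at dimension -{i}: {da} vs {db}")
        result.append(db if da == 1 else da)
    return tuple(reversed(result))


label_shape = (1, 2)  # label_distribution.unsqueeze(0)

# Expected case: logits of shape [batch, num_labels].
print(broadcast_shape(label_shape, (32, 2)))

# Reported case: the last dimension is 768 instead of 2.
try:
    broadcast_shape(label_shape, (32, 68, 768))
except ValueError as err:
    print(err)
```

Checking the last dimension of the model output against the number of labels before computing the loss would surface this kind of loading mistake earlier.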

jianguoz commented 12 months ago

Hi @wangjade54241, sorry to hear that you could not connect to Hugging Face online. So far I haven't seen anyone who could not run our code successfully.

There are two possible options:

1. Does it work if you replace https://github.com/salesforce/DNNC-few-shot-intent/blob/master/pretrain_dnnc.py#L24C26-L24C38 with the absolute path of your locally downloaded/saved roberta-base? You can also do this by adding --bert_model <your path>. If you still get the same error, could you provide the full error log as text, starting from the command you ran?
2. If you want to use our pretrained checkpoint, you can download it through the URL in https://github.com/salesforce/DNNC-few-shot-intent#21-train-and-evaluate-dnnc.

jianguoz commented 12 months ago

Finally, I found that the cause of the error was that the model code loaded offline did not use num_labels=self.num_labels, causing the model dimension to change. Now the code can run normally, thank you for your attention.

MARK AS SOLVED: I am glad that you resolved the issue! In the future, you may need to pay attention to the model path if there are connection issues with Hugging Face.

hassyGo commented 12 months ago

Thanks for trying our method!

One note: we mainly used standard terminals to run experiments and tests, rather than IDEs or notebooks.

wangjade54241 commented 12 months ago

2. If you want to use our pretrained checkpoint, you can download it through the URL in https://github.com/salesforce/DNNC-few-shot-intent#21-train-and-evaluate-dnnc.

Thank you~