wsa-dot opened this issue 2 years ago
Hello, I also get the same result on the STS task, and I don't know the reason either.
The result I got was only 65. I don't know what was wrong.
Hi, Do you find the reason for this result?
Maybe the author is only good at theoretical analysis, and the mixing method may not be truly effective. We need to come up with some new ways to create genuinely useful hard negatives.
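For context, the "mixing" being discussed builds hard negatives by interpolating a positive embedding with a random in-batch negative and re-normalizing. The sketch below is a minimal NumPy illustration of that idea, not the authors' code; the function name and the 2-D toy vectors are hypothetical.

```python
import numpy as np

def mix_hard_negative(h_pos, h_neg, lam=0.2):
    """Mix a positive and a random negative embedding into a harder
    negative, then project back onto the unit sphere.
    (Illustrative MixCSE-style mixing; not the official implementation.)
    """
    mixed = lam * h_pos + (1.0 - lam) * h_neg
    return mixed / np.linalg.norm(mixed)

# Toy 2-D example: the mixed vector lies between the positive and the
# random negative, so it is more similar to the anchor than the random
# negative is -- i.e. it is a "harder" negative.
anchor = np.array([1.0, 0.0])
pos = np.array([0.9, 0.1]); pos /= np.linalg.norm(pos)
neg = np.array([0.0, 1.0])
hard = mix_hard_negative(pos, neg, lam=0.2)
print(anchor @ hard > anchor @ neg)  # mixed negative is closer to the anchor
```

The debate in this thread is essentially whether negatives produced this way are hard enough in practice to beat plain in-batch negatives.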
Yes, the theoretical analysis is very valuable, but then I'm curious how the results in the paper were obtained, because I basically followed the README to reproduce them, yet there is a large gap from the paper's results.
Sorry, I have already seen it. Could you please show your hyperparameters for training?
Hi, thank you for your reply. Here are my hyperparameter settings:

python train.py \
    --model_name_or_path bert-base-uncased \
    --train_file data/wiki1m_for_simcse.txt \
    --eval_path data/sts-dev.tsv \
    --output_dir $MODEL_PATH \
    --num_train_epochs 1 \
    --per_device_train_batch_size 64 \
    --learning_rate 3e-5 \
    --max_seq_length 32 \
    --evaluation_strategy steps \
    --metric_for_best_model stsb_spearman \
    --load_best_model_at_end \
    --eval_steps 125 \
    --pooler_type cls \
    --overwrite_output_dir \
    --temp 0.05 \
    --do_train \
    --do_eval \
    --seed 42 \
    --lambdas 0.6
Hello, thanks for your reminder. I set lambda = 0.2 as in the paper, and then got an average STS = 77.20 with "cls" pooling and a higher STS = 77.90 with "cls_before_pooler". But I think I should follow your README and adopt "cls" pooling, right?
Yeah, I find "cls" pooling is more robust, and the script is updated now. Thank you for your reminder.
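For anyone comparing the two pooler settings above: following the SimCSE convention, "cls_before_pooler" takes the raw [CLS] hidden state, while "cls" additionally passes it through a small MLP head during training. A minimal NumPy sketch of the difference, with hypothetical shapes and randomly initialized weights:

```python
import numpy as np

rng = np.random.default_rng(0)
hidden = rng.normal(size=(4, 8))  # fake per-token hidden states, [CLS] first
W = rng.normal(size=(8, 8))       # hypothetical pooler MLP weights

# "cls_before_pooler": the raw [CLS] vector straight from the encoder.
cls_before_pooler = hidden[0]

# "cls": the [CLS] vector run through an extra dense-plus-tanh head
# (the BERT-style pooler) before it is used in the contrastive loss.
cls = np.tanh(cls_before_pooler @ W)
```

Which one scores higher on STS can differ from run to run, which is why the thread sees 77.90 vs 77.20; the README here recommends "cls".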