Open Hominnn opened 5 months ago
Hello @Hominnn, the training code `train_nli.py` is outdated; it is recommended to use `angle-trainer` now.
I've updated the NLI document: https://github.com/SeanLee97/AnglE/blob/main/examples/NLI/README.md#41-bert You can find the new training script there.
To run it successfully:
1) upgrade `angle-emb` to the latest version via `python -m pip install -U angle-emb`;
2) use the latest evaluation code: https://github.com/SeanLee97/AnglE/blob/main/examples/NLI/eval_nli.py;
3) if you want to push your model to Hugging Face, set `--push_to_hub 1` and specify a model id in your space via `--hub_model_id xxx`; otherwise, set `--push_to_hub 0`.
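Taken together, the steps above might look like the following shell session. Note that the exact `angle-trainer` flags and the `eval_nli.py` invocation below are assumptions sketched from the linked README, not a verified command line; please consult the README for the full option set:

```shell
# 1) Upgrade angle-emb to the latest release.
python -m pip install -U angle-emb

# 2) Train with the angle-trainer CLI. The model id and flags here are
#    illustrative; see the NLI README linked above for the exact flags.
angle-trainer \
  --model_name_or_path bert-base-uncased \
  --push_to_hub 1 \
  --hub_model_id your-username/your-model-id
# (set --push_to_hub 0 if you do not want to upload the model)

# 3) Evaluate with the latest evaluation script.
python eval_nli.py --model_name_or_path your-username/your-model-id
```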
Here are the intermediate results (after about 9 epochs) of my run:
+-------+-------+-------+-------+-------+--------------+-----------------+-------+
| STS12 | STS13 | STS14 | STS15 | STS16 | STSBenchmark | SICKRelatedness | Avg. |
+-------+-------+-------+-------+-------+--------------+-----------------+-------+
| 75.59 | 84.83 | 80.37 | 86.26 | 81.96 | 85.12 | 80.70 | 82.12 |
+-------+-------+-------+-------+-------+--------------+-----------------+-------+
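For reference, the Avg. column is the plain arithmetic mean of the seven STS scores:

```python
# Average of the seven STS scores reported in the table above.
scores = [75.59, 84.83, 80.37, 86.26, 81.96, 85.12, 80.70]
avg = round(sum(scores) / len(scores), 2)
print(avg)  # 82.12
```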
You can try increasing `epoch`, `ibn_w`, or `gradient_accumulation_steps` for better results.
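For context on why raising `gradient_accumulation_steps` can help: it increases the effective batch size without extra memory by accumulating gradients over several micro-batches before each optimizer step. A minimal framework-free sketch of the idea (this is generic training-loop logic, not AnglE's actual implementation):

```python
# Gradient accumulation: take one optimizer step per `accum_steps`
# micro-batches, so the effective batch size is batch_size * accum_steps.
batch_size = 32
accum_steps = 4
effective_batch = batch_size * accum_steps  # 128

def accumulated_loss(micro_batches, accum_steps):
    """Sum micro-batch losses scaled by 1/accum_steps, as accumulation does."""
    total = 0.0
    for i, batch in enumerate(micro_batches):
        loss = sum(batch) / len(batch)   # stand-in for a real loss value
        total += loss / accum_steps      # scale so gradients average out
        if (i + 1) % accum_steps == 0:
            pass  # optimizer.step(); optimizer.zero_grad() in real code
    return total
```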
I am still training several models with different hyperparameters and will share the better settings once those runs finish.
Thank you for your thorough reply. Looking forward to more of your meaningful work!
Dear author, I want to use the bert-base-uncased model to train on the NLI dataset with your method for some research. Could you provide the relevant training scripts so that I can better reproduce your experimental results? This is my training script, using the same data as your training. I cannot reproduce the evaluation results of your angle-bert-base-uncased-nli-en-v1 model.
This is my evaluation result on STS: