Closed MingyuKim-2933 closed 11 months ago
Hi,
Thanks for asking! A quick question: the repo offers two versions of LOVE. Did you use the 768-dimensional model link?
Yes, I did the SST2 task using the 768-dimensional model you provided.
I will do the reproduction myself; it may take some time.
I appreciate your big support!!
Hi,
I added some files for the reproduction of Table 4. Have a look here: LOVE/extrinsic/bert_text_classification.
The data directory has all the data used in this experiment, including the samples with typos. data/vocab.txt contains all words in this experiment, including typos. You need to download the prepared word embeddings generated by LOVE (love.emb) and put the file in the data directory.
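If you want to double-check the setup before training, a small sanity check like the one below might help. It is only a sketch and not part of the repo: it assumes love.emb is a plain-text file with one word per line followed by its vector values (separated by whitespace), and that data/vocab.txt lists one word per line. The function names are made up for illustration.

```python
def parse_embedding_words(emb_lines):
    """Collect the first token of every line that looks like 'word v1 v2 ...'.

    Assumption: the embedding file is whitespace-separated text. Lines with
    fewer than three tokens (e.g. an optional 'count dim' header) are skipped.
    """
    words = set()
    for line in emb_lines:
        parts = line.split()
        if len(parts) >= 3:
            words.add(parts[0])
    return words


def missing_words(vocab_lines, emb_lines):
    """Return vocabulary words (in order) that have no embedding line."""
    vocab = [w.strip() for w in vocab_lines if w.strip()]
    covered = parse_embedding_words(emb_lines)
    return [w for w in vocab if w not in covered]


# Toy in-memory example; with real files you would pass open file handles,
# e.g. open("data/vocab.txt", encoding="utf-8") and
#      open("data/love.emb", encoding="utf-8").
toy_vocab = ["good\n", "goood\n", "movie\n"]
toy_emb = ["good 0.1 0.2 0.3\n", "movie 0.4 0.5 0.6\n"]
print(missing_words(toy_vocab, toy_emb))
```

An empty list would mean every word in the vocabulary, typos included, has a vector. If the real file format differs from the assumption above, the parsing function needs to be adapted.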
To reproduce scores for the original BERT:
python bert_plus_love.py --use_love False
To reproduce scores for BERT + LOVE:
python bert_plus_love.py --use_love True
We train the model with five different learning rates and record the results on the corresponding test sets.
This is the average accuracy (%) over five runs:

| Model / typo rate | 0 | 10 | 20 | 30 | 40 | 50 | 60 | 70 | 80 | 90 |
|---|---|---|---|---|---|---|---|---|---|---|
| BERT | 91.3 | 90.4 | 87.3 | 84.3 | 81.5 | 77.0 | 73.7 | 69.8 | 64.8 | 58.7 |
| BERT+LOVE | 91.0 | 89.6 | 87.4 | 85.1 | 83.0 | 79.4 | 75.5 | 71.6 | 68.0 | 61.3 |

This is the max accuracy (%) among five runs:

| Model / typo rate | 0 | 10 | 20 | 30 | 40 | 50 | 60 | 70 | 80 | 90 |
|---|---|---|---|---|---|---|---|---|---|---|
| BERT | 91.6 | 90.9 | 87.8 | 85.0 | 82.0 | 77.5 | 74.6 | 70.6 | 66.0 | 59.3 |
| BERT+LOVE | 92.1 | 90.5 | 87.7 | 86.1 | 84.0 | 80.7 | 76.9 | 73.2 | 70.7 | 63.1 |

We can observe that adding LOVE makes BERT more robust to typos, especially at higher typo rates. The scores might be slightly different from the reported scores in our paper due to the following reasons:
You can first run the code on the datasets I provided to reproduce the scores. If it works, then you can run it again on your own constructed dataset.
Details of BERT results = [[0.91044887039239, 0.9023669738406659, 0.8746655766944115, 0.8439543697978598, 0.8151753864447087, 0.7754161712247325, 0.745671076099881, 0.6986846016646848, 0.6482424197384067, 0.5930625743162901], [0.9097986028537455, 0.9006391200951248, 0.8775824910820451, 0.8497696195005946, 0.8200245243757431, 0.769266498216409, 0.7405060939357908, 0.7054659631391201, 0.6593527051129607, 0.5932669441141498], [0.916375594530321, 0.9023669738406659, 0.8710054994054697, 0.8481532401902497, 0.8114038347205708, 0.7716446195005946, 0.7386667657550535, 0.7058003864447087, 0.6533145065398335, 0.5892910225921522], [0.9136816290130796, 0.9077549048751486, 0.8742382580261593, 0.8373773781212842, 0.8163644470868014, 0.7677615933412604, 0.733817627824019, 0.6937239892984541, 0.6403834720570749, 0.5845533590963139], [0.9147592152199762, 0.9088324910820451, 0.86777274078478, 0.8336058263971463, 0.8114038347205708, 0.7661452140309156, 0.7273521105826397, 0.6872584720570749, 0.6382282996432818, 0.5764714625445898]]
Details of BERT + LOVE results = [[0.9213362068965517, 0.9050609393579072, 0.8692776456599287, 0.8562351367419738, 0.8256354042806183, 0.7909296967895363, 0.735118162901308, 0.7092560939357908, 0.6522369203329369, 0.5920964625445898], [0.9087210166468489, 0.8947123959571938, 0.8731606718192627, 0.8607684304399524, 0.8391052318668253, 0.8035448870392391, 0.7661452140309156, 0.7256242568370986, 0.6916802913198573, 0.6305737217598097], [0.9109876634958383, 0.8969790428061831, 0.8743497324613555, 0.8455707491082045, 0.8230529131985731, 0.7829592746730083, 0.7528983353151011, 0.7090331450653984, 0.6736771700356718, 0.6126820749108205], [0.9005276456599287, 0.8883583531510106, 0.8742382580261593, 0.8439543697978598, 0.8205633174791914, 0.7866193519619501, 0.7506316884661118, 0.7023446789536266, 0.6736771700356718, 0.6034111177170035], [0.9065658442330559, 0.8969790428061831, 0.8774710166468489, 0.8493423008323425, 0.8404057669441142, 0.8065546967895363, 0.768839179548157, 0.7319782996432818, 0.706766498216409, 0.6283070749108205]]
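For reference, the average and max rows in the tables can be recomputed from these per-run lists roughly as follows. This is just a sketch: the helper name summarize is made up, and to keep it short the example only uses the first two columns (typo rates 0 and 10) of the BERT runs listed above.

```python
from statistics import mean


def summarize(runs):
    """Per-column average and max over runs, as percentages with one decimal.

    `runs` is a list of runs; each run is a list of accuracies in [0, 1],
    one per typo rate.
    """
    columns = list(zip(*runs))  # regroup values by typo rate
    avg = [round(100 * mean(col), 1) for col in columns]
    best = [round(100 * max(col), 1) for col in columns]
    return avg, best


# First two columns (typo rates 0 and 10) of the five BERT runs.
bert_runs = [
    [0.91044887039239, 0.9023669738406659],
    [0.9097986028537455, 0.9006391200951248],
    [0.916375594530321, 0.9023669738406659],
    [0.9136816290130796, 0.9077549048751486],
    [0.9147592152199762, 0.9088324910820451],
]

avg, best = summarize(bert_runs)
print(avg)   # [91.3, 90.4]
print(best)  # [91.6, 90.9]
```

Passing the full lists instead of the two-column excerpt should give the complete table rows.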
Thanks for your help, I really appreciate it :)
Hi, I've been trying to reproduce the performance in the paper for the SST2 task using the 'BERT+LOVE' embedding you provided. I tried changing various hyper-parameters in the model and modifying the code. However, I failed to reproduce the performance of the paper.
My reproduction performance is below.
Could you provide the code that performed the SST2 Task?
Thank you!