rafiepour / CTran

Complete code for the proposed CNN-Transformer model for natural language understanding.
https://github.com/rafiepour/CTran
Apache License 2.0

Issue while using bert-large-uncased #2

Closed by manishhnnegi 4 months ago

manishhnnegi commented 4 months ago

Hi Rafiepour, I tried to train CTRAN using bert-large-uncased on the ATIS dataset, but its performance is poor compared to using bert-base-uncased embeddings. What could be the reason? Also, the bert-large-uncased files downloaded from Hugging Face do not include a hubconf.py file, so I'm using the same hubconf.py file that I used for bert-base-uncased.

The result after 11 epochs is below: best model at epoch: 11, max single SF F1: 0.5989, max single ID PR: 0.7079. This is very low compared to bert-base-uncased, where I got good results. The loss is also high (loss: 2.0106).

What might be the possible reasons for this?

rafiepour commented 4 months ago

Hi. Have you tried adjusting the learning rates? Since you say that BERT base works fine, the only factor that comes to mind is the learning rate. Try the ones mentioned in the paper.
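For readers landing on this thread, here is a minimal sketch of one common way to give a pretrained encoder and the remaining layers different learning rates, by building per-group optimizer settings. It is not taken from the CTran code: the helper `build_param_groups`, the `bert.` name prefix, the parameter names, and both learning-rate values are hypothetical placeholders. Substitute the values given in the paper.

```python
from typing import Any, Dict, Iterable, List, Tuple


def build_param_groups(
    named_params: Iterable[Tuple[str, Any]],
    encoder_prefix: str,
    encoder_lr: float,
    head_lr: float,
) -> List[Dict[str, Any]]:
    """Split parameters into encoder and non-encoder groups, each with its own lr."""
    encoder_params, head_params = [], []
    for name, param in named_params:
        # Parameters whose name starts with the prefix belong to the encoder.
        if name.startswith(encoder_prefix):
            encoder_params.append(param)
        else:
            head_params.append(param)
    groups = []
    if encoder_params:
        groups.append({"params": encoder_params, "lr": encoder_lr})
    if head_params:
        groups.append({"params": head_params, "lr": head_lr})
    return groups


# Stand-in (name, parameter) pairs; with a real PyTorch model these would
# come from model.named_parameters(). Names and values are placeholders.
example_params = [
    ("bert.embeddings.word_embeddings.weight", "w0"),
    ("bert.encoder.layer.0.attention.self.query.weight", "w1"),
    ("decoder.out.weight", "w2"),
]

# Hypothetical learning rates, NOT the ones from the paper.
groups = build_param_groups(example_params, "bert.", encoder_lr=1e-5, head_lr=1e-4)
for group in groups:
    print(len(group["params"]), group["lr"])
```

With real tensors from `model.named_parameters()`, a list of dictionaries in this shape can be passed as the first argument to a PyTorch optimizer such as `torch.optim.AdamW`, which applies each group's `lr` separately.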

manishhnnegi commented 4 months ago

Thanks, rafiepour. I tried the learning rates mentioned in the paper, and the metrics are now almost equal to those reported in the paper.