junrong1/sentiment
sentiment
All training uses the BERT base model, which has about 110M parameters.
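As a rough check on the 110M figure, the parameter count can be tallied from the model dimensions. This is a minimal sketch that assumes the standard BERT-base (uncased) sizes: a 30522-token vocabulary, hidden size 768, 12 layers, feed-forward size 3072, and 512 positions. None of these values are read from this repository, and the helper names are hypothetical.

```python
# Assumed BERT-base (uncased) dimensions; not taken from this repository.
VOCAB = 30522     # WordPiece vocabulary size
HIDDEN = 768      # hidden size
LAYERS = 12       # number of Transformer encoder layers
FFN = 3072        # feed-forward (intermediate) size
MAX_POS = 512     # maximum number of position embeddings
TYPES = 2         # segment (token type) embeddings


def linear(n_in, n_out):
    """Parameters of a dense layer: weight matrix plus bias vector."""
    return n_in * n_out + n_out


def layer_norm(n):
    """Parameters of a LayerNorm: one scale and one shift per feature."""
    return 2 * n


def bert_base_param_count():
    # Token, position, and segment embeddings, followed by a LayerNorm.
    embeddings = (VOCAB + MAX_POS + TYPES) * HIDDEN + layer_norm(HIDDEN)
    # Query, key, value, and output projections, followed by a LayerNorm.
    attention = 4 * linear(HIDDEN, HIDDEN) + layer_norm(HIDDEN)
    # Two dense layers (expand, then project back), followed by a LayerNorm.
    feed_forward = linear(HIDDEN, FFN) + linear(FFN, HIDDEN) + layer_norm(HIDDEN)
    # Pooler: one dense layer applied to the first token's output.
    pooler = linear(HIDDEN, HIDDEN)
    return embeddings + LAYERS * (attention + feed_forward) + pooler


total = bert_base_param_count()
print(f"{total:,} parameters (~{total / 1e6:.1f}M)")
# prints "109,482,240 parameters (~109.5M)"
```

Under these assumptions the encoder plus pooler comes to roughly 109.5M parameters, which rounds to the 110M quoted above. A task-specific classification head would add a small number on top.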
The training hyperparameters are recorded in log.txt.
The learning rate does not affect training much.
The best batch size is between 32 and 64.
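One way to reproduce this kind of comparison is a small grid over batch size and learning rate. The sketch below only builds the list of run configurations; it does not train anything. The `RunConfig` class and `build_grid` function are hypothetical names, the batch size 48 is an assumed midpoint of the range above, and the learning rates are assumed values commonly used for BERT fine-tuning. The settings actually used are the ones in log.txt.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class RunConfig:
    """One training run's settings (hypothetical container, not from the repo)."""
    batch_size: int
    learning_rate: float


# Batch sizes spanning the 32 to 64 range mentioned above; 48 is an assumed midpoint.
BATCH_SIZES = [32, 48, 64]
# Assumed learning rates; the values actually used are recorded in log.txt.
LEARNING_RATES = [2e-5, 3e-5, 5e-5]


def build_grid(batch_sizes, learning_rates):
    """Return one RunConfig for every (batch size, learning rate) pair."""
    return [RunConfig(bs, lr) for bs, lr in product(batch_sizes, learning_rates)]


grid = build_grid(BATCH_SIZES, LEARNING_RATES)
for cfg in grid:
    print(cfg)
```

Each configuration would then be passed to whatever training entry point the project uses, with the resulting metrics compared across the grid.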