junrong1 / sentiment

sentiment

All training uses the base BERT model, which has 110M parameters.
The training hyperparameters are recorded in log.txt.
The learning rate has little effect on training.
The best batch size is between 32 and 64.
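
The 110M figure for base BERT can be checked with a little arithmetic. The sketch below is not taken from this repository. It assumes the standard `bert-base-uncased` configuration (vocabulary of 30522, hidden size 768, 12 layers, feed-forward size 3072, 512 positions, 2 segment types), which the README does not state explicitly. The function name `bert_param_count` is made up for illustration, and the sentiment classification head is left out because its number of labels is not given here.

```python
# Rough parameter count for a BERT encoder.
# Defaults are ASSUMED to match the standard bert-base-uncased config;
# the checkpoint actually used in this repo may differ (e.g. cased vocab).
def bert_param_count(vocab_size=30522, hidden=768, layers=12,
                     intermediate=3072, max_positions=512, type_vocab=2):
    layer_norm = 2 * hidden  # scale and bias vectors

    # Token, position and segment embeddings, followed by one LayerNorm.
    embeddings = (vocab_size + max_positions + type_vocab) * hidden + layer_norm

    # Self-attention: query, key, value and output projections (weights + biases),
    # followed by one LayerNorm.
    attention = 4 * (hidden * hidden + hidden) + layer_norm

    # Feed-forward block: two dense layers (weights + biases), then one LayerNorm.
    feed_forward = ((hidden * intermediate + intermediate)
                    + (intermediate * hidden + hidden)
                    + layer_norm)

    # Pooler: one dense layer applied to the [CLS] token representation.
    pooler = hidden * hidden + hidden

    return embeddings + layers * (attention + feed_forward) + pooler


total = bert_param_count()
print(f"{total:,}")  # 109,482,240
```

Under these assumptions the encoder comes to about 109.5M parameters, which rounds to the 110M quoted above.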