Closed hunzhizi closed 3 weeks ago
The number of epochs is set to 6 during training. In the first 3 epochs, the BERT-base weights are updated; after that they are frozen and only the classification layer is updated.
The data_size
parameter also matters (see issue here). The example training command uses a small data size of 10 only for demonstration purposes. For full evaluation, please consider setting the data size to the training dataset size.
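For clarity, here is a minimal PyTorch sketch of the two-stage schedule described above: fine-tune all of BERT-base for the first 3 epochs, then freeze it and keep updating only the classification layer. The module names (`bert`, `classifier`) and the wrapper class are illustrative assumptions, not the repo's actual code.

```python
import torch.nn as nn

class BertClassifier(nn.Module):
    """Hypothetical wrapper: a BERT-base encoder plus a linear classification head."""
    def __init__(self, bert: nn.Module, hidden: int, n_classes: int):
        super().__init__()
        self.bert = bert                      # assumed encoder module
        self.classifier = nn.Linear(hidden, n_classes)

    def forward(self, x):
        return self.classifier(self.bert(x))

def set_bert_trainable(model: BertClassifier, trainable: bool) -> None:
    """Toggle gradient updates for the encoder; the head stays trainable."""
    for p in model.bert.parameters():
        p.requires_grad = trainable

# Sketch of the 6-epoch loop: freeze the encoder after epoch 3.
# (train_one_epoch is a placeholder for the repo's training step.)
#
# for epoch in range(6):
#     if epoch == 3:
#         set_bert_trainable(model, False)
#     train_one_epoch(model, ...)

# Quick check with a stand-in encoder:
model = BertClassifier(nn.Linear(4, 8), hidden=8, n_classes=2)
set_bert_trainable(model, False)
assert all(not p.requires_grad for p in model.bert.parameters())
assert all(p.requires_grad for p in model.classifier.parameters())
```

Note that when freezing, it also helps to pass only `filter(lambda p: p.requires_grad, model.parameters())` to the optimizer so the frozen weights carry no optimizer state.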
Hello, while replicating the experiment, I found the model's accuracy unsatisfactory. Could you please tell me the number of epochs used in the first stage (full-parameter training) and in the second stage (with the BERT-base weights frozen)?