Closed coastcode closed 1 month ago
原实验中是否有采取early stop策略?还是每个数据集都训练到40000 steps?
Please contact the corresponding author: ho.hin.lee@Vanderbilt.Edu
原实验中是否有采取early stop策略?还是每个数据集都训练到40000 steps?