Closed (yygle closed this 4 years ago)
Could I ask which of the official pretrained BERT models you used? I am using the wwm_uncased_L-24_H-1024_A-16 model and quickly run into an OOM error.
The base model, 12 layers.
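For a rough sense of why the 24-layer model hits OOM where the 12-layer base model may not, here is a back-of-the-envelope parameter count. It is only a sketch: the function name `bert_param_count` is made up for illustration, and the vocabulary size (30522), maximum position count (512), and 4x feed-forward width are assumptions taken from the standard uncased BERT configs, not from this repo. It counts weights only and ignores activations, which usually dominate memory during training.

```python
def bert_param_count(num_layers, hidden, vocab_size=30522,
                     max_positions=512, type_vocab=2):
    """Approximate parameter count of a BERT encoder plus pooler.

    Defaults are assumed from the standard uncased BERT configs.
    """
    layer_norm = 2 * hidden  # scale and bias
    # Token, position, and segment embeddings, followed by a LayerNorm.
    embeddings = (vocab_size + max_positions + type_vocab) * hidden + layer_norm
    # Query, key, value, and output projections (weights + biases).
    attention = 4 * (hidden * hidden + hidden)
    # Feed-forward block with an intermediate size of 4 * hidden.
    intermediate = 4 * hidden
    ffn = hidden * intermediate + intermediate + intermediate * hidden + hidden
    # Each layer has two LayerNorms (after attention and after the FFN).
    per_layer = attention + ffn + 2 * layer_norm
    pooler = hidden * hidden + hidden
    return embeddings + num_layers * per_layer + pooler


if __name__ == "__main__":
    for name, layers, hidden in [("L-12_H-768", 12, 768),
                                 ("L-24_H-1024", 24, 1024)]:
        params = bert_param_count(layers, hidden)
        # fp32 weights + gradients + two Adam moment buffers = 4 copies.
        train_gib = params * 4 * 4 / 2**30
        print(f"{name}: {params / 1e6:.0f}M params, "
              f"~{train_gib:.1f} GiB for weights, grads, and Adam state")
```

The large model has about three times the parameters of the base model, so if switching to the base model is not an option, a smaller batch size or a shorter max sequence length is the usual way to fit it.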