Morizeyao / GPT2-Chinese

Chinese version of GPT2 training code, using BERT tokenizer.
MIT License
7.48k stars, 1.7k forks

Corpus preprocessing #285

Open mgcyung opened 1 year ago

mgcyung commented 1 year ago

Hello, how should I preprocess the 殊知阁 classical Chinese corpus (殊知阁古文)? Is there a preprocessing script available for it? Thanks!