Open jc-ryan opened 2 years ago
Thank you for your questions. I have added some details on how the pre-training dataset is processed to the README.md. I hope this helps you understand the pre-processing.
Thanks for your excellent work and patient feedback. Could you please release the processed pre-training data for better reproducibility?
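For examples where the twitter_nlp tool did not extract any entity (aspect term), did you delete them or handle them in some other way?
Also, roughly how large is the final pre-training dataset?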
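1. For examples where the twitter_nlp tool did not extract any entity, we treat the entity as empty during pre-training, because the downstream tasks also include cases with no entity. 2. The pre-training data amounts to a little over 17,000 examples.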
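To illustrate the first point, here is a minimal sketch of keeping examples with no extracted entity instead of dropping them. It is not the repository's actual pre-processing code: `build_pretraining_records`, the record fields, and `toy_extractor` (a stand-in for twitter_nlp) are all hypothetical names used only for illustration.

```python
from typing import Callable, Dict, List


def build_pretraining_records(
    texts: List[str],
    extract_entities: Callable[[str], List[str]],
) -> List[Dict[str, object]]:
    """Build one record per text, keeping texts with no extracted entity.

    `extract_entities` is a hypothetical wrapper around whatever entity
    extractor is used (twitter_nlp in the discussion above).
    """
    records = []
    for text in texts:
        entities = extract_entities(text)
        # An empty list is kept as-is: the example stays in the dataset
        # with no aspect terms, mirroring downstream cases without entities.
        records.append({"text": text, "aspects": list(entities)})
    return records


def toy_extractor(text: str) -> List[str]:
    """Toy stand-in for an entity extractor (assumption, not twitter_nlp)."""
    known_entities = {"Messi", "Paris"}
    return [tok for tok in text.split() if tok.strip(".,!?") in known_entities]


records = build_pretraining_records(
    ["Messi scored again", "what a day"], toy_extractor
)
```

With this approach the number of records always equals the number of input texts, so no example is lost when the extractor returns nothing.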
Nice work and nice repository! But I still have some questions about the repository.
Thanks a lot!