xuyige / BERT4doc-Classification

Code and source for paper ``How to Fine-Tune BERT for Text Classification?``
Apache License 2.0

Question about Further Pre-training #7

Closed jcfeng closed 4 years ago

jcfeng commented 4 years ago

Hi, I am trying to use your code for classification on my own corpus, which consists of many short sentences. I want to run some experiments with further pre-training without the NSP task. However, in "create_pretraining_data.py" I found that you randomly choose a doc from the dataset and concatenate it to another doc after [SEP] as the input, which confuses me a lot. Could you please explain why this is done? Thanks a lot.
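To make the question concrete, here is a minimal sketch of the pairing behavior as I understand it. It is not the repository's code: `make_pair` and the toy data layout (a list of documents, each a list of tokenized sentences) are hypothetical, and the 50% random-next rule and the single-sentence fallback are assumptions modeled loosely on the NSP data creation in the original BERT script.

```python
import random


def make_pair(docs, doc_index, rng):
    """Build one NSP-style example from docs[doc_index].

    docs: list of documents; each document is a list of sentences,
          and each sentence is a list of tokens (hypothetical layout).
    Returns (tokens, is_random_next). Requires at least two documents.
    """
    doc = docs[doc_index]
    # Assumption: a one-sentence document has no real "next" sentence,
    # so segment B must be drawn from a different document.
    if len(doc) == 1 or rng.random() < 0.5:
        is_random_next = True
        other_indices = [i for i in range(len(docs)) if i != doc_index]
        other_doc = docs[rng.choice(other_indices)]
        seg_a = doc[0]
        seg_b = rng.choice(other_doc)
    else:
        is_random_next = False
        # Pick a sentence that has a successor in the same document.
        split = rng.randrange(len(doc) - 1)
        seg_a = doc[split]
        seg_b = doc[split + 1]
    tokens = ["[CLS]"] + seg_a + ["[SEP]"] + seg_b + ["[SEP]"]
    return tokens, is_random_next


docs = [[["a", "b"]], [["c"]], [["d", "e"], ["f"]]]
tokens, is_random_next = make_pair(docs, 0, random.Random(0))
```

If this sketch matches what the script does, then on a corpus of one-sentence docs every example would pair the doc with a random one, which is the behavior I am asking about.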