Question about Further Pre-training

Hi, I tried to use your code on my own classification corpus, which consists of many short sentences. I want to run some experiments with further pre-training without the NSP task. However, in "create_pretraining_data.py" I found that you randomly choose a document from the dataset and concatenate it to another document after [SEP] to form the input, which confuses me a lot. Could you please explain why this is done? Thanks a lot.
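
To make the question concrete, here is a minimal sketch of the pairing step as I understand it. It is my own simplified reading, modeled on the sentence-pair logic of the original BERT pre-training script, and not the repository's actual code. The function name `make_nsp_pair` and the toy documents are hypothetical, and I assume each document is a list of tokenized sentences.

```python
import random


def make_nsp_pair(documents, doc_index, rng):
    """Build one input sequence plus an `is_random_next` flag.

    Hypothetical, simplified sketch: `documents` is a list of documents,
    each document being a list of sentences (lists of tokens).
    Assumes at least two documents exist.
    """
    document = documents[doc_index]

    # Segment A: a non-empty prefix of the current document.
    a_end = 1 if len(document) < 2 else rng.randint(1, len(document) - 1)
    tokens_a = [tok for sent in document[:a_end] for tok in sent]

    # Segment B comes from a random *other* document when the current
    # document has only one sentence, or otherwise about half of the time.
    if len(document) == 1 or rng.random() < 0.5:
        is_random_next = True
        candidates = [i for i in range(len(documents)) if i != doc_index]
        other = documents[rng.choice(candidates)]
        start = rng.randint(0, len(other) - 1)
        tokens_b = [tok for sent in other[start:] for tok in sent]
    else:
        # Otherwise segment B is the actual continuation of segment A.
        is_random_next = False
        tokens_b = [tok for sent in document[a_end:] for tok in sent]

    tokens = ["[CLS]"] + tokens_a + ["[SEP]"] + tokens_b + ["[SEP]"]
    return tokens, is_random_next


# Toy corpus of single-sentence documents, similar in shape to my data.
documents = [[["a", "b"]], [["c"]]]
tokens, is_random_next = make_nsp_pair(documents, 0, random.Random(0))
```

If this reading is right, a corpus made only of single-sentence documents would always take the random branch, so every input would pair a sentence with an unrelated one after [SEP].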

xuyige / BERT4doc-Classification

Question about Further Pre-training #7