I am curious about how to train GPT-2 on a question/answer dataset. From my understanding, sample_sequence.py takes a corpus and randomly breaks it into two parts, and the goal of the network is to predict the second part from the first. Is this a correct understanding of the training cycle?

Because then, instead of sampling random spans of text, I would sample question/answer pairs, where the question is the first part and the answer is the second part, correct?
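To make the idea concrete, here is a minimal sketch of how such Q/A training examples could be constructed: the question becomes the conditioning prefix and the answer the continuation, with a mask marking which positions the loss would be computed on. This is only an illustration of the sampling scheme described above, not GPT-2's actual pipeline: the `<|sep|>` token and the whitespace "tokenizer" are placeholders standing in for GPT-2's BPE tokenizer.

```python
# Sketch: turn a (question, answer) pair into one token stream for
# next-token prediction. Question tokens are prompt-only (mask = 0);
# loss would be taken on the answer tokens (mask = 1).
# SEP and the whitespace split are illustrative stand-ins for BPE.

SEP = "<|sep|>"
EOS = "<|endoftext|>"

def make_example(question, answer):
    """Concatenate question and answer and mark loss positions."""
    q_tokens = question.split() + [SEP]
    a_tokens = answer.split() + [EOS]
    tokens = q_tokens + a_tokens
    loss_mask = [0] * len(q_tokens) + [1] * len(a_tokens)
    return tokens, loss_mask

tokens, mask = make_example("What is GPT-2 ?", "A language model .")
print(tokens)
# ['What', 'is', 'GPT-2', '?', '<|sep|>',
#  'A', 'language', 'model', '.', '<|endoftext|>']
print(mask)
# [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
```

Under this framing, "sampling question/answer-wise" just means each training example is one such pair rather than a random split point inside a long document.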