You can also improve the performance of the pre-trained Reader, which was pre-trained on SQuAD 1.1 dataset. If you have an annotated dataset (that can be generated by the help of the cdQA-annotator) in the same format as SQuAD dataset you can fine-tune the reader on it:
# Put the path to your json file in SQuAD format here
path_to_data = './data/SQuAD_1.1/train-v1.1.json'
cdqa_pipeline.fit_reader(path_to_data)
Should we add our own training dataset that we constructed via cdQA-annotator into the "train-v1.1.json" file by editing it?
Hi,
in a blog you wrote:
Thanks!