Closed aaronbriel closed 3 years ago
Hey, fine-tuning these models always shows some variation due to random initialization. We also did not use the haystack tutorial Tutorial2_Finetune_a_model_on_your_data for it, so the parameters might differ. For the parameters we used, please have a look here: https://huggingface.co/deepset/roberta-base-squad2-covid#hyperparameters
Could you give more insight into how large the differences are? If they are substantial, it might be a haystack-related bug, so please also open an issue there.
After further analysis I've decided to go with a different COVID dataset, but I will follow the process detailed in the document you shared. I don't believe there is an actual bug here, so I'm closing. I will certainly let you know the results, though. Thanks for sharing that!
I'm not seeing the same results after fine-tuning roberta-base-squad2 on COVID-QA.json as I get with your deepset/roberta-base-squad2-covid model. I followed the tutorial Tutorial2_Finetune_a_model_on_your_data. In your paper "COVID-QA: A Question Answering Dataset for COVID-19" I didn't see any specific training hyperparameters that might explain these differences. Did you train with default parameters?
Thanks!
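For reference, the fine-tuning run described above can be sketched with haystack's FARMReader, roughly as in Tutorial2_Finetune_a_model_on_your_data. This is a sketch only: the import path varies across haystack versions (older releases use `from haystack.reader.farm import FARMReader`), the data path is assumed, and the hyperparameter values shown are placeholders, not the settings used for deepset/roberta-base-squad2-covid (see the model card for those).

```python
# Hedged sketch of the fine-tuning step, assuming haystack is installed and
# COVID-QA.json (SQuAD format) sits in ./data. All numeric values below are
# placeholders for illustration, not the published training configuration.
from haystack.nodes import FARMReader

reader = FARMReader(model_name_or_path="deepset/roberta-base-squad2", use_gpu=True)
reader.train(
    data_dir="data",                 # directory containing the SQuAD-format file
    train_filename="COVID-QA.json",
    use_gpu=True,
    n_epochs=3,                      # placeholder; tune against a dev split
    batch_size=24,                   # placeholder
    learning_rate=3e-5,              # placeholder
    save_dir="roberta-base-squad2-covid-finetuned",
)
```

Because training is sensitive to these values and to random initialization, two runs with different (or default) settings can legitimately diverge from the published checkpoint.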