Thanks for sharing your work. I tried running the DrQA notebook, which has excellent descriptions by the way. I spun up an Azure ML Standard_NC6 instance (6 cores, 56 GB RAM, 380 GB disk, 1 x NVIDIA Tesla K80 GPU) to see whether I could replicate the results you list after 5 epochs, but I get terrible performance. I suspect you used a different set of hyperparameters for your training.
The notebook contains the following:
Could the learning rate you used be different from Adamax's default? I hope you still remember something about the configuration :)
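To illustrate why I am asking about the learning rate, here is a minimal pure-Python sketch of a single Adamax update for one scalar parameter, following the update rule in the Adam paper. It is not the notebook's code: the function names `adamax_step` and `first_step_size` are hypothetical, and the default `lr=2e-3` is an assumption based on the documented default of PyTorch's `torch.optim.Adamax`, which I am assuming the notebook uses. With Adamax, the very first step has a magnitude almost exactly equal to `lr`, whatever the gradient size, so a different learning rate changes early training quite directly.

```python
def adamax_step(theta, grad, m, u, t, lr=2e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adamax update for a single scalar parameter.

    lr=2e-3 is assumed here to match the documented default of
    torch.optim.Adamax; the notebook's actual value is unknown.
    """
    m = beta1 * m + (1 - beta1) * grad        # exponential average of gradients
    u = max(beta2 * u, abs(grad) + eps)       # running infinity-norm of gradients
    step = (lr / (1 - beta1 ** t)) * m / u    # bias-corrected update
    return theta - step, m, u


def first_step_size(lr, grad=0.5):
    """Magnitude of the first update from a fresh optimizer state (hypothetical helper)."""
    new_theta, _, _ = adamax_step(theta=1.0, grad=grad, m=0.0, u=0.0, t=1, lr=lr)
    return abs(1.0 - new_theta)


if __name__ == "__main__":
    for lr in (2e-3, 2e-2):
        print(f"lr={lr}: first step size = {first_step_size(lr):.6f}")
```

If the optimizer in the notebook is created without an explicit `lr` argument, it would run with the library default, which may not be the value used for the reported results.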