Have you tried using the bi-encoder itself for hard negative mining? For example, as a second stage of training the QA model after using TF-IDF negatives, or even from the beginning (reducing external dependencies). Maybe it would converge to a better model, or maybe it would be worse due to overfitting.
Thank you for the work and publishing the source code!
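To make the suggestion concrete, here is a minimal sketch of self-mining: score all passages with the bi-encoder's own embeddings and take the top-scoring non-gold passages as hard negatives. The encoder outputs are stand-ins (random vectors), and `mine_hard_negatives` is a hypothetical helper, not part of the published code.

```python
import numpy as np

def mine_hard_negatives(q_emb, p_emb, gold_idx, k=2):
    """For each query, pick the top-k highest-scoring passages
    (dot-product similarity) that are NOT the gold passage."""
    scores = q_emb @ p_emb.T                 # (num_queries, num_passages)
    negatives = []
    for i, gold in enumerate(gold_idx):
        order = np.argsort(-scores[i])       # highest score first
        hard = [j for j in order if j != gold][:k]
        negatives.append(hard)
    return negatives

# Toy stand-in for bi-encoder outputs: 3 queries, 5 passages, dim 4.
rng = np.random.default_rng(0)
q = rng.normal(size=(3, 4))
p = rng.normal(size=(5, 4))
negs = mine_hard_negatives(q, p, gold_idx=[0, 1, 2], k=2)
print(negs)
```

In a second training stage these mined negatives would replace the TF-IDF ones; re-mining every few epochs would keep them "hard" relative to the current model.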