Most notebooks in the challenge, and most papers that deal with similar problems, seem to use BM25 instead of TF-IDF. I think the two approaches are very similar. I have not studied the differences yet, but even before doing so I think we should implement BM25 as an alternative to TF-IDF.
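As a starting point for discussion, here is a minimal sketch of Okapi BM25 scoring in plain Python. It is not taken from any of the notebooks: the function name `bm25_scores`, the defaults `k1=1.5` and `b=0.75`, the smoothed IDF variant, and the toy corpus are all assumptions chosen for illustration. Compared with a plain TF-IDF weight, the visible differences are the saturation of term frequency (controlled by `k1`) and the document-length normalization (controlled by `b`).

```python
import math
from collections import Counter


def bm25_scores(query_tokens, corpus_tokens, k1=1.5, b=0.75):
    """Score each tokenized document against the query with Okapi BM25.

    Hypothetical helper: k1 and b are common default choices, not values
    taken from the challenge notebooks. Assumes a non-empty corpus.
    """
    n_docs = len(corpus_tokens)
    avg_len = sum(len(doc) for doc in corpus_tokens) / n_docs

    # Document frequency: number of documents that contain each term.
    doc_freq = Counter(term for doc in corpus_tokens for term in set(doc))

    scores = []
    for doc in corpus_tokens:
        term_freq = Counter(doc)
        # Documents longer than average are penalized, shorter ones boosted.
        length_norm = 1 - b + b * len(doc) / avg_len
        score = 0.0
        for term in query_tokens:
            tf = term_freq.get(term, 0)
            if tf == 0:
                continue
            df = doc_freq[term]
            # Smoothed IDF variant that stays non-negative (an assumption).
            idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
            # Term frequency saturates as tf grows, unlike raw TF-IDF.
            score += idf * tf * (k1 + 1) / (tf + k1 * length_norm)
        scores.append(score)
    return scores


# Toy corpus, made up for illustration only.
corpus = [
    "the virus spreads through droplets".split(),
    "vaccine trials started last month".split(),
    "droplets and aerosols carry the virus".split(),
]
scores = bm25_scores("virus droplets".split(), corpus)
```

With this toy corpus, the second document shares no term with the query and scores zero, while the first document ranks above the third because it is shorter. If we go this route, exposing `k1` and `b` as parameters would let us tune them next to the existing TF-IDF pipeline.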
References