Open bohnpessatti opened 4 years ago
Thanks for the kind words, and I'm glad this package has been of use to you!
It sounds like it would be a useful addition to the package. I don't have time to add these changes at the moment, as I no longer work with this application for my job. If you'd like to create a pull request with the changes I'd be happy to review it and add it, but otherwise I'll try and add it when I've got some time for it.
Congratulations on the initiative; your project has been quite useful in my work.
I would like to suggest adding a function for the BM25F method, which weights each document field by its relevance and combines the per-field term frequencies before applying the BM25 saturation function.
This avoids the dangerous over-estimation of term importance that occurs when BM25 scores from different fields are combined linearly [1]. It could therefore make your project more robust for ranking structured text.
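To make the request more concrete, here is a rough sketch of one BM25F variant: per-field term frequencies are length-normalized, weighted, and summed into a single pseudo-frequency, and saturation is applied only once per term. The function name `bm25f_scores`, the `weights` mapping, the single shared `b`, the default parameter values, and the non-negative IDF form are all my own assumptions for illustration; none of this is part of the package's existing API.

```python
import math
from collections import Counter


def bm25f_scores(docs, query, weights, k1=1.2, b=0.75):
    """Score each document against a tokenized query (hypothetical helper).

    docs:    non-empty list of dicts mapping field name -> list of tokens
    query:   list of query tokens
    weights: dict mapping field name -> field weight
    """
    n = len(docs)
    fields = list(weights)

    # Average length of each field across the corpus.
    avg_len = {f: sum(len(d.get(f, [])) for d in docs) / n for f in fields}

    # Document frequency: number of documents containing the term in any field.
    df = Counter()
    for d in docs:
        seen = set()
        for f in fields:
            seen.update(d.get(f, []))
        df.update(seen)

    scores = []
    for d in docs:
        tf = {f: Counter(d.get(f, [])) for f in fields}
        score = 0.0
        for term in query:
            if df[term] == 0:
                continue  # term never occurs in the corpus
            # Combine weighted, length-normalized frequencies across fields.
            pseudo_tf = 0.0
            for f in fields:
                if avg_len[f] == 0:
                    continue  # field is empty in every document
                norm = 1 - b + b * len(d.get(f, [])) / avg_len[f]
                pseudo_tf += weights[f] * tf[f][term] / norm
            # Assumed IDF form, chosen because it is never negative.
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            # Saturation is applied once, after the fields are combined.
            score += idf * pseudo_tf / (k1 + pseudo_tf)
        scores.append(score)
    return scores


corpus = [
    {"title": ["bm25", "ranking"], "body": ["text", "about", "search"]},
    {"title": ["cooking"], "body": ["bm25", "appears", "once", "here"]},
]
print(bm25f_scores(corpus, ["bm25"], {"title": 3.0, "body": 1.0}))
```

Because the saturation step comes after the fields are merged, a term that matches in several fields cannot contribute more than its IDF, which is the property a linear combination of per-field BM25 scores loses.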
References:
[1] https://trec.nist.gov/pubs/trec13/papers/microsoft-cambridge.web.hard.pdf
[2] https://www.researchgate.net/publication/221613382_Simple_BM25_extension_to_multiple_weighted_fields
Thank you in advance.