Closed by innovate-invent 2 years ago
Thanks for the pull request. I think a more robust solution is to modify the similarity function to take the set sizes as additional arguments. That way we can pass in a size other than the number of tokens actually given to the function; e.g., we can use the actual query set size rather than the size of the subset of tokens that exist in the index.
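To illustrate the idea, here is a minimal sketch of a similarity function that takes the set sizes as explicit arguments. It assumes Jaccard similarity; the function name `jaccard`, its signature, and the example sets are hypothetical and are not taken from this repository.

```python
def jaccard(overlap: int, size_x: int, size_y: int) -> float:
    """Jaccard similarity computed from an overlap count and explicit set sizes.

    Hypothetical signature: the sizes are passed in by the caller instead of
    being derived from the token sets that reach the function.
    """
    union = size_x + size_y - overlap
    if union == 0:
        return 0.0
    return overlap / union


# Hypothetical scenario: two of the four query tokens are unknown to the index.
query = {"a", "b", "c", "d"}
vocabulary = {"a", "b", "x"}   # tokens that exist in the index
candidate = {"a", "b"}         # an indexed set

known = query & vocabulary             # subset of the query found in the index
overlap = len(known & candidate)

# Using the size of the known subset overstates the similarity.
sim_subset = jaccard(overlap, len(known), len(candidate))

# Using the actual query set size gives the intended value.
sim_actual = jaccard(overlap, len(query), len(candidate))

print(sim_subset, sim_actual)
```

With the known-subset size the candidate looks identical to the query, while the actual query size accounts for the two tokens that are missing from the index.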
I made the required changes. Can you help me verify if the changes are correct by adding a unit test for your scenario? Thanks!
I ran the test on master and on this branch; it fails on master and passes here.
I believe this is an effective fix, but I am not entirely sure what the consequences of using negative indices are.
Resolves #13