I am seeing different results from extractOne and extractTop for the same query string and collection.
The collection is fairly large (about 15k strings), and every query is searched against all of it.
For instance, take the following scenario:
Query - ABC 1721
The collection contains the following strings:
ABC1721
ABC1721-FGH/L9
ABC MERAKI Z1
EFGD3111/Z1-ABC
and many more
extractOne("ABC 1721", collection)
gives - ABC1721, Ratio - 95
extractTop("ABC 1721", collection,1)
gives - ABC1721, Ratio - 95
The problem appears when I ask for the top 5 results:
extractTop("ABC 1721", collection,5)
Match 1 - ABC1721-FGH/L9, Ratio - 86
Match 2 - ABC MERAKI Z1, Ratio - 86
Match 3 - EFGD3111/Z1-ABC, Ratio - 86
and so on
I also tried extractSorted; its results are not consistent with extractOne either.
I ran extractTop (top 5) and extractOne on 1000+ queries. For around 70% of them, Match 1 from extractTop differs from the result of extractOne.
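To make the expectation concrete, here is a minimal, library-independent sketch of the invariant I assumed would hold: the first entry of a top-k result should be the same string the best-one lookup returns. Everything in it is hypothetical and only for illustration. The names bestOne, topK, countMismatches and levenshteinScore are mine, not the library's API, and the scorer is a plain Levenshtein percentage, so it will not reproduce the 95 / 86 ratios above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.function.ToIntBiFunction;

// Hypothetical harness, NOT the library's implementation.
class TopKConsistencyCheck {

    // Stand-in scorer (assumption): Levenshtein similarity scaled to 0..100.
    static int levenshteinScore(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] cur = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) prev[j] = j;
        for (int i = 1; i <= a.length(); i++) {
            cur[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                cur[j] = Math.min(Math.min(cur[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev; prev = cur; cur = tmp; // prev now holds the latest row
        }
        int maxLen = Math.max(a.length(), b.length());
        return maxLen == 0 ? 100 : 100 * (maxLen - prev[b.length()]) / maxLen;
    }

    // Single best candidate; on ties the earliest candidate in the list wins.
    static String bestOne(String query, List<String> choices,
                          ToIntBiFunction<String, String> scorer) {
        String best = null;
        int bestScore = Integer.MIN_VALUE;
        for (String c : choices) {
            int s = scorer.applyAsInt(query, c);
            if (s > bestScore) { bestScore = s; best = c; }
        }
        return best;
    }

    // Top k candidates by descending score. List.sort is stable, so
    // candidates with equal scores keep their original relative order.
    static List<String> topK(String query, List<String> choices, int k,
                             ToIntBiFunction<String, String> scorer) {
        List<String> sorted = new ArrayList<>(choices);
        sorted.sort(Comparator.comparingInt((String c) -> scorer.applyAsInt(query, c)).reversed());
        return new ArrayList<>(sorted.subList(0, Math.min(k, sorted.size())));
    }

    // Counts queries whose top-k first entry differs from the best-one result.
    static int countMismatches(List<String> queries, List<String> choices, int k,
                               ToIntBiFunction<String, String> scorer) {
        int mismatches = 0;
        for (String q : queries) {
            List<String> top = topK(q, choices, k, scorer);
            if (!top.isEmpty() && !top.get(0).equals(bestOne(q, choices, scorer))) mismatches++;
        }
        return mismatches;
    }

    public static void main(String[] args) {
        List<String> collection = Arrays.asList(
                "ABC1721", "ABC1721-FGH/L9", "ABC MERAKI Z1", "EFGD3111/Z1-ABC");
        ToIntBiFunction<String, String> scorer = TopKConsistencyCheck::levenshteinScore;
        System.out.println("best one: " + bestOne("ABC 1721", collection, scorer));
        System.out.println("top 5:    " + topK("ABC 1721", collection, 5, scorer));
        System.out.println("mismatches: "
                + countMismatches(Arrays.asList("ABC 1721"), collection, 5, scorer));
    }
}
```

With one scorer and one ordering rule, the mismatch count in this sketch is always zero, which is why the ~70% mismatch rate I measured surprised me.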
BTW, I really appreciate your effort in porting the Python logic to Java without any performance lag.