JakeBayer / FuzzySharp

C# .NET fuzzy string matching implementation of SeatGeek's well-known Python FuzzyWuzzy algorithm.
MIT License
645 stars, 80 forks

Unexpected results with WeightedRatioScorer #47

Open josdemmers opened 11 months ago

josdemmers commented 11 months ago

Using the WeightedRatioScorer for two different strings, one of them gives an unexpected result.

Input

  1. +30.0% Damage to Close Enemies [30.01%
  2. +14.3% Damage to Crowd Controlled Enemies [7.5 - 18.0]%

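Choices

  0. +#% Damage
  1. +#% Damage to Crowd Controlled Enemies
  2. +#% Damage to Close Enemies
  3. +#% Damage to Chilled Enemies
  4. +#% Damage to Poisoned Enemies
  5. #% Block Chance#% Blocked Damage Reduction
  6. #% Damage Reduction from Bleeding Enemies
  7. #% Damage Reduction
  8. +#% Cold Damage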

Results

Scorer: WeightedRatioScorer
Input 1: +30.0% Damage to Close Enemies [30.01%

Main: (string: +#% Damage, score: 90, index: 0)
Main: (string: +#% Damage to Close Enemies, score: 90, index: 2)
Main: (string: +#% Damage to Chilled Enemies, score: 77, index: 3)
Main: (string: +#% Damage to Poisoned Enemies, score: 75, index: 4)
Main: (string: +#% Damage to Crowd Controlled Enemies, score: 67, index: 1)
Main: (string: +#% Cold Damage, score: 61, index: 8)
Main: (string: #% Damage Reduction from Bleeding Enemies, score: 59, index: 6)
Main: (string: #% Damage Reduction, score: 50, index: 7)
Main: (string: #% Block Chance#% Blocked Damage Reduction, score: 48, index: 5)
Elapsed time: 39
---
Scorer: WeightedRatioScorer
Input 2: +14.3% Damage to Crowd Controlled Enemies [7.5 - 18.0]%

Main: (string: +#% Damage to Crowd Controlled Enemies, score: 90, index: 1)
Main: (string: +#% Damage to Close Enemies, score: 73, index: 2)
Main: (string: +#% Damage to Chilled Enemies, score: 69, index: 3)
Main: (string: +#% Damage to Poisoned Enemies, score: 68, index: 4)
Main: (string: +#% Cold Damage, score: 61, index: 8)
Main: (string: +#% Damage, score: 60, index: 0)
Main: (string: #% Damage Reduction from Bleeding Enemies, score: 56, index: 6)
Main: (string: #% Damage Reduction, score: 50, index: 7)
Main: (string: #% Block Chance#% Blocked Damage Reduction, score: 40, index: 5)
Elapsed time: 0
---

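For some reason, Input 1 gives `+#% Damage` a score of 90, tied with the correct match `+#% Damage to Close Enemies` and listed ahead of it. For Input 2 it works as expected: the correct match scores 90 and `+#% Damage` scores only 60.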

Source

Here is the source to reproduce the issue.

```
using System;
using System.Collections.Generic;
using System.Reflection;
using FuzzySharp;
using FuzzySharp.SimilarityRatio;
using FuzzySharp.SimilarityRatio.Scorer.Composite;

namespace FuzzySharpTest
{
    internal class Program
    {
        static void Main(string[] args)
        {
            string input1 = "+30.0% Damage to Close Enemies [30.01%";
            string input2 = "+14.3% Damage to Crowd Controlled Enemies [7.5 - 18.0]%";

            // Candidate strings; the list position is the "index" shown in the results.
            List<string> choices = new List<string>()
            {
                "+#% Damage",
                "+#% Damage to Crowd Controlled Enemies",
                "+#% Damage to Close Enemies",
                "+#% Damage to Chilled Enemies",
                "+#% Damage to Poisoned Enemies",
                "#% Block Chance#% Blocked Damage Reduction",
                "#% Damage Reduction from Bleeding Enemies",
                "#% Damage Reduction",
                "+#% Cold Damage"
            };

            // WeightedRatioScorer - input1
            var watch = System.Diagnostics.Stopwatch.StartNew();
            Console.WriteLine("Scorer: WeightedRatioScorer");
            Console.WriteLine($"Input 1: {input1}");
            Console.WriteLine(string.Empty);
            var results = Process.ExtractTop(input1, choices, scorer: ScorerCache.Get<WeightedRatioScorer>(), limit: 9);
            foreach (var r in results)
            {
                Console.WriteLine($"{MethodBase.GetCurrentMethod()?.Name}: {r}");
            }
            watch.Stop();
            var elapsedMs = watch.ElapsedMilliseconds;
            Console.WriteLine($"Elapsed time: {elapsedMs}");
            Console.WriteLine("---");

            // WeightedRatioScorer - input2
            watch = System.Diagnostics.Stopwatch.StartNew();
            Console.WriteLine("Scorer: WeightedRatioScorer");
            Console.WriteLine($"Input 2: {input2}");
            Console.WriteLine(string.Empty);
            results = Process.ExtractTop(input2, choices, scorer: ScorerCache.Get<WeightedRatioScorer>(), limit: 9);
            foreach (var r in results)
            {
                Console.WriteLine($"{MethodBase.GetCurrentMethod()?.Name}: {r}");
            }
            watch.Stop();
            elapsedMs = watch.ElapsedMilliseconds;
            Console.WriteLine($"Elapsed time: {elapsedMs}");
            Console.WriteLine("---");
        }
    }
}
```

josdemmers commented 11 months ago

Found an alternative: FuzzierSharp, a fork of FuzzySharp. https://github.com/AtriaStar/FuzzierSharp

That one gives me the expected results for the WeightedRatioScorer.

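Edit: Too bad, I'm also having issues with FuzzierSharp. Input 1 now ranks correctly, but with a different Input 2, `+#% Damage` ties with the correct match at 86: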

Scorer: WeightedRatioScorer
Input 1: +30.0% Damage to Close Enemies [30.01%

Main: (string: +#% Damage to Close Enemies, score: 95, index: 2)
Main: (string: +#% Damage, score: 86, index: 0)
Main: (string: +#% Damage to Chilled Enemies, score: 77, index: 3)
Main: (string: +#% Damage to Poisoned Enemies, score: 75, index: 4)
Main: (string: +#% Damage to Crowd Controlled Enemies, score: 67, index: 1)
Main: (string: +#% Cold Damage, score: 61, index: 8)
Main: (string: #% Damage Reduction from Bleeding Enemies, score: 59, index: 6)
Main: (string: #% Damage Reduction, score: 57, index: 7)
Main: (string: #% Block Chance#% Blocked Damage Reduction, score: 48, index: 5)
Elapsed time: 50
---
Scorer: WeightedRatioScorer
Input 2: +29.2% Damage to Close Enemies [24.8 - 35.3]%

Main: (string: +#% Damage, score: 86, index: 0)
Main: (string: +#% Damage to Close Enemies, score: 86, index: 2)
Main: (string: +#% Damage to Chilled Enemies, score: 75, index: 3)
Main: (string: +#% Damage to Poisoned Enemies, score: 75, index: 4)
Main: (string: +#% Damage to Crowd Controlled Enemies, score: 64, index: 1)
Main: (string: +#% Cold Damage, score: 61, index: 8)
Main: (string: #% Damage Reduction, score: 57, index: 7)
Main: (string: #% Damage Reduction from Bleeding Enemies, score: 53, index: 6)
Main: (string: #% Block Chance#% Blocked Damage Reduction, score: 43, index: 5)
Elapsed time: 0
---

@Saalvage do you accept/want bug reports on your fork? Issues are currently disabled.

Saalvage commented 11 months ago

Bug reports: no. The only reason I forked FuzzySharp was to fix/improve some aspects that I needed for a different project.

Pull requests: sure. As long as they're relatively trivial, I wouldn't mind merging them and pushing a new release to NuGet if it helps someone.