Closed n-gram-hub closed 2 years ago
Thanks for the report. Could you provide a list of data and the Fuse configuration you used? This would make it easier to reproduce. :)
Sorry for the late reply. :-)
<?php
require_once 'vendor/autoload.php';
$basic_data = [
["w" => "cd", "d" => "CD"],
["w" => "dvd", "d" => "DVD"],
["w" => "hi-fi", "d" => "HI-FI"],
["w" => "blast chiller", "d" => "Blast chiller"],
["w" => "clothing", "d" => "Clothing"],
["w" => "suit", "d" => "Suit"],
["w" => "lighter", "d" => "Lighter"],
["w" => "steel", "d" => "Steel"],
["w" => "accumulator", "d" => "Accumulator"],
["w" => "acetone", "d" => "Acetone"],
["w" => "acid", "d" => "Acid"],
["w" => "white spirit", "d" => "White spirit"],
["w" => "acquarium", "d" => "Acquarium"]
];
$options = ['keys' => ['w', 'd'], 'includeScore' => true];
$fuse = new \Fuse\Fuse($basic_data, $options);
$s = $fuse->search("acid");
echo "<pre>";
print_r($s);
echo "</pre>";
?>
Okay, I checked your example. In fact, all scores are between 0 and 1, as they should be.
Take a closer look at the first score: It's actually 4.9303806576313E-32.
In case you've never encountered it: This way of writing numbers is called E notation and is used for very small and very large numbers. In your case, that's a number with 31 zeros between the decimal point and the first non-zero digit.
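To make the notation visible, here is a minimal sketch that prints the score from above both in PHP's default form and in fixed-point form. It only uses the literal value quoted in this thread. The variable names are made up for illustration.

```php
<?php
// The first score from the report, written as a float literal.
$score = 4.9303806576313E-32;

// PHP's default float-to-string conversion uses E notation for values this small.
echo $score, "\n";

// Fixed-point formatting with 40 decimals makes the leading zeros visible.
$fixed = sprintf('%.40F', $score);
echo $fixed, "\n";

// The value is tiny, but it still lies strictly between 0 and 1.
var_dump($score > 0 && $score < 1); // bool(true)
```

If client code truncates the default string form, only the leading "4.9" may survive, which is easy to misread as a score greater than 1.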
That said, this is indeed a bug: The score should not be this ridiculously low, it should be zero (i.e. an exact match). This is a duplicate of #26 which will not be fixed (for now) because the bug is also present in Fuse.js at krisk/Fuse#481. Unfortunately, that issue has been (automatically) closed without being fixed.
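Until this is addressed upstream, one possible workaround on the client side is to treat negligibly small scores as exact matches. The sketch below is only an assumption about how that could look: `normalizeScore` and its threshold of `1e-9` are hypothetical and not part of Fuse or this port.

```php
<?php
// Hypothetical helper (not part of Fuse): snap negligibly small scores to 0.0.
// The default threshold is an arbitrary assumption; pick one that fits your data.
function normalizeScore(float $score, float $epsilon = 1e-9): float
{
    return $score < $epsilon ? 0.0 : $score;
}

// The score reported in this issue is treated as an exact match.
$exact = normalizeScore(4.9303806576313E-32);

// Ordinary fuzzy scores pass through unchanged.
$fuzzy = normalizeScore(0.25);

var_dump($exact === 0.0);
var_dump($fuzzy === 0.25);
```

This leaves the library's output untouched and only changes how the calling code interprets it.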
I'm not fixing bugs that originate in Fuse.js, as this is a port without any implementation of its own. I don't fully understand the search logic myself; I'm really just reimplementing the original JS source in PHP. I won't risk deviating from the original in ways I would not be able to maintain on my own.
Therefore, I'll close this as a duplicate. Sorry I couldn't help.
Thanks for your kind reply. Some of my client code truncates the output, so I didn't notice the E notation and was quite sure the score was 4.9... (>1), not that incredibly small number. :-)
I tried to use includeScore but it behaves differently than expected: that is, it doesn't return 0 or 1, but a float > 1 for the first match and a float between 0 and 1 for any other match.