Closed dh3014 closed 9 years ago
Sounds good! Can we separate the algorithms by an option and use yours by default ? Thanks!
To make the algorithm plug-able, we need to decide the semantics of the interface.
I think it's not worth to encapsulate a Strategy Class here, a simple callback
would be fine. But what does the callback means semantically?
For example, levenshtein()
outputs DISTANCE of strings, so we can define exact match by DISTANCE=0
, and require DISTANCE as small as possible. similar_text()
on the other hand outputs SIMILARITY of strings, so greater is better, it can also outputs RELATIVE SIMILARITY, so an exact match means RELATIVE SIMILARITY = 100%
We need to choose one of it, so any person wants to use their own algorithm has to follow up.
OK, I think we can just rename the current Corrector
to LevenshteinCorrector
and let the similar_text
corrector replace the current one.
I will work on the DI container for injecting services in the future, then we can let users to register their own corrector.
DI container is great, but there are thousands of DI implementations on [packist.org]. Many of them works in different way (by different purpose, too), would be hard to evaluate :p.
I'll send PR perhaps few days later.
Surveying projects are pretty time consuming, I would prefer to start from a small one. :-p
Cool, that would be great, Thanks for your work!
I am closing this since you've done this task. :)
I think current
Corrector::guess()
is disappointing. I add the first command 'scheduled-job' in my app. Since I'm still using bash (no completion), I'm trying with Corrector feature. Here's what happened:After looking into source in
Corrector
, it useslevenshtein()
to make the best guess. I think this algo computes edit distance between two strings. However I don't think edit-distance is fit into suggesting feature, in fact I guess even LCS or LIS would be better in suggestion.To make it simple, I think another php built-in function
similar_text
[http://php.net/manual/en/function.similar-text.php] is much better in this case. It has the same time complexity with levenshtein (but slower a bit, yes).I did a little experiment:
What would you think about changing the suggestion algorithm?