joshaven / string_score

JavaScript string ranking: 0 for no match up to 1 for perfect... "String".score("str"); //=> 0.825
MIT License

What is it

See it in action

Check it out: http://joshaven.com/string_score

Installation Notes

Simply include one of the string score JavaScript files and call the .score() method on any string.

NodeJS Installation

npm install --save string_score
require("string_score");

That's it! It will automatically add a .score() method to all JavaScript String objects... "String".score("str");
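
A common use of .score() is ranking a list of candidates against a query. The following is a minimal sketch of that idea, not part of string_score itself: rankByScore and its scoreFn parameter are hypothetical names introduced here for illustration. The scoring function is passed in as an argument so the helper does not depend on how the library was loaded.

```javascript
// Hypothetical helper (not part of string_score): rank candidates by score.
// scoreFn(candidate, query) must return a number. With string_score loaded,
// pass (candidate, query) => candidate.score(query).
function rankByScore(candidates, query, scoreFn) {
  return candidates
    .map((candidate) => ({ candidate, score: scoreFn(candidate, query) }))
    .filter((entry) => entry.score > 0) // a score of 0 means no match
    .sort((a, b) => b.score - a.score); // best match first
}

// Usage, assuming require("string_score") has already run:
// rankByScore(["hello world", "help"], "hel", (s, q) => s.score(q));
```

Entries that score 0 are dropped because the examples in this README treat 0 as "no match".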

Examples

(results are for example only... I may change the scoring algorithm without updating examples)

"hello world".score("axl") //=> 0
"hello world".score("ow")  //=> 0.35454545454545455

// Single letter match
"hello world".score("e")           //=>0.1090909090909091

// Single letter match plus bonuses for beginning of word and beginning of phrase
"hello world".score("h")           //=>0.5363636363636364

"hello world".score("he")          //=>0.5727272727272728
"hello world".score("hel")         //=>0.6090909090909091
"hello world".score("hell")        //=>0.6454545454545455
"hello world".score("hello")       //=>0.6818181818181818
/* ... */
"hello world".score("hello worl")  //=>0.8636363636363635
"hello world".score("hello world") //=> 1

// Using a "1" in place of an "l" is a mismatch unless the score is fuzzy
"hello world".score("hello wor1")  //=>0
"hello world".score("hello wor1",0.5)  //=>0.6081818181818182 (fuzzy)

// Finding a match in a shorter string is more significant.
'Hello'.score('h') //=>0.52
'He'.score('h')    //=>0.6249999999999999

// Same case matches better than wrong case
'Hello'.score('h') //=>0.52
'Hello'.score('H') //=>0.5800000000000001

// Acronyms are given a little more weight
"Hillsdale Michigan".score("HiMi") > "Hillsdale Michigan".score("Hills")
"Hillsdale Michigan".score("HiMi") < "Hillsdale Michigan".score("Hillsd")
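
Because the exact values above may change with the scoring algorithm, code that depends on string_score may be safer comparing scores against each other than against fixed numbers. A minimal sketch of that approach follows; scoresHigher is a hypothetical helper introduced here, not a library function.

```javascript
// Hypothetical helper: true when `better` outscores `worse` against `text`.
// scoreFn(text, query) must return a number.
function scoresHigher(scoreFn, text, better, worse) {
  return scoreFn(text, better) > scoreFn(text, worse);
}

// With string_score loaded, the examples above suggest these return true:
// const score = (s, q) => s.score(q);
// scoresHigher(score, "hello world", "hello", "he");
// scoresHigher(score, "Hillsdale Michigan", "HiMi", "Hills");
```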

Tested And Operational Under all known environments (within reason)

Fully functional in 100% of the tested browsers.

** IE 7 fails (with a "stop running this script" message) at 4000 iterations of the benchmark test. All other tested browsers survived this test, and in fact survive a larger number of iterations. The benchmark that causes IE 7 to choke is 4000 iterations of a 446-character string scoring a 70-character match.

Benchmarks

This is the fastest and smallest JavaScript string scoring script that I am aware of. I have taken great joy in squeezing every millisecond I can out of this script. If you are aware of any ways to improve this script, please let me know.

string_score.js is faster, smaller, and does more than any other script that I am aware of. Tests for liquidmetal.js and quicksilver.js are included in my test suite for direct comparison.

The test: 4000 iterations of a 446-character string scoring a 70-character match

** Tests run with jQuery 1.5 on a MacBook Pro 2.4GHz Core 2 Duo running Snow Leopard.

*** quicksilver and string_score both use the same test file because they are used in the same way. LiquidMetal has to be called differently, so the test file was modified to work with the LiquidMetal syntax.
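
For readers who want to time the scorer in their own environment, a rough timing loop along the lines of the test described above might look like the sketch below. This is not the benchmark shipped in the tests folder: timeScoring is a hypothetical name, and the input strings are placeholders that only match the stated lengths (446 and 70 characters).

```javascript
// Hypothetical timing harness; not the benchmark from the tests folder.
// scoreFn(text, query) must return a number.
function timeScoring(scoreFn, text, query, iterations) {
  const start = Date.now();
  let lastScore = 0;
  for (let i = 0; i < iterations; i++) {
    lastScore = scoreFn(text, query);
  }
  return { iterations, elapsedMs: Date.now() - start, lastScore };
}

// Placeholder inputs with the lengths used in the test described above.
const text = "a".repeat(446);
const query = "a".repeat(70);

// With string_score loaded:
// timeScoring((s, q) => s.score(q), text, query, 4000);
```

Date.now() has millisecond resolution, so this is only suitable for coarse comparisons over many iterations.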

Ports

Please notify me of any ports so I can have them listed here. Please also keep track of the string score version that you have ported from. For example, in your readme include a note like: ported from version 0.2

Notes

string_score.js does not have any external dependencies other than a reasonably new JavaScript interpreter.

The tests in the tests folder depend on other files located in that same folder.

Credits

Author: Joshaven Potter

Thank you Lachie Cox and Quicksilver for inspiration.

Contributors

Contributing Members: https://github.com/joshaven/string_score/network/members

License

Licensed under the MIT license.