Closed radekstepan closed 13 years ago
Mention the complexity of search, and determine the approximate maximum lookup time for a given index size of x articles of y words each.
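A rough way to put numbers on this, assuming the index is a sorted array of terms queried with binary search (as the links further down suggest). The function names and the per-probe cost are hypothetical; x * y is only an upper bound on the number of distinct terms, since real articles repeat words.

```python
def max_comparisons(num_terms: int) -> int:
    """Worst-case number of probes for a binary search over num_terms sorted entries.

    For n > 0 this is floor(log2(n)) + 1, which equals n.bit_length().
    """
    if num_terms <= 0:
        return 0
    return num_terms.bit_length()


def worst_case_lookup_seconds(articles: int, words_per_article: int,
                              seconds_per_probe: float) -> float:
    """Upper bound on the time for one term lookup.

    seconds_per_probe is an assumed, measured-elsewhere cost of one comparison.
    """
    upper_bound_terms = articles * words_per_article
    return max_comparisons(upper_bound_terms) * seconds_per_probe


if __name__ == "__main__":
    # 1,000 articles of 1,000 words each: at most 1,000,000 distinct terms.
    print(max_comparisons(1000 * 1000))
```

Because the bound grows with log2 of the index size, doubling the number of articles adds only one more probe per lookup.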
Stop words:
"a", "an", "and", "are", "as", "at", "be", "but", "by", "for", "if", "in", "into", "is", "it", "no", "not", "of", "on", "or", "s", "such", "t", "that", "the", "their", "they", "then", "there", "these", "this", "to", "was", "will", "with"
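A minimal sketch of how the list above might be applied before indexing. The tokenizer (lowercase, split on anything that is not a letter or digit) and the name `index_terms` are assumptions, not something the issue specifies.

```python
import re

STOP_WORDS = frozenset([
    "a", "an", "and", "are", "as", "at", "be", "but", "by", "for", "if",
    "in", "into", "is", "it", "no", "not", "of", "on", "or", "s", "such",
    "t", "that", "the", "their", "they", "then", "there", "these", "this",
    "to", "was", "will", "with",
])


def index_terms(text: str) -> list[str]:
    """Lowercase the text, split it into alphanumeric tokens, drop stop words."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [token for token in tokens if token not in STOP_WORDS]


if __name__ == "__main__":
    print(index_terms("The cat is on the mat"))
```

With this kind of tokenizer, "it's" splits into "it" and "s" and "don't" into "don" and "t", which is presumably why the single letters "s" and "t" are on the list.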
Include a Porter stemmer in Python and JavaScript so that searches match more than exact word forms and the index takes less space.
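To illustrate the space saving, here is only step 1a of the Porter algorithm (plural stripping), not a full stemmer. The remaining steps are omitted, and a real implementation in either language would need all of them.

```python
def porter_step_1a(word: str) -> str:
    """Apply Porter step 1a: SSES -> SS, IES -> I, SS -> SS, S -> (nothing).

    The longest matching suffix wins, so the checks run longest first.
    """
    if word.endswith("sses"):
        return word[:-2]
    if word.endswith("ies"):
        return word[:-2]
    if word.endswith("ss"):
        return word
    if word.endswith("s"):
        return word[:-1]
    return word


if __name__ == "__main__":
    terms = ["article", "articles", "word", "words"]
    # Four surface forms collapse to two index entries.
    print(sorted({porter_step_1a(term) for term in terms}))
```

Queries have to go through the same stemmer as the indexed text, otherwise a search for "articles" would miss the stored stem "article".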
Turn ftp://ftp.ox.ac.uk/pub/wordlists/dictionaries/knuth_words.gz into an array and measure lookup speed in JavaScript. If this is fast, anything will be.
Use http://api.jquery.com/category/plugins/templates/ if the binary search works out.
http://www.nczonline.net/blog/2009/09/01/computer-science-in-javascript-binary-search/
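A sketch of the experiment in Python; the JavaScript version the issue asks about would have the same shape. `load_sorted_words` assumes a local gzip copy of the word list with one word per line, which has not been checked against the actual file, and the demo at the bottom uses a synthetic list instead. All function names are hypothetical.

```python
import gzip
import time


def load_sorted_words(path: str) -> list[str]:
    """Read a gzip word list (assumed: one word per line) into a sorted array."""
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as handle:
        words = {line.strip() for line in handle if line.strip()}
    return sorted(words)


def binary_search(sorted_words: list[str], target: str) -> int:
    """Return the index of target in sorted_words, or -1 if it is absent."""
    low, high = 0, len(sorted_words) - 1
    while low <= high:
        mid = (low + high) // 2
        if sorted_words[mid] == target:
            return mid
        if sorted_words[mid] < target:
            low = mid + 1
        else:
            high = mid - 1
    return -1


def average_lookup_seconds(sorted_words: list[str], queries: list[str]) -> float:
    """Time binary_search over all queries and return the mean per lookup."""
    start = time.perf_counter()
    for query in queries:
        binary_search(sorted_words, query)
    return (time.perf_counter() - start) / len(queries)


if __name__ == "__main__":
    # Synthetic stand-in for the dictionary: zero-padded keys sort lexicographically.
    words = [f"w{i:06d}" for i in range(100_000)]
    print(average_lookup_seconds(words, words[::1000]))
```

In Python the standard library's `bisect` module already provides the same search; the explicit loop is kept here because it mirrors what would be written by hand in JavaScript.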