snehac-miner / redis

Automatically exported from code.google.com/p/redis
0 stars 0 forks source link

[Feature Request] Tries #110

Open GoogleCodeExporter opened 8 years ago

GoogleCodeExporter commented 8 years ago
Tries (or something similar) would be a great feature to have.

In particular, it would be great to have Tries where the nodes are lists to
make it easy to implement autocomplete on textboxes.  Lots of sites have
autocomplete but, as-is, it's hard to make it fast.

Thanks--redis is great!
D

Original issue reported on code.google.com by d...@dangould.com on 22 Nov 2009 at 3:38

GoogleCodeExporter commented 8 years ago
BTW: This article shows some of the justification
http://sujitpal.blogspot.com/2007/02/three-autocomplete-implementations.html

Original comment by d...@dangould.com on 22 Nov 2009 at 3:41

GoogleCodeExporter commented 8 years ago
Also of interest:
http://suggesttree.sourceforge.net/
http://www.supermind.org/blog/530/trie-based-approximate-autocomplete-implementa
tion-with-support-for-ranks-and-synonyms

Original comment by d...@dangould.com on 22 Nov 2009 at 4:14

GoogleCodeExporter commented 8 years ago
Why don't use sorted sets for this?

Original comment by anti...@gmail.com on 22 Nov 2009 at 2:00

GoogleCodeExporter commented 8 years ago
If I understand what you're thinking, sorted sets provide a good part of the
solution, but if you want to find all substring matches, you need to store lots 
of
copies of the same data.  For the string "antirez", I'd have to store it in the 
node
for "a", "an", "ant", "anti", etc.OR you could try to traverse all substrings 
in a
range.  In one case, it would require massive memory (and pre-computation); in 
the
other, it would require tons of lookups.

Thanks Antirez.

Original comment by d...@dangould.com on 25 Nov 2009 at 12:47

GoogleCodeExporter commented 8 years ago
I implemented LSEARCH to handle this use case. You can see the commit here: 
http://github.com/dbravender/redis/commit/f06354773877e32f0c4b2d8f41493677d92334
36

You can call LSEARCH [listname] [pattern] and it will search all the elements 
of the 
list and only return items that match the pattern (with * substitution). So in 
the 
case of autocompletes you can take what the user entered and add a * at the end 
to 
see all of the potential matches.

Here's a sample:
$ telnet localhost 6379
Trying ::1...
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
lpush alist 3  
baz
+OK
lpush alist 3
bar
+OK
lpush alist 3
foo
+OK
lsearch alist b*
*2
$3
bar
$3
baz

Original comment by dan.brav...@gmail.com on 25 Dec 2009 at 9:47

GoogleCodeExporter commented 8 years ago
I changed the name to LMATCH in my latest commit to make it clear that you can 
search 
on a pattern. What do you think the best name for this feature is?

Original comment by dan.brav...@gmail.com on 25 Dec 2009 at 11:28

GoogleCodeExporter commented 8 years ago
Hello Dan,

sorted sets should be able to fix this issues, or if they are not, should be 
extended to fix this issues.

I'm currently not closing this issue to reason on it better for some time, btw 
what you can already use is converting the first N digits of the words you are 
storing into a score, with sorted sets, and then get ranges of words by score.

Cheers,
Salvatore

Original comment by anti...@gmail.com on 23 Aug 2010 at 4:06

GoogleCodeExporter commented 8 years ago
Support for tries would be terrific. Sorted sets / search are not quite the 
same and not as simple in terms of interface.

Original comment by jorangr...@gmail.com on 9 Sep 2010 at 8:48

GoogleCodeExporter commented 8 years ago
Guys you can perform autocomplete with sorted sets, just do the following:

take the first 4 or 5 digits of the word and turn it into a score. Done...

The obvious way to do this is considering every digit as a radix 
<number-of-letters> number. So you have A*radis^0+B*radis^1+C...

It's pretty trivial.

Well not only this, Redis sorted sets are using skiplists, so we have all the 
operations of a balanced tree (with the same time complexity). So this can be 
provided already in a more easy to use interface. Another approach could be to 
allow the sorted set score to be not just a number, but also a string, using 
different sets of commands, but all this seems a bit confusing given that there 
is a more direct solution of turning strings into scores.

Cheers,
Salvatore

Original comment by anti...@gmail.com on 9 Sep 2010 at 1:54

GoogleCodeExporter commented 8 years ago
I'd like to put a bounty on a patricia trie that would help both the folks 
dealing with subsets in completion and spelling as well as folks manipulating 
routing tables of IP addresses, CIDR blocks and blacklists of both domains and 
IP addresses. If somone is willing to write the code I'd like to help fund a 
persons effort on integrating a patricia trie in the likes currently available 
in c++, java, perl, python and ruby.

contact rick@support-intelligence.com

Original comment by rick.wes...@iidf.org on 8 Dec 2010 at 6:16