michaelsproul / rust_radix_trie

Fast generic radix trie implemented in Rust
https://docs.rs/radix_trie/
MIT License
184 stars 32 forks source link

Question - how to use this to search for common strings based on prefix #70

Open avnerbarr opened 2 years ago

avnerbarr commented 2 years ago

Hi,

I'm a bit confused as to how this is supposed to be used.

My example is as follows:

I have many entries (sentences/paragraphs etc.)

given an arbitrary prefix, I would like to retrieve all relevant matches from the trie.

For example:

Trie construction by these phrases

1. Hello I am happy
2. Hello I am sad
3. Hello I am afraid
4. Hello you are sad
5. Hello you are happy

If I understand correctly the trie would compress along these lines

                                                                   │ compressed
         ┌─────────────────────────►   Hello                       │
         │                              │   │                      │    │
         │                              │   │                      │    │
         │                              │   │          ◄───────────┘    │
         │ ┌─────────────────────► I ◄──┘   └──► You                    │
compressed │                        │              │          ◄─────────┘
                                    │              └──► are
           │                        ▼                     ▼  │
           └───────────────────►  am│                  sad └──────►  happy
                                     ┌┤
                                 │   ▼│
                        happy◄───┘ sad└─────►afraid

Now given a partial phrase such as: Hello I

I would like to be able to retrieve the results for 1,2,3 (since they share the prefix Hello I)

Can you point me in how to achieve this?

Thanks!

Drvanon commented 2 years ago

I have a similar issue. I would like to run an algorithm based on the common prefix, but as far as I have been able to read the documentation, the common prefix is not able to be accessed, only the nibble vec that represents it.

vemonet commented 9 months ago

@avnerbarr @Drvanon I also faced the same issue, in case that might be useful to the next person searching: I reused and improved an existing implementation for a generic trie without any dependencies. I added functions to easily find prefixes or postfixes in the trie and packaged it there: https://github.com/vemonet/ptrie