aosingh / lexpy

Python package for lexicon; Trie and DAWG implementation.
GNU General Public License v3.0

search_with_suffix function #4

Closed: slee7268 closed this issue 5 years ago

slee7268 commented 5 years ago

Do you think there's a way to implement a search_with_suffix function that looks for words in the DAWG that end with a given suffix? Also, is there a way to search the DAWG for words that contain a substring? For instance, if I wanted words that contained the substring "ST", the function would return "first", "star", and "sophisticated". Thanks!

aosingh commented 5 years ago

Hi @slee7268

For the substring use case, would the search() method work? I tested the following; let me know if it does what you need:

from lexpy.dawg import DAWG
dawg = DAWG()
words = ["start", "first", "star", "thirst", "saturday", "sophisticate"]
dawg.add_all(words)
dawg.search("*st*")  # wildcard search: any word containing "st"
# Output: ['star', 'start', 'first', 'sophisticate', 'thirst']
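
As a quick sanity check on that result, here is a plain-Python sketch that does not use lexpy at all. It assumes `*` stands for zero or more characters, which is consistent with the output above, so a word should match "*st*" exactly when it contains "st". The name `expected` is only for illustration.

```python
# Plain-Python cross-check; this does not call lexpy.
words = ["start", "first", "star", "thirst", "saturday", "sophisticate"]

# Assumption: "*" matches zero or more characters, so "*st*" should match
# exactly the words that contain the substring "st".
expected = sorted(w for w in words if "st" in w)

print(expected)  # ['first', 'sophisticate', 'star', 'start', 'thirst']
```

Comparing sorted(dawg.search("*st*")) with `expected` should agree if that assumption holds; "saturday" is left out because it has no "st" in it.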

aosingh commented 5 years ago

Also, for the search_with_suffix use case, a wildcard pattern should work. Let me know if this is not what you want.

from lexpy.dawg import DAWG
dawg = DAWG()
words = ["foobar", "crowbar", "metalbar", "coolbar", "bars"]
dawg.add_all(words)
dawg.search("*bar")  # leading wildcard: any word ending in "bar"
# Output: ['coolbar', 'crowbar', 'foobar', 'metalbar']
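
If you would rather have named functions than write the patterns by hand, the two cases can be wrapped in thin helpers. This is only a sketch: `search_with_suffix`, `search_with_substring`, `SupportsSearch` and `_check_literal` are hypothetical names that are not part of lexpy, and the only thing assumed about the lexicon is a `search(pattern)` method like the one used above.

```python
from typing import List, Protocol


class SupportsSearch(Protocol):
    """Any object with a lexpy-style search(pattern) method, e.g. a DAWG."""

    def search(self, pattern: str) -> List[str]: ...


def _check_literal(fragment: str) -> None:
    # '*' and '?' are treated here as wildcard characters, so a fragment that
    # contains them would not be matched as literal text. Reject it instead.
    if not fragment or any(ch in fragment for ch in "*?"):
        raise ValueError("fragment must be non-empty and contain no '*' or '?'")


def search_with_suffix(lexicon: SupportsSearch, suffix: str) -> List[str]:
    """Hypothetical helper: words ending in `suffix`, via the pattern '*suffix'."""
    _check_literal(suffix)
    return lexicon.search("*" + suffix)


def search_with_substring(lexicon: SupportsSearch, fragment: str) -> List[str]:
    """Hypothetical helper: words containing `fragment`, via '*fragment*'."""
    _check_literal(fragment)
    return lexicon.search("*" + fragment + "*")
```

With the dawg built above, search_with_suffix(dawg, "bar") would then be the same call as dawg.search("*bar"). The helpers add no matching logic of their own; they only build the pattern and refuse fragments that contain wildcard characters.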

slee7268 commented 5 years ago

Thanks, that makes sense! Lexpy is awesome!!

aosingh commented 5 years ago

Glad it helped :)