kramar42 / kp-telegram-bot


Split text into tokens for bomb #4

Open eluppol opened 5 years ago

rbietin commented 5 years ago

While we are doing this, how about we redesign the bomb string matching logic? There are several ways to implement it:

  1. Full string match.
  2. Prefix match (startswith).
  3. Fixed Levenshtein distance. Say, whenever the distance is under 2, we treat the words as matching.
  4. Relative Levenshtein distance. Say, whenever the distance is under 7% of the bomb word's length, we treat the words as matching.

All of the scenarios above assume that we first apply tolower and fix #5.
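To make the four options easier to compare, here is a minimal Python sketch of each one. It is not the bot's actual code: the function names, the `max_distance` and `max_ratio` parameters, and the choice to check whether the token starts with the bomb word (rather than the reverse) are all assumptions for illustration. "Under 2" and "under 7%" are read as strict inequalities.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance (insert, delete, substitute) via row-by-row dynamic programming."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def full_match(token: str, bomb: str) -> bool:
    # Option 1: exact match after lowercasing.
    return token.lower() == bomb.lower()


def prefix_match(token: str, bomb: str) -> bool:
    # Option 2: assumes the token should start with the bomb word.
    return token.lower().startswith(bomb.lower())


def fixed_distance_match(token: str, bomb: str, max_distance: int = 2) -> bool:
    # Option 3: distance strictly under a fixed bound (hypothetical default of 2).
    return levenshtein(token.lower(), bomb.lower()) < max_distance


def relative_distance_match(token: str, bomb: str, max_ratio: float = 0.07) -> bool:
    # Option 4: distance strictly under a fraction of the bomb word's length.
    return levenshtein(token.lower(), bomb.lower()) < max_ratio * len(bomb)
```

One consequence of the strict reading: with a 7% threshold, a single edit is tolerated only when the bomb word has 15 or more characters, so for short words option 4 behaves like a full match.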

eluppol commented 5 years ago

We are already doing tolower, and we decided to try https://pypi.org/project/pymystem3/
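A rough sketch of how tokenizing plus lemmatizing could fit together, assuming bomb words are compared against a set of normalized tokens. The helper names (`simple_tokens`, `mystem_tokens`, `find_bombs`) are hypothetical and not taken from the bot. The only pymystem3 call used is `Mystem().lemmatize(text)`, which returns a list of lemmas interleaved with whitespace and punctuation chunks. The import is done lazily so the rest of the code runs without the package installed.

```python
import re
from typing import Callable, Iterable, List, Set


def simple_tokens(text: str) -> List[str]:
    """Fallback tokenizer: lowercase, then split on non-word characters."""
    return re.findall(r"\w+", text.lower())


def mystem_tokens(text: str) -> List[str]:
    """Lemmatize with pymystem3; requires the package and its mystem binary."""
    from pymystem3 import Mystem  # lazy import: optional dependency

    # A real bot would likely create one Mystem instance and reuse it.
    lemmas = Mystem().lemmatize(text.lower())
    # Drop the whitespace and punctuation chunks that lemmatize() also returns.
    return [lemma.strip() for lemma in lemmas if any(ch.isalnum() for ch in lemma)]


def find_bombs(text: str,
               bombs: Iterable[str],
               tokenize: Callable[[str], List[str]] = simple_tokens) -> Set[str]:
    """Return the bomb words that appear among the tokens of the text."""
    tokens = set(tokenize(text))
    return tokens & {bomb.lower() for bomb in bombs}
```

With this shape, switching to lemmatized matching would be a matter of calling `find_bombs(text, bombs, tokenize=mystem_tokens)`, provided the bomb words themselves are stored in lemma form.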

rbietin commented 5 years ago

sgtm