Yoctol / purewords

Create pure sentences

Filters should be invertible #37

Open SoluMilken opened 7 years ago

SoluMilken commented 7 years ago
import re

# Remember the original matches so the filter can be inverted later.
store_dict = {replacement: re.findall(pattern, sentence)}
filtered_sentence = re.sub(.....)
tokenized_sentence = tokenizer.lcut(filtered_sentence)
# Restore the originals, in order, wherever the placeholder token survives.
for idx, token in enumerate(tokenized_sentence):
    if token == replacement:
        tokenized_sentence[idx] = store_dict[replacement].pop(0)
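A self-contained sketch of the same idea, for anyone who wants to try it out. The function name `filter_and_restore`, the `tokenize` parameter (a stand-in for `tokenizer.lcut`), and the `_phone_` placeholder in the usage note are all hypothetical, and the call `re.sub(pattern, replacement, sentence)` is an assumption about what the elided arguments above would be.

```python
import re


def filter_and_restore(sentence, pattern, replacement, tokenize):
    """Replace matches with a placeholder, tokenize, then put the originals back.

    `tokenize` is any callable mapping a string to a list of tokens
    (hypothetical stand-in for `tokenizer.lcut`).
    """
    # Use finditer + group(0) so capturing groups in `pattern` do not change
    # what gets stored (re.findall would return the groups instead).
    stored = [m.group(0) for m in re.finditer(pattern, sentence)]
    filtered = re.sub(pattern, replacement, sentence)
    tokens = tokenize(filtered)

    # Walk the tokens and swap each placeholder for the next stored original.
    originals = iter(stored)
    restored = [
        # Fall back to the token itself if there are more placeholders than
        # stored matches (e.g. the input already contained the placeholder).
        next(originals, tok) if tok == replacement else tok
        for tok in tokens
    ]
    return filtered, tokens, restored
```

With whitespace splitting as the tokenizer, `filter_and_restore("call 0912345678 or 0223456789 now", r"\d+", "_phone_", str.split)` restores both numbers in their original positions. This only works while the tokenizer keeps the placeholder as a single token.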
SoluMilken commented 7 years ago

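Argh, still thinking about this.
If we want to work at the character level, how should a sentence like this be segmented? e.g. 我的電話號碼是phone ("My phone number is phone"). I really want to empty jieba's dictionary.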
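One possible way to handle the character-level case, offered only as a sketch and not as a decision for this issue: split the sentence around the known placeholders first, then break everything else into single characters. This sidesteps jieba entirely. The function name `char_tokenize` and the `placeholders` argument are hypothetical, and placeholders are assumed to be non-empty strings.

```python
import re


def char_tokenize(sentence, placeholders):
    """Split into single characters, but keep each placeholder as one token."""
    # Longest first, so a placeholder that is a prefix of another one
    # does not shadow the longer match.
    ordered = sorted(placeholders, key=len, reverse=True)
    alternation = "|".join(re.escape(p) for p in ordered)

    tokens = []
    # The capturing group makes re.split keep the placeholders in its output.
    for piece in re.split(f"({alternation})", sentence):
        if piece in placeholders:
            tokens.append(piece)
        else:
            # Everything else is emitted one character at a time,
            # skipping whitespace.
            tokens.extend(ch for ch in piece if not ch.isspace())
    return tokens
```

For the example sentence, `char_tokenize("我的電話號碼是phone", {"phone"})` yields the seven Chinese characters followed by the single token `phone`. A literal word `phone` in the input would be indistinguishable from the placeholder, so a less ambiguous placeholder string is probably needed.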