vi3k6i5 / flashtext

Extract Keywords from sentence or Replace keywords in sentences.
MIT License
5.57k stars 598 forks

Keyword not found - multi-word keywords #121

Closed: nfrancioso closed this issue 2 years ago

nfrancioso commented 3 years ago

A single-word keyword is not found when it is part of a multi-word keyword that is also in the keyword list. (I'm not sure how to explain this, or whether it makes sense, but please see the example below.) Thanks.

    from flashtext import KeywordProcessor

    keyword_processor = KeywordProcessor()
    keyword_processor.add_keywords_from_list(["Senior Management", "Program Management", "Management"])
    keywords_found = keyword_processor.extract_keywords('I am in a Senior Management as a program management specialist.')
    print(keywords_found)  ## ['Program Management', 'Senior Management']

'Management' should be found in the text but it is not.

lc-billyfung commented 3 years ago

My understanding of the algorithm is that it returns the longest match first.
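To illustrate what "longest match first" means here, below is a minimal pure-Python sketch. It is not flashtext's implementation: `extract_longest` is a hypothetical helper that matches on whitespace-separated words, assumes case-insensitive comparison, and strips only a few trailing punctuation marks. It only shows why a shorter keyword inside a longer match is never reported on its own.

```python
def extract_longest(text, keywords):
    """Toy longest-match-first extractor over whitespace-separated words.

    Simplified illustration only; NOT how flashtext is implemented.
    """
    # Map lowercased word tuples to the keyword's original spelling.
    lookup = {tuple(k.lower().split()): k for k in keywords}
    max_len = max((len(key) for key in lookup), default=0)
    # Crude tokenization: split on whitespace, drop common punctuation.
    tokens = [t.strip(".,;:!?").lower() for t in text.split()]

    found = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate span first, then shorter ones.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            span = tuple(tokens[i:i + n])
            if span in lookup:
                found.append(lookup[span])
                i += n  # consume the whole match; inner keywords are skipped
                break
        else:
            i += 1  # no keyword starts at this token
    return found


keywords = ["Senior Management", "Program Management", "Management"]
text = "I am in a Senior Management as a program management specialist."
result = extract_longest(text, keywords)
print(result)
```

In this sketch both occurrences of "management" are consumed as part of a two-word match, so "Management" is reported only when it appears outside a longer keyword.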

laomagic commented 2 years ago

    from flashtext import KeywordProcessor

    kp = KeywordProcessor()
    kp.add_keyword("手机")
    kp.add_keyword("苹果手机")
    kp.add_keyword("vivo")
    text = "手机vivo"
    words = kp.extract_keywords(text)
    print(words)  ## ['vivo']

nfrancioso commented 2 years ago

Not a bug. The longest match takes precedence and is the one returned.
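If the shorter keywords are also needed, one possible workaround is to post-process the list of matches. The sketch below is an assumption, not part of flashtext: `expand_nested` is a hypothetical helper that takes the matches already extracted and appends any other keyword whose words appear, in order, inside a match.

```python
def expand_nested(found, keywords):
    """Append keywords that occur as whole-word runs inside each match.

    Hypothetical post-processing helper; not part of flashtext.
    """
    split_keywords = [(k, k.lower().split()) for k in keywords]
    expanded = []
    for match in found:
        expanded.append(match)
        words = match.lower().split()
        for keyword, kw_words in split_keywords:
            # Skip empty keywords and the match itself.
            if not kw_words or kw_words == words:
                continue
            n = len(kw_words)
            # Check every window of n consecutive words in the match.
            if any(words[i:i + n] == kw_words for i in range(len(words) - n + 1)):
                expanded.append(keyword)
    return expanded


keywords = ["Senior Management", "Program Management", "Management"]
found = ["Senior Management", "Program Management"]
expanded = expand_nested(found, keywords)
print(expanded)
```

Each nested keyword is listed right after the match that contains it, so "Management" appears once per enclosing match.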