vi3k6i5 / flashtext

Extract Keywords from sentence or Replace keywords in sentences.
MIT License

Trigger multiple entries by same keyword? #56

Closed easonnie closed 6 years ago

easonnie commented 6 years ago

I was trying to use a keyword dict like this:

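from flashtext import KeywordProcessor
keyword_processor = KeywordProcessor()
keyword_dict = {
    "java": ["java_2e", "java programing"],
    "product management": ["PM", "java_2e", "product manager"]
}
# Register the dict with the processor: {'clean_name': ['list of keywords']}
keyword_processor.add_keywords_from_dict(keyword_dict)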

I thought the keyword "java_2e" would trigger both "java" and "product management".

However, the output for the following code is:

keyword_processor.extract_keywords('I am a programmer for a java_2e platform')

Output:

['product management']

Expected output:

['java', 'product management']

It looks as though only one of the two entries registered for "java_2e" is kept. What is the correct way to trigger multiple entries with the same keyword?
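The reported output is what you would expect if each keyword can hold only one clean name, so that the entry registered last replaces the earlier one. The sketch below models that behaviour with a plain Python dict. It is not flashtext's actual code, and the helper name `build_lookup` is hypothetical; the one-clean-name-per-keyword assumption is inferred from the output above.

```python
def build_lookup(keyword_dict):
    """Invert {clean_name: [keywords]} into {keyword: clean_name}.

    Assumption: one clean name per keyword, as the reported output suggests.
    """
    lookup = {}
    for clean_name, keywords in keyword_dict.items():
        for keyword in keywords:
            # A dict holds one value per key, so a keyword listed under
            # several clean names keeps only the one assigned last.
            lookup[keyword] = clean_name
    return lookup


keyword_dict = {
    "java": ["java_2e", "java programing"],
    "product management": ["PM", "java_2e", "product manager"],
}
lookup = build_lookup(keyword_dict)
print(lookup["java_2e"])  # product management
```

Under this model "java_2e" ends up pointing at "product management" only, which matches the single-element result in the question.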

rmNULL commented 6 years ago

Here's a way to do it. keyword_processor.add_keyword('java_2e', ['java', 'product management', 'big word'])

EDIT: This is in the docs.
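To apply this suggestion to a whole `{clean_name: [keywords]}` dict, the dict can first be inverted so that every keyword carries the list of all clean names that mention it. The following is a minimal sketch under that assumption. The helper names `invert_keyword_dict` and `register_shared_keywords` are hypothetical, and the only flashtext call relied on is `add_keyword(keyword, clean_name)` as used in the comment above.

```python
from collections import defaultdict


def invert_keyword_dict(keyword_dict):
    """Map each keyword to every clean name that lists it, in dict order."""
    keyword_to_names = defaultdict(list)
    for clean_name, keywords in keyword_dict.items():
        for keyword in keywords:
            if clean_name not in keyword_to_names[keyword]:
                keyword_to_names[keyword].append(clean_name)
    return dict(keyword_to_names)


def register_shared_keywords(processor, keyword_dict):
    """Call processor.add_keyword(keyword, [clean names]) for each keyword.

    `processor` is any object with an add_keyword(keyword, clean_name)
    method, for example a flashtext KeywordProcessor.
    """
    for keyword, clean_names in invert_keyword_dict(keyword_dict).items():
        processor.add_keyword(keyword, clean_names)


keyword_dict = {
    "java": ["java_2e", "java programing"],
    "product management": ["PM", "java_2e", "product manager"],
}
keyword_to_names = invert_keyword_dict(keyword_dict)
print(keyword_to_names["java_2e"])  # ['java', 'product management']
```

With flashtext installed, `register_shared_keywords(KeywordProcessor(), keyword_dict)` would register "java_2e" with both clean names at once. Note that every match is then reported as a list, including keywords that belong to a single entry.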

easonnie commented 6 years ago

Thanks a lot!

leobeeson commented 2 years ago

Here's a way to do it. keyword_processor.add_keyword('java_2e', ['java', 'product management', 'big word'])

EDIT: This is in the docs.

Hi @rmNULL. Could you please share the link to the docs where I can find this example? I've looked here and here but can't find an add_keyword with a (str, List[str]) signature. Thanks.

rmNULL commented 2 years ago

Hi @leobeeson, that comment is outdated. I haven't used this software in a long time. According to the documentation, the current way to do this looks like this:

from flashtext import KeywordProcessor
keyword_processor = KeywordProcessor()
keyword_dict = {
    "java": ["java_2e", "java programing"],
    "product management": ["PM", "product manager"]
}
# {'clean_name': ['list of unclean names']}
keyword_processor.add_keywords_from_dict(keyword_dict)
# Or add keywords from a list:
keyword_processor.add_keywords_from_list(["java", "python"])
keyword_processor.extract_keywords('I am a product manager for a java_2e platform')
# output ['product management', 'java']

leobeeson commented 2 years ago

Thanks @rmNULL. Your original answer still works, though, particularly for the use case in the original question: triggering multiple entries with the same keyword (i.e. a one-to-many instance-to-entity mapping).

I tried your sample code and it still runs in flashtext 2.7, but I couldn't find an add_keyword with a (str, List[str]) signature (which allows a one-to-many instance-to-entity mapping) in the current documentation.

If we try to do this using the example in the current documentation, flashtext returns only one of the entries (the "clean_name") for the desired keyword (the "unclean name").

Once again, thanks!
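When clean names are registered as lists, as in the add_keyword approach discussed above, each match comes back as the stored list rather than as a string. To get the flat output the original question expected, the matches can be flattened afterwards. This is a small sketch, assuming the extraction result is a list whose items are either strings or lists/tuples of strings. The helper name `flatten_matches` is hypothetical and the sample `matches` value is hand-written, not real flashtext output.

```python
def flatten_matches(matches):
    """Flatten a list of matches into unique entry names, preserving order."""
    entries = []
    for match in matches:
        # A match is either a single clean name or a list/tuple of them.
        names = match if isinstance(match, (list, tuple)) else [match]
        for name in names:
            if name not in entries:
                entries.append(name)
    return entries


# Hand-written sample in the assumed shape of the extraction result.
matches = [["java", "product management"], "python"]
print(flatten_matches(matches))  # ['java', 'product management', 'python']
```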