aboSamoor / polyglot

Multilingual text (NLP) processing toolkit
http://polyglot-nlp.com

Polyglot entities do not match polyglot.sentences? #191

Open ghezalahmad opened 5 years ago

ghezalahmad commented 5 years ago

    from polyglot.text import Text

    file = open('input_raw.txt', 'r')
    input_file = file.read()
    file = Text(input_file, hint_language_code='fa')
    list_entity = []
    for sentence in file.sentences:
        for entity in sentence.entities:
            print(entity)
            list_entity.append(entity)

    print(list_entity)

    def check_sentence(entities_list, sentence):
        ## Check if string terms
        for term in entities_list:
            ## are in any of the entities
            ## Pop the term with [0]
            if any(entity[0] == term for entity in sentence.entities):
                pass
            else:
                return False
        return True

    sentence_number = 0  # Which sentence to check
    sentence = file.sentences[sentence_number]

    if check_sentence(entity_terms, sentence):
        print("Entity Terms " + str(list_entity[0]) +
              " are in the sentence. '" + str(sentence) + "'")
    else:
        print("Sentence '" + str(sentence) +
              "' doesn't contain terms" + str(list_entity[0]))

Input file: one sentence, "Ashraf Ghani is the president of Afghanistan".

Entity list: I-PER(['Ashraf', 'Ghani']), I-LOC(['Afghanistan']).

What do I want? For each entity in the entity list, I want to check whether it occurs in the first, second, or any other sentence, and print the result. The problem is that the entities and the sentences have different data formats, so the comparison never matches. Can you help me find what is wrong?
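
To make the goal concrete, here is a minimal sketch of the comparison I am after. It assumes each polyglot entity behaves like a list of word strings, as the I-PER(['Ashraf', 'Ghani']) output suggests. The helper names `entity_words` and `sentences_containing` are hypothetical, and the data is a plain-Python stand-in (the second sentence is made up) so that the snippet runs without polyglot or its models.

```python
def entity_words(entity):
    """Return the words of an entity as a tuple of plain strings."""
    return tuple(str(word) for word in entity)


def sentences_containing(entity, sentences_entities):
    """Return the indices of the sentences whose entities include `entity`."""
    target = entity_words(entity)
    return [
        index
        for index, entities in enumerate(sentences_entities)
        if any(entity_words(candidate) == target for candidate in entities)
    ]


# Stand-in data (assumption): one list of entities per sentence, each entity
# being a list of words. With polyglot this would presumably be built as
# [sentence.entities for sentence in text.sentences].
sentences_entities = [
    [["Ashraf", "Ghani"], ["Afghanistan"]],  # entities of sentence 0
    [["Kabul"]],                             # entities of a made-up sentence 1
]

for entity in (["Ashraf", "Ghani"], ["Afghanistan"], ["Kabul"]):
    print(entity, "found in sentences", sentences_containing(entity, sentences_entities))
```

Comparing whole word tuples, instead of only `entity[0]` against a term, puts both sides in the same format before the equality check.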