machinalis / iepy

Information Extraction in Python

Rule Runner - Create Evidence #140

Open Ravi-Rao26 opened 6 years ago

Ravi-Rao26 commented 6 years ago

Hi,

I am trying to get the rule-based classifier working but am currently unable to do so.

When I run the command `python ./rules_verifier.py WhatisthePolicyNumberIssued?`, I get the output:

```
Matches for rule is_PolicyNumber
Nothing matched
```

When I run the command `python ./iepy_rules_runner.py`, the output is:

```
Relations not defined in the rule file.
```

Here is one sample rule that I have defined in the rules.py file, which I have placed in the /bin folder of the IEPY project and outside the bin folder as well.


```python
from refo import Question, Star, Any, Plus
from iepy.extraction.rules import rule, Token, Pos

Relation = "WhatisPolicyNumberIssued?"

@rule(True)
def is_PolicyNumber(Subject, Object):
    """
    POLICY NUMBER:1234
    """
    anything = Star(Any())
    return Subject + Token(":") + Object + anything
```

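For comparison, here is how I understand the rules-core example in the IEPY documentation lays out rules.py. The uppercase RELATION constant and the trimmed imports below are my reading of those docs rather than something confirmed here, and the relation string has to match a relation that actually exists in the IEPY instance:

```python
# Sketch of a rules.py laid out like the IEPY documentation's rules-core
# example. Assumptions: the rules runner reads a module-level RELATION
# constant, and "WhatisPolicyNumberIssued?" is a relation already defined
# in the IEPY instance/database.
from refo import Star, Any
from iepy.extraction.rules import rule, Token

RELATION = "WhatisPolicyNumberIssued?"

@rule(True)
def is_policy_number(Subject, Object):
    """
    Ex: POLICY NUMBER : 1234
    """
    anything = Star(Any())
    return Subject + Token(":") + Object + anything
```

I am not sure whether the lowercase Relation above versus the documented RELATION matters, but it might be related to the "Relations not defined in the rule file" message.
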
I have loaded one sample corpus document into the IEPY database and have run preprocess.py, and I have done the evidence labelling as well. However, if I check the corpus_iedocument table, the token column has this value: 'COMPANY', 'ABC', 'POLICY', 'NUMBER', ':', '1234', 'RENEWAL', 'OF', 'POLICY', '-', 'NEW', 'OWNER', 'NAME' ...

But still, the command `python ./rules_verifier.py WhatisPolicyNumberIssued?` gives the output "Nothing matched".
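
To make the expected match concrete, here is a small standalone sketch of the token pattern that the rule encodes, using refo directly. The Token predicate and the Subject/Object stand-ins are simplifications and assume refo's Predicate/Star/Any/search API; in IEPY the Subject and Object slots are filled from the labelled entity occurrences, not from literal tokens:

```python
# Standalone sketch of the token window the rule would have to match.
# Assumes refo's Predicate / Star / Any / search API; the Subject and
# Object stand-ins below are literal tokens, whereas IEPY fills those
# slots from labelled entity occurrences in the candidate evidence.
import refo
from refo import Predicate, Star, Any

tokens = ['COMPANY', 'ABC', 'POLICY', 'NUMBER', ':', '1234',
          'RENEWAL', 'OF', 'POLICY', '-', 'NEW', 'OWNER', 'NAME']

def Token(string):
    # Predicate matching a single token by exact string comparison.
    return Predicate(lambda tok: tok == string)

subject = Token("POLICY") + Token("NUMBER")   # stand-in for the Subject slot
obj = Token("1234")                           # stand-in for the Object slot
pattern = subject + Token(":") + obj + Star(Any())

print(refo.search(pattern, tokens) is not None)  # True if the pattern matches anywhere
```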

Also, can anyone please illustrate the use of rules_verifier.py with the create-evidence option? How does it work?

So far I have only seen the operation of the Active Learning core. Once it is trained, in order to predict the relations between entities in new, unseen text, the text has to be inserted into the database and preprocess.py has to be executed. Even then, if we run the trained active-learning classifier with the --no-questions option, it will not predict anything for this text, because there is no candidate evidence. Creating candidate evidence is a manual activity, done during the training phase when the defined entities are annotated/labelled. What I was thinking is that the creation of candidate evidence could be automated through rules_verifier.py, and the trained classifier could then be run on the annotated set of entities to predict whether the relation exists or not.
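
As a rough, self-contained sketch of what I mean: all the names here (Candidate, make_candidates, the lambda classifier) are illustrative stand-ins and not IEPY API, and the candidate generation is deliberately simplified.

```python
# Toy sketch of the proposed pipeline: rules produce (and pre-label)
# candidate evidence automatically, and only those candidates reach the
# trained classifier. None of these names are IEPY API.
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

@dataclass
class Candidate:
    subject: str
    obj: str
    tokens: List[str]

def make_candidates(tokens: List[str]) -> List[Candidate]:
    # Toy candidate generator: pair each token with the token two positions
    # later; a real implementation would pair the entity occurrences found
    # by preprocess.py instead of raw tokens.
    return [Candidate(tokens[i], tokens[i + 2], tokens)
            for i in range(len(tokens) - 2)]

def policy_number_rule(c: Candidate) -> bool:
    # Mirrors the rule above: Subject, then ":", then Object.
    i = c.tokens.index(c.subject)
    return (i + 2 < len(c.tokens)
            and c.tokens[i + 1] == ":"
            and c.tokens[i + 2] == c.obj)

def predict(candidates: List[Candidate],
            rule: Callable[[Candidate], bool],
            classifier: Callable[[Candidate], bool]) -> Dict[Tuple[str, str], bool]:
    # Only candidates that a rule pre-labels ever reach the trained classifier.
    return {(c.subject, c.obj): classifier(c) for c in candidates if rule(c)}

tokens = ['POLICY', 'NUMBER', ':', '1234']
# The lambda stands in for the trained active-learning classifier.
print(predict(make_candidates(tokens), policy_number_rule, lambda c: True))
# -> {('NUMBER', '1234'): True}
```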

Cheers, Ravi