RTXteam / RTX

Software repo for Team Expander Agent (Oregon State U., Institute for Systems Biology, and Penn State U.)
https://arax.ncats.io/
MIT License

QuestionTranslator gets stuck #35

Closed edeutsch closed 6 years ago

edeutsch commented 6 years ago

The new QuestionTranslator gets wedged on this question:

What is the clinical outcome pathway of dicumarol for treatment of coagulation?

dkoslicki commented 6 years ago

Doesn't appear to get stuck in the actual translation of the question:

txltr = QuestionTranslator.QuestionTranslator()
txltr.find_question_parameters("What is the clinical outcome pathway of dicumarol for treatment of coagulation")
Out[591]: 
{'corpus_index': 2,
 'error_code': 'missing_term',
 'error_message': 'Sorry, I was unable to find the appropriate terms to answer your question. Missing term(s):\nA disease (eg. diphtheritic cystitis, pancreatic endocrine carcinoma, malaria, clear cell sarcoma, etc.)',
 'input_text': 'What is the clinical outcome pathway of dicumarol for treatment of coagulation',
 'terms': {'disease_name': None, 'drug_name': 'dicumarol'}}

It's just that coagulation isn't a disease, and COPs (clinical outcome pathways) need both a drug and a disease.

dkoslicki commented 6 years ago

@edeutsch Any updates for this issue?

edeutsch commented 6 years ago

You're right, this is definitely a response parsing issue, not a stuck issue. I haven't tried to work on the solution yet; I've been working on the response API. But I will try to fix it tonight.

edeutsch commented 6 years ago

@dkoslicki, so while trying to sleuth this out, I just tried this in the file QuestionTranslator.py, in main(). When I run:

$ python3 QuestionTranslator.py

with:

question = {"language": "English", "text": "What is the clinical outcome pathway of physostigmine for treatment of glaucoma"}

I get:

[{'terms': ['DOID:1686', 'physostigmine'], 'originalQuestion': 'What is the clinical outcome pathway of physostigmine for treatment of glaucoma', 'restatedQuestion': 'What is the clinical outcome pathway of CHEMBL94 for the treatment of glaucoma', 'knownQueryTypeId': 'Q2'}]

but when I run with:

question = {"language": "English", "text": "What is the clinical outcome pathway of dicumarol for treatment of coagulation"}

I get:

Traceback (most recent call last):
  File "QuestionTranslator.py", line 722, in <module>
    main()
  File "QuestionTranslator.py", line 716, in main
    res = txltr.translate(question)
  File "QuestionTranslator.py", line 581, in translate
    return self.format_answer(results_dict, logging=logging)
  File "QuestionTranslator.py", line 201, in format_answer
    "restatedQuestion": "%s" % self.restate_question(corpus_index, terms), "originalQuestion": input_text}]
  File "QuestionTranslator.py", line 139, in restate_question
    restated = "What is the clinical outcome pathway of %s for the treatment of %s" % (names2descrip[terms["drug_name"]], names2descrip[terms["disease_name"]])
KeyError: None

I am uncertain if this is the problem, but it seems like a problem? Or am I barking up the wrong tree?

dkoslicki commented 6 years ago

@edeutsch Yeah, I was incorrectly assuming a drug and a disease were given. Fix pushed to NewStdAPI branch.
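
For illustration, a minimal sketch of the kind of guard that avoids the `KeyError: None` in the traceback above. It assumes `names2descrip` is a plain dict keyed by term. The function name `restate_cop_question`, its signature, and the choice to return None are hypothetical and are not taken from the fix on the NewStdAPI branch.

```python
from typing import Dict, Optional


def restate_cop_question(terms: Dict[str, Optional[str]],
                         names2descrip: Dict[str, str]) -> Optional[str]:
    """Restate a clinical-outcome-pathway question, or return None.

    Hypothetical helper: when either term is missing it returns None
    instead of indexing the lookup table with None.
    """
    drug = terms.get("drug_name")
    disease = terms.get("disease_name")
    if drug is None or disease is None:
        return None
    # Fall back to the raw term when the lookup table has no entry for it.
    return ("What is the clinical outcome pathway of %s for the treatment of %s"
            % (names2descrip.get(drug, drug), names2descrip.get(disease, disease)))
```

With this shape, the caller has to decide what to show when no restated question is available, for example the translator's error message.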

edeutsch commented 6 years ago

This is still causing problems. So now the result is:

[{'originalQuestion': 'What is the clinical outcome pathway of dicumarol for treatment of coagulation', 'knownQueryTypeId': 'Q2', 'message': 'Sorry, I was unable to find the appropriate terms to answer your question. Missing term(s):\nA disease (eg. diphtheritic cystitis, pancreatic endocrine carcinoma, malaria, clear cell sarcoma, etc.)', 'restatedQuestion': 'What is the clinical outcome pathway of CHEMBL1466 for the treatment of ?'}]

The way my code currently works is: if there is a specified knownQueryTypeId, then the result gets forwarded to that query type handler. So I suggest that if the QuestionTranslator doesn't feel that there is enough information to forward to a QX handler, then knownQueryTypeId should be None/null.

Does that seem like a good rule? Or what should be the signal that the translated question should be forwarded to a QX handler?

Maybe we need to come up with a flow diagram for the QuestionTranslator

dkoslicki commented 6 years ago

Yeah, a flow diagram would be helpful. Maybe the condition for passing a question to a QX handler would be:

  1. known query type ID
  2. all terms not None

It would seem odd to set the query type ID to None when it's actually known (but that could be a workaround for the time being).

edeutsch commented 6 years ago

Okay, I suppose it makes sense to me to set the query type ID to None if there isn't enough information to send it to the reasoner, and only an error message should be displayed. But it's not black and white.

I wonder if there will ever be a case where a query type is ready to be sent to a reasoner with one of the terms set to None. Maybe some queries will have optional terms? I don't know, but seems possible maybe?

What is the clinical outcome pathway of physostigmine for treatment of glaucoma, excluding fructose metabolism?
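
If optional terms do turn up, one possible way to keep the "known query type ID and all terms not None" condition is to apply it only to the terms a query type requires. The sketch below assumes a per-query-type table of required terms. The `REQUIRED_TERMS` table, the `ready_for_handler` function, and the `excluded_pathway` term mentioned in the comment are hypothetical; only Q2 needing a drug and a disease comes from this thread.

```python
from typing import Dict, Optional, Set

# Hypothetical table of the terms each query type must have before forwarding.
# Only the Q2 entry (drug and disease) is grounded in this thread.
REQUIRED_TERMS: Dict[str, Set[str]] = {
    "Q2": {"drug_name", "disease_name"},
}


def ready_for_handler(query_type_id: Optional[str],
                      terms: Dict[str, Optional[str]]) -> bool:
    """Return True when the query type is known and no required term is None.

    Terms not listed as required (e.g. a hypothetical "excluded_pathway")
    may be None or absent without blocking the forward.
    """
    if query_type_id is None or query_type_id not in REQUIRED_TERMS:
        return False
    return all(terms.get(name) is not None
               for name in REQUIRED_TERMS[query_type_id])
```

This leaves knownQueryTypeId untouched when the type is actually known, at the cost of maintaining the table alongside the question templates.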

dkoslicki commented 6 years ago

Hmmm... good point! Maybe we just add a new key to the dict called something like OkToSend? Ugly, but at least obvious what it does.

edeutsch commented 6 years ago

While compiling the spreadsheet I did notice that if there is a message, then it is not ready to send. So I could change the logic to say: if there is a message, then halt and print the message rather than proceed. Does that seem like a good rule for now?
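
A rough sketch of that rule, assuming the translated result is a dict with the `message` and `knownQueryTypeId` keys shown earlier in this thread. The `dispatch` function, the `handlers` mapping, and the fallback text for an unknown query type are hypothetical and only illustrate the control flow.

```python
from typing import Any, Callable, Dict

Handler = Callable[[Dict[str, Any]], str]


def dispatch(result: Dict[str, Any], handlers: Dict[str, Handler]) -> str:
    """Return the translator's message if there is one, else call the handler."""
    message = result.get("message")
    if message:
        # A message means the question is not ready to send: halt here.
        return message
    query_type_id = result.get("knownQueryTypeId")
    handler = handlers.get(query_type_id)
    if handler is None:
        return "No handler for query type %r" % (query_type_id,)
    return handler(result)
```

The message check comes first, so a result that carries both a known query type ID and a message, like the dicumarol one above, is never forwarded.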

dkoslicki commented 6 years ago

@edeutsch Yeah, that seems like a good way to go!

dkoslicki commented 6 years ago

@edeutsch Closing this issue in favor of #52