There were many places where we depended on the string literal `"cannot answer"` in the `qa` prompt, mainly for the environment being done (prior to https://github.com/Future-House/paper-qa/pull/684) or for the `answer` being considered unsure.
This environment check of `"cannot answer"` also has some downsides:

- It couples environment functionality to a caller-specified `qa` prompt
- We don't validate that `"cannot answer"` is present in the `qa` prompt
- It adds statefulness to our environment (a check for a string literal in `answer`)
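
To make the coupling concrete, here is a minimal sketch of this kind of substring check. The names `CANNOT_ANSWER_PHRASE` and `looks_unsure` are hypothetical and are not taken from the paper-qa codebase; the point is only that the check silently stops working once a caller customizes the prompt.

```python
# Hypothetical sentinel; in practice the phrase lives in a caller-specified prompt.
CANNOT_ANSWER_PHRASE = "cannot answer"


def looks_unsure(answer: str) -> bool:
    """Illustrative substring check, not the library's actual code."""
    return CANNOT_ANSWER_PHRASE in answer.lower()


# An answer produced under a prompt that tells the model to use the phrase.
default_style = "I cannot answer this question from the given context."
# An answer produced under a customized prompt that never mentions the phrase.
custom_style = "There is insufficient evidence to respond."

print(looks_unsure(default_style))  # True
print(looks_unsure(custom_style))  # False: the unsure answer goes undetected
```

Nothing in a check like this verifies that the prompt actually asks for the phrase, which is the second downside listed above.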
So, what is "unsure"? Really it should be:

- The `gen_answer` tool call updates the answer
- Given the answer, the agent (and not the environment's `gen_answer` tool) decides if an answer was successful
- If not successful, the agent keeps trying to get better evidence until it gives up
To resolve this, we moved the unsure call directly to the `complete` tool. Now:

- When unsure: the agent just keeps running
- When finally sure: the agent calls `complete(has_successful_answer=True)`
- If giving up: the agent calls `complete(has_successful_answer=False)`
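
The control flow above can be sketched roughly as follows. This is a toy model under stated assumptions, not the paper-qa implementation: `SessionState`, `run_agent`, the `is_good` judge, and `max_attempts` are hypothetical, and only the tool names `gen_answer` and `complete` and the `has_successful_answer` argument come from this description.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class SessionState:
    """Hypothetical stand-in for the environment state."""

    answer: str = ""
    done: bool = False
    has_successful_answer: Optional[bool] = None


def gen_answer(state: SessionState, draft: str) -> None:
    # The tool only updates the answer; it makes no sure/unsure judgment.
    state.answer = draft


def complete(state: SessionState, has_successful_answer: bool) -> None:
    # The agent reports its own verdict when it terminates the episode.
    state.has_successful_answer = has_successful_answer
    state.done = True


def run_agent(
    drafts: List[str], is_good: Callable[[str], bool], max_attempts: int = 3
) -> SessionState:
    """Toy agent loop: keep trying while unsure, then complete either way."""
    state = SessionState()
    for draft in drafts[:max_attempts]:
        gen_answer(state, draft)
        if is_good(state.answer):  # finally sure
            complete(state, has_successful_answer=True)
            return state
        # Unsure: just keep running (a real agent would gather more evidence).
    complete(state, has_successful_answer=False)  # giving up
    return state
```

The design point is that the success decision lives with the agent, so the environment no longer needs to inspect the answer text.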
The `"cannot answer"` check was mostly easy to remove, other than `AnswerSettings.wipe_context_on_answer_failure`, since we no longer have a way of checking for unsure within `gen_answer`.
Since the agent now controls unsureness, we needed to make a new tool, `reset`, which basically performs the use case of `wipe_context_on_answer_failure`.
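
As a rough illustration of what a `reset`-style tool could do, the sketch below clears the gathered evidence and the current answer so the agent can start over. The `EvidenceState` class and its fields are assumptions made for this example; the actual tool may wipe different state.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class EvidenceState:
    """Hypothetical state holding gathered contexts and the current answer."""

    contexts: List[str] = field(default_factory=list)
    answer: str = ""


def reset(state: EvidenceState) -> None:
    # Assumed behavior: wipe evidence and answer so the agent can retry cleanly.
    state.contexts.clear()
    state.answer = ""


state = EvidenceState(contexts=["irrelevant snippet"], answer="a weak draft")
reset(state)  # the agent, not the environment, decides when to call this
```

Because the agent invokes the tool explicitly, no automatic wipe needs to be triggered from inside `gen_answer`.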
After this PR, we have:
- Removed dependence on the `"cannot answer"` string literal
- Covered the use case of `AnswerSettings.wipe_context_on_answer_failure` with the new `reset` tool
- Removed the unsure check from the `gen_answer` tool