Removed `"cannot answer"` literals and added `reset` tool

There were many places where we depended on the string literal "cannot answer" in the qa prompt, mainly to decide whether the environment was done (prior to https://github.com/Future-House/paper-qa/pull/684) or whether the answer was considered unsure.

This environment check of "cannot answer" also has some downsides:

- A coupling of environment functionality to a caller-specified qa prompt
  - We don't validate that "cannot answer" is present in the qa prompt
- Added statefulness to our environment (checks for a string literal in answer)
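To make the coupling concrete, here is a minimal sketch of how a literal-based check can silently miss an unsure answer. The helper `is_unsure_by_literal` and the prompt text are hypothetical illustrations, not the actual paper-qa code.

```python
# Hypothetical illustration of the old coupling, not the real paper-qa check.
CANNOT_ANSWER_PHRASE = "cannot answer"


def is_unsure_by_literal(answer: str) -> bool:
    """Treat an answer as unsure only if it contains the magic literal."""
    return CANNOT_ANSWER_PHRASE in answer.lower()


# A caller-specified qa prompt that never tells the model to use the literal.
custom_qa_prompt = "Answer the question. If the context is insufficient, say 'I don't know'."

# The model follows the custom prompt, so the literal never appears
# and the environment wrongly treats the answer as a sure one.
followed_default_prompt = is_unsure_by_literal("I cannot answer this question.")
followed_custom_prompt = is_unsure_by_literal("I don't know.")
```

With the custom prompt the check returns `False` even though the model is plainly unsure, which is the failure mode the unvalidated prompt allows.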

So, what is "unsure"? Really it should be:

- The `gen_answer` tool call updates the answer
- Given the answer, the agent (and not the environment's `gen_answer` tool) decides whether the answer was successful
- If not successful, the agent keeps trying to get better evidence until it gives up

To resolve this, we moved the unsure decision directly into the `complete` tool. Now:

- When unsure: the agent just keeps running
- When finally sure: the agent calls `complete(has_successful_answer=True)`
- If giving up: the agent calls `complete(has_successful_answer=False)`
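The flow above could be sketched roughly as follows. `ToyEnvironment` and `ToySession` are hypothetical stand-ins for illustration only; the `gen_answer` and `complete` names and the `has_successful_answer` parameter come from this PR, but the bodies are assumptions, not the real implementation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToySession:
    """Hypothetical session state, not the real paper-qa state object."""

    answer: str = ""
    done: bool = False
    has_successful_answer: Optional[bool] = None


class ToyEnvironment:
    """Hypothetical environment exposing the two tools discussed above."""

    def __init__(self) -> None:
        self.session = ToySession()

    def gen_answer(self, answer: str) -> str:
        # Only updates the answer; it makes no judgment about sureness.
        self.session.answer = answer
        return answer

    def complete(self, has_successful_answer: bool) -> None:
        # The agent, not the environment, reports whether it succeeded.
        self.session.done = True
        self.session.has_successful_answer = has_successful_answer


env = ToyEnvironment()
env.gen_answer("Draft answer with weak evidence.")  # agent is unsure, keeps running
env.gen_answer("Answer backed by better evidence.")  # agent is now sure
env.complete(has_successful_answer=True)
```

Note that the episode ends only when the agent calls `complete`, so no string inspection of the answer is needed.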

The "cannot answer" check was mostly easy to remove, except for `AnswerSettings.wipe_context_on_answer_failure`, since we no longer have a way of checking for unsureness within `gen_answer`.

Since the agent now controls unsureness, we needed a new tool, `reset`, which covers the use case of `wipe_context_on_answer_failure`.
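As a rough sketch of the idea, a `reset` tool might simply clear the evidence gathered so far when the agent decides to start over. The `ToyState` class and its `contexts` field are hypothetical; exactly which fields the real tool clears is an assumption here.

```python
from dataclasses import dataclass, field


@dataclass
class ToyState:
    """Hypothetical agent-visible state holding gathered evidence."""

    contexts: list[str] = field(default_factory=list)


def reset(state: ToyState) -> None:
    """Wipe gathered contexts; called by the agent, not by gen_answer."""
    state.contexts.clear()


state = ToyState(contexts=["weak snippet A", "off-topic snippet B"])
reset(state)  # agent judged the evidence unhelpful and starts fresh
state.contexts.append("stronger snippet C")
```

Because the agent chooses when to call `reset`, the wipe is no longer triggered by a setting inside `gen_answer`.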

After this PR, we have:

- Removed the dependence on the "cannot answer" string literal
- Deprecated `AnswerSettings.wipe_context_on_answer_failure`
- The agent defines unsureness, not the output of the environment's `gen_answer` tool
- A "learnable" dimension: the agent controls wiping contexts

Future-House / paper-qa

Removed `"cannot answer"` literals and added `reset` tool #698