We should store the per-token logprobs alongside every response, so we can see where the model's certainty was low. The hope is that low-certainty spans correlate with hallucinations or other undesirable behaviour, in which case a recovery action can be triggered whenever one is detected.
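As a rough illustration of the idea, here is a minimal sketch in Python. Everything in it is an assumption made for the example rather than an existing design: the `StoredResponse` schema, the `low_confidence_spans` and `needs_recovery` helpers, the threshold of `log(0.2)`, and the minimum span length. It assumes the serving layer can return one natural-log probability per generated token. Whether the flagged spans really line up with hallucinations would still need to be checked empirically.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class StoredResponse:
    """A response stored together with its per-token logprobs (hypothetical schema)."""
    tokens: List[str]
    logprobs: List[float]  # natural-log probability of each generated token

    @property
    def text(self) -> str:
        return "".join(self.tokens)


def low_confidence_spans(
    resp: StoredResponse,
    threshold: float = math.log(0.2),  # assumed cutoff: token probability below 20%
    min_len: int = 1,                  # assumed minimum run length, in tokens
) -> List[Tuple[int, int]]:
    """Return (start, end) token index ranges, end exclusive, where every
    token's logprob is below `threshold` and the run has at least `min_len` tokens."""
    spans = []
    start = None
    for i, lp in enumerate(resp.logprobs):
        if lp < threshold:
            if start is None:
                start = i  # a low-confidence run begins here
        elif start is not None:
            if i - start >= min_len:
                spans.append((start, i))
            start = None
    # Close a run that extends to the end of the response.
    if start is not None and len(resp.logprobs) - start >= min_len:
        spans.append((start, len(resp.logprobs)))
    return spans


def needs_recovery(resp: StoredResponse, min_len: int = 1) -> bool:
    """Hypothetical trigger: request a recovery action if any low-confidence span exists."""
    return bool(low_confidence_spans(resp, min_len=min_len))


# Illustrative values only; these logprobs are made up for the example.
resp = StoredResponse(
    tokens=["The", " capital", " of", " Australia", " is", " Sydney", "."],
    logprobs=[-0.1, -0.2, -0.05, -0.3, -0.1, -2.5, -0.4],
)
for start, end in low_confidence_spans(resp):
    print("low confidence:", "".join(resp.tokens[start:end]))
```

What the recovery action should be (regenerate, ask for a citation, escalate to a human) is left open here. The sketch only shows where the trigger would sit once logprobs are stored with each response.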