Closed chillatom closed 2 months ago
In my experience, this happens when the LLM isn't given enough room to answer fully. For example, if you constrain an LLM to only 256 output tokens, it will usually produce only a partial answer whenever code is involved. Asking Claude for a full implementation of the SURF (Speeded-Up Robust Features) algorithm takes about 1302 tokens.
Cody will, of course, truncate the answer mid-stream. If you want to stay within a token limit, prepend a hidden instruction to the user's prompt telling the model to answer briefly. Early termination can also happen if the model emits certain special characters or a stop_token.
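One way to handle this gracefully on the client side is to inspect why the model stopped. A minimal sketch, assuming the response exposes a stop-reason field as in the Anthropic Messages API (where `"max_tokens"` signals a cut-off answer); the handler names and the simulated response dicts below are hypothetical, and no real API call is made:

```python
# Hypothetical client-side handling of early termination.
# Assumes the response carries a "stop_reason" field in the style of the
# Anthropic Messages API ("max_tokens" means the output was cut off).

def is_truncated(response: dict) -> bool:
    """True when the model stopped because it ran out of output tokens."""
    return response.get("stop_reason") == "max_tokens"

def next_action(response: dict) -> str:
    """Decide how a client like Cody might react to the response."""
    if is_truncated(response):
        # Offer the user a "continue" option instead of silently
        # rendering a cut-off answer as if it were complete.
        return "offer_continue"
    return "render_complete"

# Simulated responses -- placeholders, not real API output.
complete = {"stop_reason": "end_turn", "content": "full answer..."}
cut_off = {"stop_reason": "max_tokens", "content": "partial answ"}

print(next_action(complete))  # render_complete
print(next_action(cut_off))   # offer_continue
```

With a check like this, the UI can surface a "continue" button (or re-prompt with the partial output as context) rather than presenting a truncated thought as final.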
This issue is marked as stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed automatically in 5 days.
Version
N/A
Describe the bug
From Slack; from Discord
Let's investigate and roll any remediation into the output sequence epic
Reports that Opus terminates without fully completing its "thought".
Expected behavior
Cody gracefully handles early terminations or allows user to continue chat
Additional context
No response