Closed: nicpopovic closed this issue 1 year ago
COMPLETION_MAX_PROMPT is just a safety measure to handle worst-case scenarios. Since COMPLETION_MAX_TOKENS can only bound the number of newly generated tokens, another mechanism is needed to prevent the input side from being maliciously exploited.

We recommend setting COMPLETION_MAX_PROMPT to a large value, even up to the maximum length the model can effectively handle. That is also why we think a warning is not strictly necessary.
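A minimal sketch of how such a character-based cap could be enforced, with a warning on truncation (the helper name and warning behavior here are hypothetical illustrations; only the COMPLETION_MAX_PROMPT environment variable comes from the project):

```python
import os
import warnings

def truncate_prompt(prompt: str) -> str:
    """Cap the prompt at COMPLETION_MAX_PROMPT characters.

    Hypothetical helper for illustration; the project's actual
    implementation may differ. Emits a warning when truncating so
    silently degraded output (as reported above) is easier to spot.
    """
    max_chars = int(os.environ.get("COMPLETION_MAX_PROMPT", "4096"))
    if len(prompt) > max_chars:
        warnings.warn(
            f"Prompt truncated from {len(prompt)} to {max_chars} characters"
        )
        return prompt[:max_chars]
    return prompt
```

A Python `warning` keeps the HTTP response shape unchanged, so it would not affect OpenAI API compatibility the way an error in the response body might.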
Makes sense, though I would suggest reconsidering adding a warning: With the default value of 4096, an example prompt I was using was being truncated to approx. 700 tokens, which is not noticeable if you are not echoing prompt tokens. The only way to notice, of course, is that suddenly the generated output is worse than expected.
One quick way of adding some level of transparency could be to add a character count to the playground input field, although this does not solve this issue for the API.
Perhaps a workaround is to increase the default to a large value (e.g. 16384?), which can avoid surprising users.
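For intuition on why a character-based default of 4096 bites earlier than expected: English text averages roughly 4 to 6 characters per token (a common rule of thumb, not a tokenizer measurement), so a 4096-character cap corresponds to only about 700 to 1000 tokens, consistent with the ~700-token truncation observed above:

```python
# Back-of-the-envelope estimate; real token counts depend on the tokenizer.
max_chars = 4096

low_estimate = max_chars // 6    # pessimistic: ~6 chars per token
high_estimate = max_chars // 4   # optimistic: ~4 chars per token

print(low_estimate, high_estimate)  # 682 1024
```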
Emitting warnings or throwing errors may break compatibility with the OpenAI API, as their documentation does not spell out the behavior for such corner cases. We have to try them out manually, which can be quite expensive.
Choosing a larger default value also aligns with the overall design of the other environment variables, such as allowing CORS by default and listening on 0.0.0.0. Users who are concerned about overly permissive defaults can set stricter options, while most people can use the tool out of the box.
Also, we will thoroughly revamp the playground interface soon. The current implementation, based on EventSource, also limits the prompt length because the prompt must be URL-encoded into the request URL. We will switch to POST requests, which is particularly important for adding the chat interface.
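To illustrate the URL-encoding overhead (a standalone sketch, not the project's code): EventSource only supports GET, so the prompt has to be percent-encoded into the query string, which inflates it well beyond its raw character count, while browsers and servers commonly cap URL length at a few kilobytes:

```python
from urllib.parse import quote

prompt = "Hello, world! " * 100   # 1400 characters of plain text
encoded = quote(prompt)

# Spaces and punctuation become %XX escapes, so the URL payload is
# substantially larger than the raw prompt text.
print(len(prompt), len(encoded))  # 1400 2200
```

A POST body carries the prompt verbatim with no such inflation and no practical length cap, which is why switching away from EventSource removes this limit.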
Sounds good, thanks for all your work on this project, I'm really liking it :)
Hi, I noticed that COMPLETION_MAX_PROMPT is defined via the length of the prompt in characters, rather than tokens, and was wondering if this is intended? If intentional, it may be worth clarifying somewhere (as currently the default value is identical to that of COMPLETION_MAX_TOKENS) and/or adding a warning when a prompt is being truncated.