Hello, would it be possible to add a "max new tokens" option in the settings to control the length of the LLM's response? Some OpenAI-compatible APIs impose a default limit, e.g., 400 tokens per response. If the value can be specified explicitly, as with SillyTavern's "Response (tokens)" setting set to 2000 tokens, the response length matches the configured limit and is not cut off.
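For reference, OpenAI-compatible servers generally read this cap from the `max_tokens` field of the chat completions request. A minimal sketch of what the setting would map to on the wire (the base URL, API key, and model name below are placeholders, not values from this project):

```python
# Minimal sketch: passing a response-length cap to an OpenAI-compatible endpoint.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:5000/v1",  # placeholder: any OpenAI-compatible server
    api_key="sk-placeholder",             # placeholder key
)

response = client.chat.completions.create(
    model="local-model",  # placeholder model name
    messages=[{"role": "user", "content": "Write a long story."}],
    max_tokens=2000,      # the "max new tokens" cap requested here
)
print(response.choices[0].message.content)
```

Without `max_tokens` in the request, the server falls back to its own default, which is how responses end up truncated at e.g. 400 tokens.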