Running the "Write unit tests" command with a local Llama 2 model, I get an error message because of the default max_tokens param value of 1000:
llama_predict: error: prompt is too long (1133 tokens, max 1020)
I would like to be able to set the context window size of the model (Llama 2 is 4096 tokens). This way the max_tokens param value could be calculated automatically using the llama-tokenizer-js lib:
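Something along these lines (just a sketch, not the extension's actual code; the CONTEXT_WINDOW constant and the maxTokensForPrompt helper are hypothetical names, and I'm assuming llama-tokenizer-js's default export with an encode() method):

```ts
// Sketch: derive max_tokens from a configurable context window
// instead of the hard-coded default of 1000.
import llamaTokenizer from "llama-tokenizer-js";

const CONTEXT_WINDOW = 4096; // Llama 2; would come from a new setting

function maxTokensForPrompt(prompt: string): number {
  // Count the prompt tokens with llama-tokenizer-js (local Llama models only)
  const promptTokens = llamaTokenizer.encode(prompt).length;
  // Whatever is left in the context window is available for the completion
  return Math.max(0, CONTEXT_WINDOW - promptTokens);
}
```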
[Edit]: it would need a different tokenizer for the OpenAI models; this one is for local models.