Closed flotos closed 11 months ago
The `max_seq_len` was not used when calling `generator.generate_simple()`, making it impossible to use prompts larger than 2048 tokens, for example on the new Llama 2, which has a 4096-token context size.
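To illustrate the reported oversight, here is a minimal sketch using hypothetical stand-in classes (not the actual ExLlama code): the buggy path hardcodes a 2048-token limit, while the fixed path reads the limit from the model config's `max_seq_len`.

```python
# Hypothetical stand-ins to show the pattern of the bug and the fix;
# names and signatures are illustrative, not ExLlama's real API.

class Config:
    def __init__(self, max_seq_len):
        self.max_seq_len = max_seq_len

class Generator:
    def __init__(self, config):
        self.config = config

    def generate_simple_buggy(self, prompt_len, max_new_tokens):
        # Bug: 2048 is hardcoded, so a larger configured context is ignored.
        limit = 2048
        return min(prompt_len + max_new_tokens, limit)

    def generate_simple_fixed(self, prompt_len, max_new_tokens):
        # Fix: respect the configured context size (e.g. 4096 for Llama 2).
        limit = self.config.max_seq_len
        return min(prompt_len + max_new_tokens, limit)

gen = Generator(Config(max_seq_len=4096))
print(gen.generate_simple_buggy(3000, 200))  # 2048: output capped too early
print(gen.generate_simple_fixed(3000, 200))  # 3200: full context usable
```

With a 3000-token prompt, the buggy version caps generation at the hardcoded 2048 even though the model supports 4096, while the fixed version allows the full 3200 tokens.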
Yep, this looks like an oversight.