turboderp / exllama

A more memory-efficient rewrite of the HF transformers implementation of Llama for use with quantized weights.
MIT License

Can max_seq_len be set via CLI or GUI in webui? #240

Closed int19h closed 1 year ago

turboderp commented 1 year ago

Yes, run it with -l 4096, or whatever length you need. Use -h for a list of available parameters. It can't be configured in the web UI at the moment because all the buffers that depend on sequence length are pre-allocated at initialization.
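
To illustrate why a pre-allocated buffer ties the sequence length to startup, here is a minimal sketch in plain Python. It is not exllama's actual code: the `PreallocatedCache` class, the `hidden_size` parameter, and the default of 2048 are assumptions made up for the example. Only the `-l` flag comes from the comment above.

```python
import argparse


class PreallocatedCache:
    """Hypothetical stand-in for buffers whose size depends on sequence length."""

    def __init__(self, max_seq_len, hidden_size=8):
        self.max_seq_len = max_seq_len
        # Allocated once, up front. The size is fixed for the object's lifetime,
        # so changing max_seq_len later would mean rebuilding everything.
        self.buffer = [[0.0] * hidden_size for _ in range(max_seq_len)]
        self.current_len = 0

    def append(self, row):
        # Writes go into existing slots; there is no room to grow past the limit.
        if self.current_len >= self.max_seq_len:
            raise ValueError("sequence length exceeds pre-allocated buffer")
        self.buffer[self.current_len] = list(row)
        self.current_len += 1


def parse_args(argv=None):
    parser = argparse.ArgumentParser()
    # "-l" mirrors the flag mentioned in the thread; the default is an assumption.
    parser.add_argument("-l", dest="max_seq_len", type=int, default=2048)
    return parser.parse_args(argv)


# Equivalent to launching a script with "-l 4096" on the command line.
args = parse_args(["-l", "4096"])
cache = PreallocatedCache(args.max_seq_len)
```

In this sketch the length is read once at startup and baked into the allocation, which is why a setting exposed later in a UI would have nothing to resize without re-initializing.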

int19h commented 1 year ago

Thank you!