marella / ctransformers

Python bindings for the Transformer models implemented in C/C++ using GGML library.
MIT License
1.79k stars 135 forks

How to specify Maximum Context Length for my llm #144

Open Harri1703 opened 11 months ago

alifatmi commented 11 months ago

You can use this code to increase the maximum context length for your LLM:

```python
config = {'max_new_tokens': 256, 'repetition_penalty': 1.1, 'context_length': 1000}

llm = CTransformers(model='marella/gpt-2-ggml', config=config)
```

For more information you can check the links below:

https://python.langchain.com/docs/integrations/providers/ctransformers

https://github.com/marella/ctransformers#config
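For context, the snippet above fits into a complete script roughly as follows. This is a sketch, not the commenter's exact code: the `langchain_community` import path and the prompt string are assumptions, and the load call is commented out because it downloads the model from the Hugging Face Hub.

```python
# Config passed to the CTransformers LangChain wrapper, as in the answer above.
config = {
    "max_new_tokens": 256,      # cap on tokens generated per call
    "repetition_penalty": 1.1,  # penalize repeated tokens
    "context_length": 1000,     # maximum context window (prompt + generation)
}

# Assumed import path and usage (requires langchain-community and a network
# connection to fetch 'marella/gpt-2-ggml'):
# from langchain_community.llms import CTransformers
# llm = CTransformers(model="marella/gpt-2-ggml", config=config)
# print(llm.invoke("AI is going to"))

print(config["context_length"])
```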

sawradip commented 10 months ago

If you are loading directly from Huggingface:

```python
from ctransformers import AutoModelForCausalLM

llm = AutoModelForCausalLM.from_pretrained("TheBloke/zephyr-7B-beta-GGUF", gpu_layers=50)

llm.config.context_length = 8192
```
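Since the README's `#config` section lists `context_length` as a config parameter, it can presumably also be passed as a keyword at load time rather than set afterwards. A hedged sketch (the actual call is commented out because it downloads a multi-gigabyte GGUF file):

```python
# Keyword arguments forwarded to the model config at load time; the values
# mirror the answer above.
load_kwargs = {
    "gpu_layers": 50,        # layers to offload to the GPU
    "context_length": 8192,  # context window requested at load time
}

# Assumed usage (requires ctransformers and network access):
# from ctransformers import AutoModelForCausalLM
# llm = AutoModelForCausalLM.from_pretrained(
#     "TheBloke/zephyr-7B-beta-GGUF", **load_kwargs
# )

print(load_kwargs["context_length"])
```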