It is possible. I only included the models I tested and wrote the prompts for, but you can use any LLM model in these scripts.
You only need to replace these two lines:

```python
model_url = "link_to_llm_model"
model="models/model_name"
```
The first line is at the top of every script; the second is inside the CTransformers call assigned to the `llm` variable:
```python
from langchain_community.llms import CTransformers  # `from langchain.llms import CTransformers` on older LangChain versions

gpu_layers = 50  # example value: number of layers to offload to the GPU (0 = CPU only)

llm = CTransformers(
    model="models/model_name",  # here: the line to replace
    model_type="llama",
    gpu_layers=gpu_layers,
    config={
        "max_new_tokens": 1024,
        "repetition_penalty": 1.1,
        "top_k": 40,
        "top_p": 0.95,
        "temperature": 0.8,
        "context_length": 8192,
        "gpu_layers": gpu_layers,
        "stop": [  # strings that terminate generation
            "/s",
            "</s>",
            "<s>",
            "<|system|>",
            "<|assistant|>",
            "<|user|>",
            "<|char|>",
        ],
    },
)
```
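For example, here is a minimal sketch of the two replacements, assuming a hypothetical 13B GGUF build downloaded into the `models/` folder (the URL and file name below are placeholders, not files shipped with the repository):

```python
from langchain_community.llms import CTransformers

# Hypothetical values; substitute the model you actually downloaded.
model_url = "https://huggingface.co/TheBloke/Llama-2-13B-chat-GGUF"  # download page for the weights

llm = CTransformers(
    model="models/llama-2-13b-chat.Q4_K_M.gguf",  # local path to the downloaded GGUF file
    model_type="llama",
    config={"max_new_tokens": 256, "temperature": 0.8},
)
```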
I noticed the models inside the scripts end in `.gguf`. Does that matter, or can I use GPTQ or AWQ models too? And is just putting the folder name enough for it to know what to do?
In every script available in the repository, I use the CTransformers library for this. Here is the list of models their repository lists as supported (see the CTransformers README for the current list):

- GPT-2
- GPT-J, GPT4All-J
- GPT-NeoX, StableLM
- Falcon
- LLaMA, LLaMA 2
- MPT
- StarCoder, StarChat
- Dolly V2
- Replit
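As a quick way to check whether a given file loads, here is a minimal sketch using the ctransformers library directly; the directory, file name, and `gpu_layers` value are placeholders. Note that it points at the concrete `.gguf` file via `model_file`, not just the folder:

```python
from ctransformers import AutoModelForCausalLM

# Placeholder names; point these at the weights you actually have on disk.
llm = AutoModelForCausalLM.from_pretrained(
    "models",                                   # local directory holding the weights
    model_file="llama-2-13b-chat.Q4_K_M.gguf",  # the specific GGUF file to load
    model_type="llama",
    gpu_layers=50,                              # layers to offload to the GPU; 0 = CPU only
)

print(llm("Hello, how are you?"))
```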
Would it be possible to use better language models with this? I would prefer to use a 13B model, as they are far better than the 7B ones.