Closed · vmajor closed this 1 year ago
Code:
import ctransformers
from transformers import AutoTokenizer

name = '/home/*****/models/mpt-30B-instruct-GGML/mpt-30b-instruct.ggmlv0.q8_0.bin'
#config = ctransformers.hub.AutoConfig(name)

# Load the GGML model, passing the generation settings as config kwargs
model = ctransformers.AutoModelForCausalLM.from_pretrained(
    name,
    model_type='mpt',
    top_k=40,
    top_p=0.1,
    temperature=0.7,
    repetition_penalty=1.18,
    last_n_tokens=64,
    seed=123,
    batch_size=64,
    context_length=8192,
    max_new_tokens=300
)

# Build the instruct-style prompt and run a single completion
context = "Below is an instruction that describes a task. Write a response that appropriately completes the request.\n\n###Instruction\n"
prompt = "Read this carefully, and reflect on your answer before you give it: David has three sisters. How many brothers does each sister have?\n"
output = "### Response\n"
formatted_prompt = context + prompt + output
print(model(formatted_prompt))
Never mind, this does not appear to be a ctransformers issue; my MPT model simply reacts to these settings differently from llama derivatives. It needs larger jumps in the values, and it seems most sensitive to top_p.
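For anyone who lands here with the same model, this is roughly how I compared settings. It is only a sketch that reuses the model and formatted_prompt from the code above, and the kwarg names come from my reading of the config table, so double-check them against the README:

# Re-run the same prompt with a few top_p values and compare the outputs.
# Kwarg names are assumed from the ctransformers config table; every other
# setting keeps the value passed to from_pretrained() above.
for top_p in (0.05, 0.3, 0.95):
    print(f"--- top_p={top_p} ---")
    print(model(formatted_prompt, top_p=top_p, seed=123))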
How do I specify model() parameters from the config documented here: https://github.com/marella/ctransformers#config?
Placing them inside model() does not raise an error, but they are ignored for my MPT model, i.e. I can enter whatever I want and the output does not change at all.
Am I supposed to pass them under kwargs as a list? Is there an example somewhere? (See the sketch after the edit below for what I am attempting.)
EDIT: I changed generate() to model() above. I had generate() on my mind from another question, but it is the model parameters that I am trying to set.
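For concreteness, this is the shape of what I am trying; both variants are based on my reading of the config table, so treat the exact kwarg names as assumptions on my part:

# Variant 1: set generation defaults once, as plain kwargs to from_pretrained()
llm = ctransformers.AutoModelForCausalLM.from_pretrained(
    name,
    model_type='mpt',
    temperature=0.7,
    top_p=0.1,
)

# Variant 2: pass the same names as plain keyword arguments (not a list or
# dict) directly to the model call, intending to override the defaults per call
print(llm(formatted_prompt, temperature=1.2, top_p=0.9, max_new_tokens=300))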