hippalectryon-0 closed this issue 1 year ago
+1 I'd like this as well
For now I've made my own eval method (taking inspiration from the call method): https://github.com/marella/ctransformers/blob/d05a4d0702c72c028870e4fe5d4f37bf73d7b243/ctransformers/llm.py#L263
Something like this is what I'm doing:
tokens = self.tokenize(prompt)
stop = genkwargs.pop("stop", None) or []
if isinstance(stop, str):
    stop = [stop]
# max_new_tokens was implicit in my original snippet; assume it comes from genkwargs.
max_new_tokens = genkwargs.pop("max_new_tokens", 256)
# Pre-tokenize each stop string so suffix matching works on token ids.
end_ids = [self.model.tokenize(x) for x in stop]

def should_stop(response_tokens):
    for end in end_ids:
        # Guard on length: zip() truncates, so without this check an
        # empty or too-short response would match any stop sequence.
        if len(response_tokens) >= len(end) and all(
            x == y for x, y in zip(response_tokens[-len(end):], end)
        ):
            return True
    if len(response_tokens) >= max_new_tokens:
        return True
    return False

response = []
for token in self.generate(tokens, **genkwargs):
    response.append(token)
    if should_stop(response):
        break
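One caveat with this workaround: when a stop sequence matches, its tokens are still in response. A minimal sketch of trimming them after decoding, assuming detokenize from the same LLM class joins token ids back into text:

# Decode the collected tokens, then strip a trailing stop string if present.
text = self.detokenize(response)
for s in stop:
    if text.endswith(s):
        text = text[: -len(s)]
        break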
I agree with @hippalectryon-0. It'd be nice if it were built in.
Thanks for the suggestion. I will add it in the next release.
Added the stop option in the latest release 0.1.1.
In the core library, you can use:
llm = AutoModelForCausalLM.from_pretrained(...)
llm(prompt, stop=['foo', 'bar'])
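As a sketch, this combines stop with the stream=True flag shown in the ctransformers README (the model path here is a placeholder):

from ctransformers import AutoModelForCausalLM

# Placeholder path; point this at a real GGML model file.
llm = AutoModelForCausalLM.from_pretrained('/path/to/ggml-model.bin')

# Generation halts as soon as 'foo' or 'bar' is produced.
for text in llm('AI is going to', stop=['foo', 'bar'], stream=True):
    print(text, end='', flush=True)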
In LangChain, you can use:
config = {'stop': ['foo', 'bar']}
llm = CTransformers(..., config=config)
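For example, a minimal end-to-end sketch (the model path, stop strings, and prompt are placeholders):

from langchain.llms import CTransformers

# Config keys are forwarded to ctransformers.
config = {'stop': ['\nQuestion:', '\n\n']}
llm = CTransformers(model='/path/to/ggml-model.bin', config=config)

# Generation stops when either stop string appears in the output.
print(llm('Question: What is the capital of France?\nAnswer:'))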
@hippalectryon-0 regarding the issue referenced above: if by streaming you mean the callback API, then it is already supported:
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
llm = CTransformers(..., callbacks=[StreamingStdOutCallbackHandler()])
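And a minimal sketch of a custom handler, if printing to stdout isn't the shape you need (CollectTokensHandler is a hypothetical name; on_llm_new_token is the LangChain callback hook invoked once per streamed token):

from langchain.callbacks.base import BaseCallbackHandler
from langchain.llms import CTransformers

class CollectTokensHandler(BaseCallbackHandler):
    # Hypothetical handler: buffers streamed tokens instead of printing them.
    def __init__(self):
        self.tokens = []

    def on_llm_new_token(self, token: str, **kwargs) -> None:
        self.tokens.append(token)

handler = CollectTokensHandler()
llm = CTransformers(model='/path/to/ggml-model.bin', callbacks=[handler])
llm('AI is going to')
print(''.join(handler.tokens))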
Please feel free to open another issue if you are looking for a different kind of stream API.
(In the README, at least) the config passed to CTransformers doesn't accept stop strings, which is a common feature.