bentoml / OpenLLM

Run any open-source LLMs, such as Llama and Mistral, as an OpenAI-compatible API endpoint in the cloud.
https://bentoml.com
Apache License 2.0

bug: Incorrect return type for Dolly-v2 model #410

Closed · ABHISHEK03312 closed this issue 1 year ago

ABHISHEK03312 commented 1 year ago

Describe the bug

In the dolly_v2 configuration, the return statement looks up the key "generated_text" in the first element of the result. However, no such key exists, since the returned result is a plain string.

What seems to work is changing the return statement in `configuration_dolly_v2.py` from `return generation_result[0]['generated_text']` to `return generation_result[0]`.
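
To make the mismatch concrete, here is a minimal sketch; the literal values are illustrative, and only the `generation_result` variable and the `'generated_text'` key come from the code in question:

```python
# Shape the original return statement assumes: a list of dicts,
# as Hugging Face text-generation pipelines commonly return.
generation_result = [{"generated_text": "Hello world"}]
print(generation_result[0]["generated_text"])  # -> Hello world

# Shape actually observed here: a list of plain strings.
generation_result = ["Hello world"]
print(generation_result[0])  # -> Hello world (the proposed fix)

# Indexing a string with a key fails, which is the reported bug:
# generation_result[0]["generated_text"]
# TypeError: string indices must be integers
```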

To reproduce

No response

Logs

No response

Environment

bentoml, openllm, requests, python=3.10

System information (Optional)

No response

aarnphm commented 1 year ago

I have now unified the generation logic, and dolly-v2 should yield text the same way the gpt-neox model does.

For reference, I recommend using Mistral-based models from now on, with the following:

```python
import asyncio
import openllm

llm = openllm.LLM('HuggingFaceH4/zephyr-7b-alpha')

async def main():
    # generate() is a coroutine, so in a plain script it must be awaited
    # inside an event loop (top-level await works in Jupyter/IPython).
    print(await llm.generate("The time in San Francisco is"))

asyncio.run(main())
```
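
As a side note (not part of the original reply): per the project description, the same model can also be served as an OpenAI-compatible endpoint from the CLI, e.g. `openllm start HuggingFaceH4/zephyr-7b-alpha`; the exact command shape may vary between OpenLLM versions.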