bentoml / OpenLLM

Run any open-source LLM, such as Llama or Mistral, as an OpenAI-compatible API endpoint in the cloud.
https://bentoml.com
Apache License 2.0
10.16k stars · 642 forks

Cannot build bento from openllm model #408

Closed martinmr closed 1 year ago

martinmr commented 1 year ago

We have a makefile that invokes openllm and builds a bento, but it's no longer working. It fails with the following error:

# Build the BentoML service.
openllm build flan-t5 --model_id google/flan-t5-large --bento-version a56547cf3e0f2a5716b0ab8b63d72b0cb83967a5
Downloading (…)lve/main/config.json: 100% 662/662 [00:00<00:00, 3.81MB/s]
Downloading (…)okenizer_config.json: 100% 2.54k/2.54k [00:00<00:00, 16.1MB/s]
Downloading spiece.model: 100% 792k/792k [00:00<00:00, 250MB/s]
Downloading (…)/main/tokenizer.json: 100% 2.42M/2.42M [00:00<00:00, 57.0MB/s]
Downloading (…)cial_tokens_map.json: 100% 2.20k/2.20k [00:00<00:00, 15.0MB/s]
Downloading (…)neration_config.json: 100% 147/147 [00:00<00:00, 956kB/s]
Downloading model.safetensors: 100% 3.13G/3.13G [00:07<00:00, 398MB/s]
Fetching 7 files: 100% 7/7 [00:08<00:00, 1.16s/it]
Building Bento for 'flan-t5'
BentoML will not install Python to custom base images; ensure the base image 'public.ecr.aws/y5w8i4y6/bentoml/openllm:0.3.6' has Python installed.

 ██████╗ ██████╗ ███████╗███╗   ██╗██╗     ██╗     ███╗   ███╗
██╔═══██╗██╔══██╗██╔════╝████╗  ██║██║     ██║     ████╗ ████║
██║   ██║██████╔╝█████╗  ██╔██╗ ██║██║     ██║     ██╔████╔██║
██║   ██║██╔═══╝ ██╔══╝  ██║╚██╗██║██║     ██║     ██║╚██╔╝██║
╚██████╔╝██║     ███████╗██║ ╚████║███████╗███████╗██║ ╚═╝ ██║
 ╚═════╝ ╚═╝     ╚══════╝╚═╝  ╚═══╝╚══════╝╚══════╝╚═╝     ╚═╝

Successfully built Bento(tag="google--flan-t5-large-service:a56547cf3e0f2a5716b0ab8b63d72b0cb83967a5").
📖 Next steps:

* Push to BentoCloud with 'bentoml push':
        $ bentoml push google--flan-t5-large-service:a56547cf3e0f2a5716b0ab8b63d72b0cb83967a5

* Containerize your Bento with 'bentoml containerize':
        $ bentoml containerize google--flan-t5-large-service:a56547cf3e0f2a5716b0ab8b63d72b0cb83967a5 --opt progress=plain

        Tip: To enable additional BentoML features for 'containerize', use '--enable-features=FEATURE[,FEATURE]' [see 'bentoml containerize -h' for more advanced usage]

# Build the docker image.
bentoml containerize google-flan-t5-large-service:a56547cf3e0f2a5716b0ab8b63d72b0cb83967a5 --opt progress=plain --image-tag phiis:latest
Error: [bentoml-cli] `containerize` failed: Bento 'google-flan-t5-large-service:a56547cf3e0f2a5716b0ab8b63d72b0cb83967a5' is not found in BentoML store <osfs '/home/circleci/bentoml/bentos'>, you may need to run `bentoml models pull` first
make: *** [Makefile:15: build] Error 1

I tried adding `bentoml models pull` to the makefile, but it complains about a missing config file. I am not sure that was the right step anyway.

How do I get `bentoml containerize` to work again?
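For context, the relevant part of our makefile looks roughly like this (a simplified sketch; the tag follows the old single-dash naming, which is what now fails):

```make
# Sketch of our build pipeline; BENTO_VERSION is our git SHA.
BENTO_VERSION := a56547cf3e0f2a5716b0ab8b63d72b0cb83967a5

build:
	openllm build flan-t5 --model_id google/flan-t5-large --bento-version $(BENTO_VERSION)
	bentoml containerize google-flan-t5-large-service:$(BENTO_VERSION) --opt progress=plain --image-tag phiis:latest
```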

aarnphm commented 1 year ago

Are you running this on a Mac M1? If so, you need `--opt platform=linux/amd64`.

martinmr commented 1 year ago

No, I am running it in CircleCI. I'll check what platform that is.

martinmr commented 1 year ago

It's running on an image based on Ubuntu. Adding that argument didn't fix it.

martinmr commented 1 year ago

@aarnphm Found the issue: a dash was added to the image name. Instead of `google-flan-t5-large-service` as in the past, the image name is now `google--flan-t5-large-service` (an extra dash between `google` and `flan`). Was that intentional?
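For anyone else hitting this: the change appears to be that the Bento name is now derived from the full Hugging Face model id, with `/` replaced by `--`. A minimal Python sketch of the naming rule we observed (the function name and rule are inferred from the tags in the logs above, not taken from OpenLLM's actual code):

```python
def bento_service_name(model_id: str) -> str:
    """Inferred naming rule: replace '/' in the HF model id with '--'
    and append '-service'. This is a guess based on the observed tags,
    not OpenLLM's actual implementation."""
    return model_id.replace("/", "--") + "-service"

# Matches the tag OpenLLM 0.3.x printed for us:
print(bento_service_name("google/flan-t5-large"))  # google--flan-t5-large-service
```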

aarnphm commented 1 year ago

Yes, I changed this in 0.3. I believe I mentioned it somewhere in the changelog or the release notes.

martinmr commented 1 year ago

OK, I am good then.