```
Usage: openllm build [OPTIONS] {flan-t5|dolly-v2|chatglm|starcoder|falcon|stablelm|opt|mpt}

  Package a given model into a Bento.

  $ openllm build flan-t5 --model-id google/flan-t5-large

  > NOTE: To run a container built from this Bento with GPU support, make sure
  > to have https://github.com/NVIDIA/nvidia-container-toolkit installed locally.

Options:
  --model-id TEXT                 Optional model_id name or path for (fine-tuned) weights.
  -o, --output [json|pretty|porcelain]
                                  Output type to show.  [env var: OPENLLM_OUTPUT; default: pretty]
  --overwrite                     Overwrite the existing Bento for the given LLM if it already exists.
  --workers-per-resource FLOAT    Number of workers per resource assigned. See
                                  https://docs.bentoml.org/en/latest/guides/scheduling.html#resource-scheduling-strategy
                                  for more information. By default, this is set to 1.

                                  NOTE: The workers value passed into 'build' determines how the LLM can be
                                  provisioned in Kubernetes as well as in a standalone container. This ensures
                                  it has the same effect as 'openllm start --workers ...'.
  Optimisation options: [mutually_exclusive]
    --quantize [int8|int4|gptq]   Set quantization mode for serving in deployment.

                                  GPTQ is currently a work in progress and will be available soon.

                                  NOTE: Quantization is only available for PyTorch models.
    --bettertransformer           Apply the FasterTransformer wrapper to serve the model. This applies at
                                  serving time.
  --enable-features FEATURE[,FEATURE]
                                  Enable additional features for building this LLM Bento. Available: mpt,
                                  fine-tune, chatglm, agents, flan-t5, playground, starcoder, openai, falcon
  --adapter-id [PATH | [remote/][adapter_name:]adapter_id][, ...]
                                  Optional adapter ids to be included within the Bento. Note that if you are
                                  using a relative path, '--build-ctx' must be passed.
  --build-ctx TEXT                Build context. This is required if --adapter-id uses a relative path.
  --model-version TEXT            Model version provided for this 'model-id' if it is a custom path.
  --dockerfile-template FILENAME  Optional custom Dockerfile template to be used with this BentoLLM.
Miscellaneous options:
  -q, --quiet                     Suppress all output.
  --debug, --verbose              Print out debug logs.
  --do-not-track                  Do not send usage info.
  -h, --help                      Show this message and exit.
```
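For illustration, a minimal sketch combining a few of the options above (the model id and values here are illustrative choices, not taken from this thread):

```
# build a Falcon Bento with int8 quantization and half a GPU per worker
openllm build falcon --model-id tiiuae/falcon-7b --quantize int8 --workers-per-resource 0.5
```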
Sorry for the late reply, but any updates on this? Feel free to reopen if you're still running into this issue.
I can build mpt with `openllm build` (tested on Linux and macOS).
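For reference, a minimal sketch of the sequence that should work on a release whose `build` choices include mpt (the model id mosaicml/mpt-7b is illustrative, not taken from this thread):

```
pip install --upgrade "openllm[mpt]"
# 'mpt' should now appear in the model choices of 'openllm build'
openllm build mpt --model-id mosaicml/mpt-7b
```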
Describe the bug
mpt is listed under supported models but is not available in the `openllm build` command. This is the error message:
```
$ openllm build mpt
Usage: openllm build [OPTIONS] {flan-t5|dolly-v2|chatglm|starcoder|falcon|stablelm|opt}
Try 'openllm build -h' for help.

Error: Invalid value for '{flan-t5|dolly-v2|chatglm|starcoder|falcon|stablelm|opt}': 'mpt' is not one of 'flan-t5', 'dolly-v2', 'chatglm', 'starcoder', 'falcon', 'stablelm', 'opt'.
```
To reproduce
```
pip install "openllm[mpt]"
openllm build mpt
```
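Note that the pip list below shows openllm==0.1.17, while the help output at the top of this thread (which does include mpt) comes from a later build. A quick sanity check of the installed release before retrying (the exact version that added mpt to the choices is not stated in this thread):

```
# confirm which openllm release is actually installed
pip show openllm
```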
Logs
No response
Environment
bentoml env
Environment variable
System information
bentoml: 1.0.23
python: 3.11.0
platform: Windows-10-10.0.19045-SP0
is_window_admin: True
pip_packages:
``` absl-py==1.4.0 accelerate==0.20.3 aiofiles==23.1.0 aiohttp==3.8.4 aiolimiter==1.1.0 aiosignal==1.3.1 altair==5.0.1 anyio==3.6.2 appdirs==1.4.4 asgiref==3.7.2 asttokens==2.2.1 astunparse==1.6.3 async-timeout==4.0.2 asyncio==3.4.3 attrs==23.1.0 azure-core==1.27.0 azure-cosmos==4.3.1 azureml==0.2.7 backcall==0.2.0 beautifulsoup4==4.12.2 bentoml==1.0.23 bitsandbytes==0.39.0 blinker==1.6.2 Brotli==1.0.9 bs4==0.0.1 build==0.10.0 cachetools==5.3.1 cattrs==23.1.2 certifi==2022.12.7 cffi==1.15.1 charset-normalizer==3.1.0 circus==0.18.0 click==8.1.3 click-option-group==0.5.6 cloudpickle==2.2.1 colorama==0.4.6 coloredlogs==15.0.1 comm==0.1.2 contextlib2==21.6.0 contourpy==1.0.7 cryptography==41.0.1 cycler==0.11.0 dataclasses-json==0.5.8 datasets==2.13.0 debugpy==1.6.6 decorator==5.1.1 deepmerge==1.1.0 Deprecated==1.2.14 dill==0.3.6 diskcache==5.6.1 duckduckgo-search==3.8.3 einops==0.6.1 executing==1.2.0 faiss-cpu==1.7.4 fastapi==0.95.1 ffmpy==0.3.0 filelock==3.12.2 filetype==1.2.0 Flask==2.3.2 Flask-SQLAlchemy==3.0.5 flatbuffers==23.5.26 flexgen==0.1.7 fonttools==4.39.4 frozenlist==1.3.3 fs==2.4.16 fsspec==2023.6.0 gast==0.4.0 google-auth==2.19.1 google-auth-oauthlib==1.0.0 google-pasta==0.2.0 gptcache==0.1.32 gradio==3.33.1 gradio_client==0.2.5 greenlet==2.0.2 grpcio==1.54.2 grpcio-health-checking==1.48.2 guidance==0.0.63 h11==0.14.0 h2==4.1.0 h5py==3.8.0 hpack==4.0.0 httpcore==0.17.2 httpx==0.24.1 huggingface-hub==0.15.1 humanfriendly==10.0 hyperframe==6.0.1 idna==3.4 importlib-metadata==6.0.1 inflection==0.5.1 ipykernel==6.21.3 ipython==8.11.0 itsdangerous==2.1.2 jaconv==0.3.4 jamo==0.4.1 jax==0.4.12 jedi==0.18.2 Jinja2==3.1.2 joblib==1.2.0 jsonschema==4.17.3 jupyter_client==8.0.3 jupyter_core==5.2.0 keras==2.12.0 kiwisolver==1.4.4 langchain==0.0.196 langchainplus-sdk==0.0.11 libclang==16.0.0 linkify-it-py==2.0.2 llama-cpp-python==0.1.62 lxml==4.9.2 Markdown==3.4.3 markdown-it-py==2.2.0 MarkupSafe==2.1.3 marshmallow==3.19.0 marshmallow-enum==1.5.1 matplotlib==3.7.1 matplotlib-inline==0.1.6 mdit-py-plugins==0.3.3 mdurl==0.1.2 ml-dtypes==0.2.0 mpmath==1.3.0 msal==1.22.0 msgpack==1.0.5 multidict==6.0.4 multiprocess==0.70.14 mypy-extensions==1.0.0 nest-asyncio==1.5.6 networkx==3.0 numexpr==2.8.4 numpy==1.23.5 oauthlib==3.2.2 openai==0.27.8 openapi-schema-pydantic==1.2.4 openllm==0.1.17 opentelemetry-api==1.17.0 opentelemetry-instrumentation==0.38b0 opentelemetry-instrumentation-aiohttp-client==0.38b0 opentelemetry-instrumentation-asgi==0.38b0 opentelemetry-instrumentation-grpc==0.38b0 opentelemetry-sdk==1.17.0 opentelemetry-semantic-conventions==0.38b0 opentelemetry-util-http==0.38b0 opt-einsum==3.3.0 optimum==1.9.0 orjson==3.9.1 packaging==23.0 pandas==2.0.1 parso==0.8.3 pathspec==0.11.1 peft @ git+https://github.com/huggingface/peft@03eb378eb914fbee709ff7c86ba5b1d033b89524 pesq==0.0.4 pickleshare==0.7.5 pika==1.3.2 Pillow==9.5.0 pip-requirements-parser==32.0.1 pip-tools==6.13.0 platformdirs==3.1.1 playwright==1.35.0 prometheus-client==0.17.0 prompt-toolkit==3.0.38 protobuf==3.20.3 psutil==5.9.4 PuLP==2.7.0 pure-eval==0.2.2 pyarrow==12.0.1 pyasn1==0.5.0 pyasn1-modules==0.3.0 pycparser==2.21 pydantic==1.10.7 pydocumentdb==2.3.5 pydub==0.25.1 pyee==9.0.4 Pygments==2.14.0 pygtrie==2.5.0 PyJWT==2.7.0 PyMySQL==1.1.0 pynvml==11.5.0 pyparsing==3.0.9 pyproject_hooks==1.0.0 pyre-extensions==0.0.29 pyreadline3==3.4.1 pyrsistent==0.19.3 python-dateutil==2.8.2 python-json-logger==2.0.7 python-multipart==0.0.6 pytz==2023.3 pywin32==305 PyYAML==6.0 pyzmq==25.0.0 regex==2023.6.3 requests==2.29.0 
requests-oauthlib==1.3.1 rich==13.4.2 rsa==4.9 safetensors==0.3.1 schema==0.7.5 scikit-learn==1.2.2 scipy==1.10.1 semantic-version==2.10.0 sentencepiece==0.1.99 simple-di==0.1.5 six==1.16.0 sniffio==1.3.0 socksio==1.0.0 soupsieve==2.4.1 SQLAlchemy==2.0.16 stack-data==0.6.2 starlette==0.26.1 sympy==1.12 tabulate==0.9.0 tenacity==8.2.2 tensorboard==2.12.3 tensorboard-data-server==0.7.0 tensorflow==2.12.0 tensorflow-estimator==2.12.0 tensorflow-intel==2.12.0 tensorflow-io-gcs-filesystem==0.31.0 termcolor==2.3.0 threadpoolctl==3.1.0 tiktoken==0.4.0 tokenizers==0.13.3 toolz==0.12.0 torch==2.0.1+cu117 torchaudio==2.0.2+cu117 torchvision==0.15.2+cu117 tornado==6.2 tqdm==4.65.0 traitlets==5.9.0 transformers==4.30.2 typing-inspect==0.9.0 typing_extensions==4.5.0 tzdata==2023.3 uc-micro-py==1.0.2 urllib3==1.26.15 uvicorn==0.22.0 watchfiles==0.19.0 wcwidth==0.2.6 websockets==11.0.3 Werkzeug==2.3.6 wrapt==1.14.1 xformers==0.0.20 xxhash==3.2.0 yarl==1.9.2 zipp==3.15.0 ```
System information (Optional)
No response