Closed: qaz-t closed this issue 1 year ago
For now, you will need to provide model_version when loading a model from a local path.
I will look into supporting local models more concretely. We have a separate issue that tracks local paths.
~oh actually this seems to be like a bug. Will fix it~
~Edit: this is not a bug, sorry~
Edit 2: This is a bug. I will release a fix shortly.
Describe the bug
I'm using conda to create an environment with Python 3.10.12 and to install the related packages. Starting a llama service works fine with openllm==0.3.9. However, an error occurs in versions 0.4.0 and 0.4.1. I've tried the -hf, -gptq, and -awq models from TheBloke on Hugging Face and got the same error.
To reproduce
install requirements
start server
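Before reproducing, it may help to confirm which openllm version is actually installed in the active environment. The sketch below is only an illustration: the helper names `classify` and `installed_openllm_version` are made up for this example, and the version sets contain nothing beyond the three versions named in this report. Any other version is treated as unknown rather than as working.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional

# Versions named in this report. Anything else is unknown, not "safe".
REPORTED_WORKING = {"0.3.9"}
REPORTED_FAILING = {"0.4.0", "0.4.1"}


def classify(installed: str) -> str:
    """Map a version string to what this report says about it."""
    if installed in REPORTED_FAILING:
        return "reported failing"
    if installed in REPORTED_WORKING:
        return "reported working"
    return "not covered by this report"


def installed_openllm_version() -> Optional[str]:
    """Return the installed openllm version, or None if it is not installed."""
    try:
        return version("openllm")
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_openllm_version()
    if found is None:
        print("openllm is not installed in this environment")
    else:
        print(f"openllm {found}: {classify(found)}")
```

Running this inside the conda environment described below would be expected to report 0.4.1, since that is the version pinned in the package lists.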
Logs
Environment
Environment variable
System information
bentoml: 1.1.9
python: 3.10.12
platform: Linux-5.4.0-166-generic-x86_64-with-glibc2.31
uid_gid: 53201113:53200513
conda: 23.9.0
in_conda_env: True
conda_packages
```yaml
name: openllm
channels:
  - defaults
dependencies:
  - _libgcc_mutex=0.1=main
  - _openmp_mutex=5.1=1_gnu
  - bzip2=1.0.8=h7b6447c_0
  - ca-certificates=2023.08.22=h06a4308_0
  - ld_impl_linux-64=2.38=h1181459_1
  - libffi=3.4.4=h6a678d5_0
  - libgcc-ng=11.2.0=h1234567_1
  - libgomp=11.2.0=h1234567_1
  - libstdcxx-ng=11.2.0=h1234567_1
  - libuuid=1.41.5=h5eee18b_0
  - ncurses=6.4=h6a678d5_0
  - openssl=3.0.12=h7f8727e_0
  - pip=23.3=py310h06a4308_0
  - python=3.10.12=h955ad1f_0
  - readline=8.2=h5eee18b_0
  - setuptools=68.0.0=py310h06a4308_0
  - sqlite=3.41.2=h5eee18b_0
  - tk=8.6.12=h1ccaba5_0
  - wheel=0.41.2=py310h06a4308_0
  - xz=5.4.2=h5eee18b_0
  - zlib=1.2.13=h5eee18b_0
  - pip:
    - accelerate==0.24.1
    - aiohttp==3.8.6
    - aiosignal==1.3.1
    - anyio==3.7.1
    - appdirs==1.4.4
    - asgiref==3.7.2
    - async-timeout==4.0.3
    - attrs==23.1.0
    - bentoml==1.1.9
    - bitsandbytes==0.41.2.post1
    - build==1.0.3
    - cattrs==23.1.2
    - certifi==2023.7.22
    - charset-normalizer==3.3.2
    - circus==0.18.0
    - click==8.1.7
    - click-option-group==0.5.6
    - cloudpickle==3.0.0
    - cmake==3.27.7
    - coloredlogs==15.0.1
    - contextlib2==21.6.0
    - cuda-python==12.3.0
    - datasets==2.14.6
    - deepmerge==1.1.0
    - deprecated==1.2.14
    - dill==0.3.7
    - exceptiongroup==1.1.3
    - fairscale==0.4.13
    - fastapi==0.104.1
    - fastcore==1.5.29
    - filelock==3.13.1
    - filetype==1.2.0
    - frozenlist==1.4.0
    - fs==2.4.16
    - fsspec==2023.10.0
    - ghapi==1.0.4
    - h11==0.14.0
    - httpcore==1.0.1
    - httptools==0.6.1
    - httpx==0.25.1
    - huggingface-hub==0.17.3
    - humanfriendly==10.0
    - idna==3.4
    - importlib-metadata==6.8.0
    - inflection==0.5.1
    - jinja2==3.1.2
    - jsonschema==4.19.2
    - jsonschema-specifications==2023.7.1
    - lit==17.0.4
    - markdown-it-py==3.0.0
    - markupsafe==2.1.3
    - mdurl==0.1.2
    - mpmath==1.3.0
    - msgpack==1.0.7
    - multidict==6.0.4
    - multiprocess==0.70.15
    - mypy-extensions==1.0.0
    - networkx==3.2.1
    - ninja==1.11.1.1
    - numpy==1.26.1
    - nvidia-cublas-cu11==11.10.3.66
    - nvidia-cuda-cupti-cu11==11.7.101
    - nvidia-cuda-nvrtc-cu11==11.7.99
    - nvidia-cuda-runtime-cu11==11.7.99
    - nvidia-cudnn-cu11==8.5.0.96
    - nvidia-cufft-cu11==10.9.0.58
    - nvidia-curand-cu11==10.2.10.91
    - nvidia-cusolver-cu11==11.4.0.1
    - nvidia-cusparse-cu11==11.7.4.91
    - nvidia-ml-py==11.525.150
    - nvidia-nccl-cu11==2.14.3
    - nvidia-nvtx-cu11==11.7.91
    - openllm==0.4.1
    - openllm-client==0.4.1
    - openllm-core==0.4.1
    - opentelemetry-api==1.20.0
    - opentelemetry-instrumentation==0.41b0
    - opentelemetry-instrumentation-aiohttp-client==0.41b0
    - opentelemetry-instrumentation-asgi==0.41b0
    - opentelemetry-sdk==1.20.0
    - opentelemetry-semantic-conventions==0.41b0
    - opentelemetry-util-http==0.41b0
    - optimum==1.14.0
    - orjson==3.9.10
    - packaging==23.2
    - pandas==2.1.2
    - pathspec==0.11.2
    - pillow==10.1.0
    - pip-requirements-parser==32.0.1
    - pip-tools==7.3.0
    - prometheus-client==0.18.0
    - protobuf==4.25.0
    - psutil==5.9.6
    - pyarrow==14.0.1
    - pydantic==1.10.13
    - pygments==2.16.1
    - pyparsing==3.1.1
    - pyproject-hooks==1.0.0
    - python-dateutil==2.8.2
    - python-dotenv==1.0.0
    - python-json-logger==2.0.7
    - python-multipart==0.0.6
    - pytz==2023.3.post1
    - pyyaml==6.0.1
    - pyzmq==25.1.1
    - ray==2.8.0
    - referencing==0.30.2
    - regex==2023.10.3
    - requests==2.31.0
    - rich==13.6.0
    - rpds-py==0.12.0
    - safetensors==0.4.0
    - schema==0.7.5
    - scipy==1.11.3
    - sentencepiece==0.1.99
    - simple-di==0.1.5
    - six==1.16.0
    - sniffio==1.3.0
    - starlette==0.27.0
    - sympy==1.12
    - tabulate==0.9.0
    - tokenizers==0.14.1
    - tomli==2.0.1
    - torch==2.0.1
    - tornado==6.3.3
    - tqdm==4.66.1
    - transformers==4.35.0
    - triton==2.0.0
    - typing-extensions==4.8.0
    - tzdata==2023.3
    - urllib3==2.0.7
    - uvicorn==0.24.0.post1
    - uvloop==0.19.0
    - vllm==0.2.1.post1
    - watchfiles==0.21.0
    - wcwidth==0.2.9
    - websockets==12.0
    - wrapt==1.15.0
    - xformers==0.0.22
    - xxhash==3.4.1
    - yarl==1.9.2
    - zipp==3.17.0
prefix: /home/user/miniconda3/envs/openllm
```
pip_packages
```
accelerate==0.24.1
aiohttp==3.8.6
aiosignal==1.3.1
anyio==3.7.1
appdirs==1.4.4
asgiref==3.7.2
async-timeout==4.0.3
attrs==23.1.0
bentoml==1.1.9
bitsandbytes==0.41.2.post1
build==1.0.3
cattrs==23.1.2
certifi==2023.7.22
charset-normalizer==3.3.2
circus==0.18.0
click==8.1.7
click-option-group==0.5.6
cloudpickle==3.0.0
cmake==3.27.7
coloredlogs==15.0.1
contextlib2==21.6.0
cuda-python==12.3.0
datasets==2.14.6
deepmerge==1.1.0
Deprecated==1.2.14
dill==0.3.7
exceptiongroup==1.1.3
fairscale==0.4.13
fastapi==0.104.1
fastcore==1.5.29
filelock==3.13.1
filetype==1.2.0
frozenlist==1.4.0
fs==2.4.16
fsspec==2023.10.0
ghapi==1.0.4
h11==0.14.0
httpcore==1.0.1
httptools==0.6.1
httpx==0.25.1
huggingface-hub==0.17.3
humanfriendly==10.0
idna==3.4
importlib-metadata==6.8.0
inflection==0.5.1
Jinja2==3.1.2
jsonschema==4.19.2
jsonschema-specifications==2023.7.1
lit==17.0.4
markdown-it-py==3.0.0
MarkupSafe==2.1.3
mdurl==0.1.2
mpmath==1.3.0
msgpack==1.0.7
multidict==6.0.4
multiprocess==0.70.15
mypy-extensions==1.0.0
networkx==3.2.1
ninja==1.11.1.1
numpy==1.26.1
nvidia-cublas-cu11==11.10.3.66
nvidia-cuda-cupti-cu11==11.7.101
nvidia-cuda-nvrtc-cu11==11.7.99
nvidia-cuda-runtime-cu11==11.7.99
nvidia-cudnn-cu11==8.5.0.96
nvidia-cufft-cu11==10.9.0.58
nvidia-curand-cu11==10.2.10.91
nvidia-cusolver-cu11==11.4.0.1
nvidia-cusparse-cu11==11.7.4.91
nvidia-ml-py==11.525.150
nvidia-nccl-cu11==2.14.3
nvidia-nvtx-cu11==11.7.91
openllm==0.4.1
openllm-client==0.4.1
openllm-core==0.4.1
opentelemetry-api==1.20.0
opentelemetry-instrumentation==0.41b0
opentelemetry-instrumentation-aiohttp-client==0.41b0
opentelemetry-instrumentation-asgi==0.41b0
opentelemetry-sdk==1.20.0
opentelemetry-semantic-conventions==0.41b0
opentelemetry-util-http==0.41b0
optimum==1.14.0
orjson==3.9.10
packaging==23.2
pandas==2.1.2
pathspec==0.11.2
Pillow==10.1.0
pip-requirements-parser==32.0.1
pip-tools==7.3.0
prometheus-client==0.18.0
protobuf==4.25.0
psutil==5.9.6
pyarrow==14.0.1
pydantic==1.10.13
Pygments==2.16.1
pyparsing==3.1.1
pyproject_hooks==1.0.0
python-dateutil==2.8.2
python-dotenv==1.0.0
python-json-logger==2.0.7
python-multipart==0.0.6
pytz==2023.3.post1
PyYAML==6.0.1
pyzmq==25.1.1
ray==2.8.0
referencing==0.30.2
regex==2023.10.3
requests==2.31.0
rich==13.6.0
rpds-py==0.12.0
safetensors==0.4.0
schema==0.7.5
scipy==1.11.3
sentencepiece==0.1.99
simple-di==0.1.5
six==1.16.0
sniffio==1.3.0
starlette==0.27.0
sympy==1.12
tabulate==0.9.0
tokenizers==0.14.1
tomli==2.0.1
torch==2.0.1
tornado==6.3.3
tqdm==4.66.1
transformers==4.35.0
triton==2.0.0
typing_extensions==4.8.0
tzdata==2023.3
urllib3==2.0.7
uvicorn==0.24.0.post1
uvloop==0.19.0
vllm==0.2.1.post1
watchfiles==0.21.0
wcwidth==0.2.9
websockets==12.0
wrapt==1.15.0
xformers==0.0.22
xxhash==3.4.1
yarl==1.9.2
zipp==3.17.0
```
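Since the report says the same setup works with openllm==0.3.9 and fails with 0.4.0 and 0.4.1, one way to narrow the regression down is to diff the pinned packages of a working environment against the failing one above. The following is a minimal sketch under that assumption; `parse_freeze` and `diff_pins` are hypothetical helper names, and the "working" list in the usage example is invented for illustration because the report only includes the 0.4.1 environment.

```python
import re
from typing import Dict, Optional, Tuple


def parse_freeze(text: str) -> Dict[str, str]:
    """Parse whitespace-separated `name==version` pins into a dict."""
    pins = {}
    for token in text.split():
        name, sep, ver = token.partition("==")
        if sep:  # skip anything that is not a plain `==` pin
            # Normalize names so `typing_extensions` and `typing-extensions` match.
            pins[re.sub(r"[-_.]+", "-", name).lower()] = ver
    return pins


def diff_pins(
    old: Dict[str, str], new: Dict[str, str]
) -> Dict[str, Tuple[Optional[str], Optional[str]]]:
    """Return {name: (old_version, new_version)} for every pin that differs."""
    names = sorted(set(old) | set(new))
    return {n: (old.get(n), new.get(n)) for n in names if old.get(n) != new.get(n)}


if __name__ == "__main__":
    # Hypothetical working environment; not taken from the report.
    working = parse_freeze("openllm==0.3.9 torch==2.0.1 typing_extensions==4.8.0")
    # A few pins copied from the failing environment listed above.
    failing = parse_freeze("openllm==0.4.1 torch==2.0.1 typing-extensions==4.8.0")
    for name, (before, after) in diff_pins(working, failing).items():
        print(f"{name}: {before} -> {after}")
```

The name normalization matters here because the two lists above spell some packages differently (for example `pyproject-hooks` versus `pyproject_hooks`), which would otherwise show up as spurious differences.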
System information (Optional)
No response