langchain-ai / langchain

Failed to resolve model_id:Could not find model id for inference server #18639

Closed anhnh2002 closed 2 months ago

anhnh2002 commented 8 months ago

Example Code

from langchain_community.llms import HuggingFaceEndpoint

llm = HuggingFaceEndpoint(
    endpoint_url="http://0.0.0.0:8080/",
    max_new_tokens=512,
    top_k=10,
    top_p=0.95,
    typical_p=0.95,
    temperature=0.01,
    repetition_penalty=1.03,
    huggingfacehub_api_token="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",
)

from langchain.schema import (
    HumanMessage,
    SystemMessage,
)
from langchain_community.chat_models.huggingface import ChatHuggingFace

messages = [
    SystemMessage(content="You're a helpful assistant"),
    HumanMessage(
        content="What happens when an unstoppable force meets an immovable object?"
    ),
]

chat_model = ChatHuggingFace(llm=llm)

Error Message and Stack Trace (if applicable)

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[25], line 14
      5 from langchain_community.chat_models.huggingface import ChatHuggingFace
      7 messages = [
      8     SystemMessage(content="You're a helpful assistant"),
      9     HumanMessage(
     10         content="What happens when an unstoppable force meets an immovable object?"
     11     ),
     12 ]
---> 14 chat_model = ChatHuggingFace(llm=llm)

File ~/miniconda3/envs/api_mapping/lib/python3.9/site-packages/langchain_community/chat_models/huggingface.py:55, in ChatHuggingFace.__init__(self, **kwargs)
     51 super().__init__(**kwargs)
     53 from transformers import AutoTokenizer
---> 55 self._resolve_model_id()
     57 self.tokenizer = (
     58     AutoTokenizer.from_pretrained(self.model_id)
     59     if self.tokenizer is None
     60     else self.tokenizer
     61 )

File ~/miniconda3/envs/api_mapping/lib/python3.9/site-packages/langchain_community/chat_models/huggingface.py:155, in ChatHuggingFace._resolve_model_id(self)
    152         self.model_id = endpoint.repository
    154 if not self.model_id:
--> 155     raise ValueError(
    156         "Failed to resolve model_id:"
    157         f"Could not find model id for inference server: {endpoint_url}"
    158         "Make sure that your Hugging Face token has access to the endpoint."
    159     )

ValueError: Failed to resolve model_id:Could not find model id for inference server: http://0.0.0.0:8080/Make sure that your Hugging Face token has access to the endpoint.
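
The run-together wording in the message ("model_id:Could", "8080/Make sure") appears to come from the three adjacent string literals shown at lines 156-158 of the traceback: Python joins adjacent literals with no separator. A small illustration, with `endpoint_url` set to the value from this report:

```python
# Adjacent string literals are concatenated without any separator,
# which is why the error text has no space or newline between its parts.
endpoint_url = "http://0.0.0.0:8080/"  # value taken from the report above

message = (
    "Failed to resolve model_id:"
    f"Could not find model id for inference server: {endpoint_url}"
    "Make sure that your Hugging Face token has access to the endpoint."
)

print(message)
```

So the trailing "Make sure ..." is a separate hint and not part of the URL that was looked up.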

Description

I am trying to create a ChatHuggingFace model on top of a Hugging Face Text Generation Inference (TGI) server that serves my locally deployed model. Constructing the chat model raises:

ValueError: Failed to resolve model_id:Could not find model id for inference server: http://0.0.0.0:8080/Make sure that your Hugging Face token has access to the endpoint.
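
For readers trying to see why a local URL fails here, the following is a rough, offline sketch of the kind of lookup the traceback hints at. Only the assignment `self.model_id = endpoint.repository` and the final `if not self.model_id` check are visible in the trace; the loop, the URL comparison, the `HostedEndpoint` class, and the example URL and repository name are assumptions made for illustration, not the library's actual code.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class HostedEndpoint:
    """Hypothetical stand-in for one hosted endpoint known to the account."""
    url: str
    repository: str


def resolve_model_id(endpoint_url: str, hosted: Iterable[HostedEndpoint]) -> str:
    """Return the repository of the hosted endpoint whose URL matches endpoint_url."""
    model_id: Optional[str] = None
    for endpoint in hosted:
        # Assumed matching rule: exact URL equality.
        if endpoint.url == endpoint_url:
            model_id = endpoint.repository
    if not model_id:
        raise ValueError(
            f"Could not find model id for inference server: {endpoint_url}"
        )
    return model_id


# Made-up hosted endpoint; a self-hosted server would not appear in such a list.
hosted = [
    HostedEndpoint(
        url="https://example.endpoints.huggingface.cloud",
        repository="org/some-model",
    )
]

try:
    resolve_model_id("http://0.0.0.0:8080/", hosted)
except ValueError as err:
    print(err)
```

Under these assumptions, a server at http://0.0.0.0:8080/ can never match a hosted endpoint, so the model id stays empty regardless of the token used.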

System Info

absl-py==1.4.0 accelerate==0.26.1 aiofiles==23.2.1 aiohttp @ file:///home/conda/feedstock_root/build_artifacts/aiohttp_1689804989543/work aiosignal @ file:///home/conda/feedstock_root/build_artifacts/aiosignal_1667935791922/work altair==5.0.1 annotated-types==0.5.0 antlr4-python3-runtime==4.9.3 anyio==3.7.1 appdirs==1.4.4 asttokens @ file:///home/conda/feedstock_root/build_artifacts/asttokens_1670263926556/work async-timeout @ file:///home/conda/feedstock_root/build_artifacts/async-timeout_1691763562544/work attrs @ file:///home/conda/feedstock_root/build_artifacts/attrs_1683424013410/work auto-gptq==0.6.0 autoawq==0.2.2 autoawq_kernels==0.0.6 backcall @ file:///home/conda/feedstock_root/build_artifacts/backcall_1592338393461/work backports.functools-lru-cache @ file:///home/conda/feedstock_root/build_artifacts/backports.functools_lru_cache_1687772187254/work beautifulsoup4==4.12.2 bigjson==1.0.9 bitsandbytes==0.42.0 black==23.7.0 brotlipy @ file:///home/conda/feedstock_root/build_artifacts/brotlipy_1666764672617/work cachetools @ file:///home/conda/feedstock_root/build_artifacts/cachetools_1633010882559/work certifi @ file:///home/conda/feedstock_root/build_artifacts/certifi_1700303426725/work/certifi cffi @ file:///home/conda/feedstock_root/build_artifacts/cffi_1671179360775/work charset-normalizer @ file:///home/conda/feedstock_root/build_artifacts/charset-normalizer_1688813409104/work cleanlab==2.5.0 click==8.1.7 cloudpickle==3.0.0 cmake==3.27.2 colorama @ file:///home/conda/feedstock_root/build_artifacts/colorama_1666700638685/work coloredlogs==15.0.1 comm @ file:///home/conda/feedstock_root/build_artifacts/comm_1691044910542/work contourpy==1.1.0 cryptography @ file:///home/conda/feedstock_root/build_artifacts/cryptography-split_1695163786734/work cycler @ file:///home/conda/feedstock_root/build_artifacts/cycler_1635519461629/work Cython==0.29.37 dataclasses-json==0.6.3 datasets==2.14.4 DateTime==5.4 debugpy @ 
file:///home/conda/feedstock_root/build_artifacts/debugpy_1691021247994/work decorator @ file:///home/conda/feedstock_root/build_artifacts/decorator_1641555617451/work dict==2020.12.3 dill==0.3.7 distro==1.8.0 docker-pycreds==0.4.0 docstring-parser==0.15 einops==0.7.0 et-xmlfile==1.1.0 evaluate==0.4.0 exceptiongroup==1.1.3 executing @ file:///home/conda/feedstock_root/build_artifacts/executing_1667317341051/work fastapi==0.101.1 ffmpy==0.3.1 filelock==3.12.2 fire==0.5.0 fonttools @ file:///home/conda/feedstock_root/build_artifacts/fonttools_1692542611950/work frozenlist @ file:///home/conda/feedstock_root/build_artifacts/frozenlist_1695377824562/work fsspec==2023.6.0 future==0.18.3 fvcore==0.1.5.post20221221 gdown==4.7.1 gekko==1.0.6 gitdb==4.0.10 GitPython==3.1.32 google-api-core @ file:///home/conda/feedstock_root/build_artifacts/google-api-core-split_1653881570487/work google-api-python-client @ file:///home/conda/feedstock_root/build_artifacts/google-api-python-client_1695664297279/work google-auth==2.23.3 google-auth-httplib2 @ file:///home/conda/feedstock_root/build_artifacts/google-auth-httplib2_1694516804909/work google-auth-oauthlib==1.1.0 googleapis-common-protos @ file:///home/conda/feedstock_root/build_artifacts/googleapis-common-protos-feedstock_1690830130005/work gradio==3.40.1 gradio_client==0.4.0 greenlet==3.0.1 grpcio==1.59.0 h11==0.14.0 hdbscan==0.8.33 htmlmin==0.1.12 httpcore==0.17.3 httplib2 @ file:///home/conda/feedstock_root/build_artifacts/httplib2_1679483503307/work httpx==0.24.1 huggingface-hub==0.20.3 humanfriendly==10.0 hydra-core==1.3.2 idna @ file:///home/conda/feedstock_root/build_artifacts/idna_1663625384323/work ImageHash @ file:///home/conda/feedstock_root/build_artifacts/imagehash_1664371213222/work importlib-metadata @ file:///home/conda/feedstock_root/build_artifacts/importlib-metadata_1688754491823/work importlib-resources==6.0.1 iopath==0.1.9 ipykernel @ 
file:///home/conda/feedstock_root/build_artifacts/ipykernel_1693880262622/work ipython @ file:///home/conda/feedstock_root/build_artifacts/ipython_1685727741709/work ipywidgets @ file:///home/conda/feedstock_root/build_artifacts/ipywidgets_1694607144474/work itables @ file:///home/conda/feedstock_root/build_artifacts/itables_1692399918721/work jedi @ file:///home/conda/feedstock_root/build_artifacts/jedi_1690896916983/work Jinja2 @ file:///home/conda/feedstock_root/build_artifacts/jinja2_1654302431367/work jiwer==3.0.3 joblib @ file:///home/conda/feedstock_root/build_artifacts/joblib_1691577114857/work json-lines==0.5.0 jsonlines==4.0.0 jsonpatch==1.33 jsonpointer==2.4 jsonschema==4.19.0 jsonschema-specifications==2023.7.1 jupyter_client @ file:///home/conda/feedstock_root/build_artifacts/jupyter_client_1687700988094/work jupyter_core @ file:///home/conda/feedstock_root/build_artifacts/jupyter_core_1686775603087/work jupyterlab-widgets @ file:///home/conda/feedstock_root/build_artifacts/jupyterlab_widgets_1694598704522/work kiwisolver==1.4.4 langchain==0.1.11 langchain-community==0.0.25 langchain-core==0.1.29 langchain-text-splitters==0.0.1 langsmith==0.1.22 linkify-it-py==2.0.2 lit==16.0.6 llvmlite==0.41.1 loralib==0.1.1 Markdown==3.5 markdown-it-py==2.2.0 MarkupSafe @ file:///home/conda/feedstock_root/build_artifacts/markupsafe_1685769048265/work marshmallow==3.20.1 matplotlib @ file:///home/conda/feedstock_root/build_artifacts/matplotlib-suite_1661440538658/work matplotlib-inline @ file:///home/conda/feedstock_root/build_artifacts/matplotlib-inline_1660814786464/work mdit-py-plugins==0.3.3 mdurl==0.1.2 mock==5.1.0 mpmath==1.3.0 msal==1.26.0 multidict @ file:///home/conda/feedstock_root/build_artifacts/multidict_1672339396340/work multimethod @ file:///home/conda/feedstock_root/build_artifacts/multimethod_1603129052241/work multiprocess==0.70.15 munkres==1.1.4 mypy-extensions==1.0.0 nb-conda-kernels @ 
file:///home/conda/feedstock_root/build_artifacts/nb_conda_kernels_1667060622050/work neo4j==5.16.0 nest-asyncio @ file:///home/conda/feedstock_root/build_artifacts/nest-asyncio_1664684991461/work networkx @ file:///home/conda/feedstock_root/build_artifacts/networkx_1680692919326/work nltk==3.8.1 nose==1.3.7 numba==0.58.1 numpy @ file:///home/conda/feedstock_root/build_artifacts/numpy_1668919081525/work nvidia-cublas-cu11==11.10.3.66 nvidia-cublas-cu12==12.1.3.1 nvidia-cuda-cupti-cu11==11.7.101 nvidia-cuda-cupti-cu12==12.1.105 nvidia-cuda-nvrtc-cu11==11.7.99 nvidia-cuda-nvrtc-cu12==12.1.105 nvidia-cuda-runtime-cu11==11.7.99 nvidia-cuda-runtime-cu12==12.1.105 nvidia-cudnn-cu11==8.5.0.96 nvidia-cudnn-cu12==8.9.2.26 nvidia-cufft-cu11==10.9.0.58 nvidia-cufft-cu12==11.0.2.54 nvidia-curand-cu11==10.2.10.91 nvidia-curand-cu12==10.3.2.106 nvidia-cusolver-cu11==11.4.0.1 nvidia-cusolver-cu12==11.4.5.107 nvidia-cusparse-cu11==11.7.4.91 nvidia-cusparse-cu12==12.1.0.106 nvidia-nccl-cu11==2.14.3 nvidia-nccl-cu12==2.18.1 nvidia-nvjitlink-cu12==12.3.101 nvidia-nvtx-cu11==11.7.91 nvidia-nvtx-cu12==12.1.105 oauth2client==4.1.3 oauthlib==3.2.2 omegaconf==2.3.0 openai==0.28.0 opencv-python==4.8.1.78 openpyxl==3.1.2 optimum==1.17.1 optimum-intel==1.15.2 orjson==3.9.15 packaging==23.2 pandas==1.5.3 pandas-profiling @ file:///home/conda/feedstock_root/build_artifacts/pandas-profiling_1674670576924/work parso @ file:///home/conda/feedstock_root/build_artifacts/parso_1638334955874/work pathspec==0.11.2 pathtools==0.1.2 patsy @ file:///home/conda/feedstock_root/build_artifacts/patsy_1665356157073/work peft==0.8.2 pexpect @ file:///home/conda/feedstock_root/build_artifacts/pexpect_1667297516076/work phik @ file:///home/conda/feedstock_root/build_artifacts/phik_1670564192669/work pickleshare @ file:///home/conda/feedstock_root/build_artifacts/pickleshare_1602536217715/work Pillow==10.0.0 platformdirs @ file:///home/conda/feedstock_root/build_artifacts/platformdirs_1690813113769/work 
portalocker==2.8.2 prompt-toolkit @ file:///home/conda/feedstock_root/build_artifacts/prompt-toolkit_1688565951714/work promptlayer==0.4.0 protobuf==3.20.3 psutil @ file:///home/conda/feedstock_root/build_artifacts/psutil_1681775019467/work ptyprocess @ file:///home/conda/feedstock_root/build_artifacts/ptyprocess_1609419310487/work/dist/ptyprocess-0.7.0-py2.py3-none-any.whl pure-eval @ file:///home/conda/feedstock_root/build_artifacts/pure_eval_1642875951954/work py-vncorenlp==0.1.4 pyArango==2.0.2 pyarrow==12.0.1 pyasn1 @ file:///home/conda/feedstock_root/build_artifacts/pyasn1_1694615621498/work pyasn1-modules @ file:///home/conda/feedstock_root/build_artifacts/pyasn1-modules_1695107857548/work pycocotools==2.0.7 pycparser @ file:///home/conda/feedstock_root/build_artifacts/pycparser_1636257122734/work pydantic @ file:///home/conda/feedstock_root/build_artifacts/pydantic_1690476225427/work pydantic_core==2.6.1 PyDrive==1.3.1 pydub==0.25.1 Pygments @ file:///home/conda/feedstock_root/build_artifacts/pygments_1691408637400/work pyjnius==1.6.0 PyJWT==2.8.0 pynndescent==0.5.11 pyOpenSSL @ file:///home/conda/feedstock_root/build_artifacts/pyopenssl_1685514481738/work pyparsing==3.0.9 PySocks @ file:///home/conda/feedstock_root/build_artifacts/pysocks_1661604839144/work python-arango==7.9.0 python-crfsuite==0.9.9 python-dateutil @ file:///home/conda/feedstock_root/build_artifacts/python-dateutil_1626286286081/work python-multipart==0.0.6 pytz==2023.3 pyu2f @ file:///home/conda/feedstock_root/build_artifacts/pyu2f_1604248910016/work PyWavelets @ file:///home/conda/feedstock_root/build_artifacts/pywavelets_1673082327051/work PyYAML @ file:///home/conda/feedstock_root/build_artifacts/pyyaml_1692737146376/work pyzmq @ file:///home/conda/feedstock_root/build_artifacts/pyzmq_1691667452339/work rapidfuzz==3.5.2 referencing==0.30.2 regex==2023.8.8 requests @ file:///home/conda/feedstock_root/build_artifacts/requests_1680286922386/work requests-oauthlib==1.3.1 
requests-toolbelt==1.0.0 responses==0.18.0 rich==13.7.0 rouge==1.0.1 rouge-score==0.1.2 rpds-py==0.9.2 rsa @ file:///home/conda/feedstock_root/build_artifacts/rsa_1658328885051/work safetensors==0.4.2 scikit-learn==1.3.0 scipy==1.11.2 seaborn @ file:///home/conda/feedstock_root/build_artifacts/seaborn-split_1672497695270/work semantic-version==2.10.0 sentence-transformers==2.2.2 sentencepiece==0.1.99 sentry-sdk==1.29.2 seqeval==1.2.2 setproctitle==1.3.2 shtab==1.6.4 simplejson==3.19.2 six @ file:///home/conda/feedstock_root/build_artifacts/six_1620240208055/work skorch==0.15.0 smmap==5.0.0 sniffio==1.3.0 soupsieve==2.5 SQLAlchemy==2.0.23 stack-data @ file:///home/conda/feedstock_root/build_artifacts/stack_data_1669632077133/work starlette==0.27.0 statsmodels @ file:///croot/statsmodels_1676643798791/work sympy==1.12 tabulate==0.9.0 tangled-up-in-unicode @ file:///home/conda/feedstock_root/build_artifacts/tangled-up-in-unicode_1632832610704/work tenacity==8.2.3 tensorboard==2.15.0 tensorboard-data-server==0.7.2 termcolor==2.3.0 text-generation==0.6.1 threadpoolctl==3.2.0 tiktoken==0.5.2 tokenize-rt==5.2.0 tokenizers==0.15.2 tomli==2.0.1 toolz==0.12.0 torch==2.1.2 torchinfo==1.8.0 torchvision==0.16.2 tornado @ file:///home/conda/feedstock_root/build_artifacts/tornado_1692311754787/work tqdm @ file:///home/conda/feedstock_root/build_artifacts/tqdm_1662214488106/work traitlets @ file:///home/conda/feedstock_root/build_artifacts/traitlets_1675110562325/work transformers==4.37.0 trash-cli==0.23.2.13.2 triton==2.1.0 trl==0.7.4 typeguard @ file:///home/conda/feedstock_root/build_artifacts/typeguard_1658932097418/work typing==3.7.4.3 typing-inspect==0.9.0 typing_extensions==4.10.0 tyro==0.5.17 tzdata==2023.3 uc-micro-py==1.0.2 umap-learn==0.5.5 underthesea==6.7.0 underthesea_core==1.0.4 unicodedata2 @ file:///home/conda/feedstock_root/build_artifacts/unicodedata2_1667239485250/work uritemplate @ 
file:///home/conda/feedstock_root/build_artifacts/uritemplate_1634152692041/work urllib3 @ file:///home/conda/feedstock_root/build_artifacts/urllib3_1678635778344/work uvicorn==0.23.2 values==2020.12.3 visions @ file:///home/conda/feedstock_root/build_artifacts/visions_1638743854326/work wandb==0.15.12 wcwidth @ file:///home/conda/feedstock_root/build_artifacts/wcwidth_1673864653149/work websockets==11.0.3 Werkzeug==3.0.1 widgetsnbextension @ file:///home/conda/feedstock_root/build_artifacts/widgetsnbextension_1694598693908/work xxhash==3.3.0 yacs==0.1.8 yarl @ file:///home/conda/feedstock_root/build_artifacts/yarl_1685191803031/work zipp @ file:///home/conda/feedstock_root/build_artifacts/zipp_1689374466814/work zope.interface==6.1 zstandard==0.22.0

anhnh2002 commented 8 months ago

I am still getting this error.

Taimoor0217 commented 6 months ago

Hi everyone, I ran into the same issue. It seems something might be wrong with the list_inference_endpoints method in the huggingface_hub package.

In any case, I was able to resolve the issue by explicitly passing a model_id when initializing the chat model.

In the example below, I explicitly pass the model_id for llama-3-8b-instruct:

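import os

from langchain_community.chat_models.huggingface import ChatHuggingFace
from langchain_community.llms import HuggingFaceTextGenInference

# LLM wrapper for the TGI server; the URL and token are read from the environment.
llm = HuggingFaceTextGenInference(
    inference_server_url=os.environ["LLAMA_INSTRUCT_URL"],
    max_new_tokens=512,
    top_k=50,
    temperature=0.1,
    repetition_penalty=1.03,
    server_kwargs={
        "headers": {
            "Authorization": f"Bearer {os.environ['HF_TOKEN']}",
            "Content-Type": "application/json",
        }
    },
)

# Pass model_id explicitly instead of relying on automatic resolution.
chat_model = ChatHuggingFace(llm=llm, model_id="meta-llama/Meta-Llama-3-8B-Instruct")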
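
If you are unsure which model_id to pass, one option is to ask the server itself. The sketch below assumes the server is a Text Generation Inference instance exposing its `/info` route with a `model_id` field; the helper names `model_id_from_info` and `fetch_model_id` are made up for this example, and the sample payload values are illustrative only.

```python
import json
from urllib.request import urlopen


def model_id_from_info(info: dict) -> str:
    """Extract the model id from a parsed /info payload."""
    model_id = info.get("model_id")
    if not model_id:
        raise ValueError("The /info response did not include a model_id")
    return model_id


def fetch_model_id(base_url: str, timeout: float = 10.0) -> str:
    """Query <base_url>/info and return the model id the server reports."""
    with urlopen(base_url.rstrip("/") + "/info", timeout=timeout) as response:
        return model_id_from_info(json.load(response))


# Parsing a sample payload; no running server is needed for this part.
sample = {"model_id": "meta-llama/Meta-Llama-3-8B-Instruct", "max_total_tokens": 4096}
print(model_id_from_info(sample))
```

With a running server, `fetch_model_id("http://0.0.0.0:8080/")` would return whatever the server was launched with. For a locally deployed model this may be a filesystem path, not a Hub repository name, in which case you may still need to choose the model_id (or supply a tokenizer) by hand.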