prhbrt opened this issue 4 days ago
Because this is a breaking issue for us, I monkey-patched it as follows: I injected a patched def get_hf_task_embedding_for_model from litellm/llms/huggingface_restapi.py into the container, wrapping the response parsing in a try-except. Since model_info_dict.get defaults to None anyway, falling back to None seemed safe.
I didn't fully understand the logging system: my debug output didn't show up in the logs even after setting LITELLM_LOG=DEBUG, although other debug info was shown, so the variable did come through. I noticed there are different loggers (e.g. verbose_logger), but couldn't find anything about them in the docs, so I simply wrote my debug output to an /oops.log file instead. This verified that api_base really is the root URL of the text-embeddings-inference instance, i.e. http://localhost:8006/, and that the response from it is indeed empty.
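For reference, this is roughly how the empty response can be reproduced outside of litellm (using requests instead of litellm's internal HTTPHandler; the port is specific to my setup):

import requests

# GET on the bare api_base of the text-embeddings-inference container
# (http://localhost:8006/ here) returns an empty body, so .json() fails.
response = requests.get("http://localhost:8006/", timeout=5)
print(repr(response.text))  # ''

try:
    response.json()
except ValueError as exc:
    print("empty / non-JSON body:", exc)

The patched function: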
def get_hf_task_embedding_for_model(
    model: str, task_type: Optional[str], api_base: str
) -> Optional[str]:
    if task_type is not None:
        if task_type in get_args(hf_tasks_embeddings):
            return task_type
        else:
            raise Exception(
                "Invalid task_type={}. Expected one of={}".format(
                    task_type, hf_tasks_embeddings
                )
            )
    http_client = HTTPHandler(concurrent_limit=1)
    model_info = http_client.get(url=api_base)
    # Debug output that made the problem visible:
    # with open('/oops.log', 'a') as f:
    #     f.write('api_base: ')
    #     f.write(api_base)
    #     f.write('\ntext: \n')
    #     f.write(model_info.text)
    #     f.write('\n\n')
    try:
        model_info_dict = model_info.json()
        pipeline_tag: Optional[str] = model_info_dict.get("pipeline_tag", None)
    except Exception:
        # api_base is a text-embeddings-inference instance that returns an empty
        # body, so .json() raises; fall back to None, matching the .get() default.
        pipeline_tag = None
    return pipeline_tag
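For anyone reproducing this, the call that ends up in get_hf_task_embedding_for_model looks roughly like this on my side (model name and port are placeholders, not my exact config):

import litellm

# api_base points at the text-embeddings-inference container.
response = litellm.embedding(
    model="huggingface/BAAI/bge-large-en-v1.5",
    input=["hello world"],
    api_base="http://localhost:8006",
)
print(response.data[0]["embedding"][:5])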
I apply the patch with this Dockerfile:
# syntax=docker/dockerfile:1
FROM ghcr.io/berriai/litellm:main-v1.52.0
COPY ./huggingface_restapi.py /usr/local/lib/python3.11/site-packages/litellm/llms/huggingface_restapi.py
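Building this image (e.g. docker build -t litellm-patched ., the tag name is just an example) and pointing the deployment at the resulting image is enough to pick up the patched module.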
Now everything works, but of course this is just a workaround. I couldn't figure out what pipeline_tag is supposed to refer to, so I couldn't do better than to effectively ditch it (sorry). Let me know if I can do any more debugging for you.
What happened?
Using config
which connects to:
Then, trying the litellm wrapper around the embedding API:
Results in this line crashing:
It looks like litellm connects to the base URL (api_base), which indeed gives an empty response. Should this be {api_base}/info instead? However, text-embeddings-inference should be supported according to this doc.
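For what it's worth, text-embeddings-inference also exposes a GET /info endpoint that returns JSON describing the served model, so maybe that is what this lookup should hit for TEI backends; I haven't checked whether its payload contains anything equivalent to pipeline_tag. A quick check against my instance would be:

import requests

# TEI's /info endpoint (as opposed to the empty response from the bare base URL).
print(requests.get("http://localhost:8006/info", timeout=5).json())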
Relevant log output
(repeated)