jupyterlab / jupyter-ai

A generative AI extension for JupyterLab
https://jupyter-ai.readthedocs.io/
BSD 3-Clause "New" or "Revised" License

Some providers (e.g. HuggingFace) not working in chat nor in streaming completer #883

Closed: krassowski closed this issue 2 months ago

krassowski commented 2 months ago

Description

Traceback (most recent call last):
  File "/jupyter_ai/chat_handlers/base.py", line 170, in on_message
    await self.process_message(message)
  File "/jupyter_ai/chat_handlers/default.py", line 104, in process_message
    async for chunk in self.llm_chain.astream(
  File "/langchain_core/runnables/base.py", line 4698, in astream
    async for item in self.bound.astream(
  File "/langchain_core/runnables/base.py", line 4698, in astream
    async for item in self.bound.astream(
  File "/langchain_core/runnables/base.py", line 2900, in astream
    async for chunk in self.atransform(input_aiter(), config, **kwargs):
  File "/langchain_core/runnables/base.py", line 2883, in atransform
    async for chunk in self._atransform_stream_with_config(
  File "/langchain_core/runnables/base.py", line 1984, in _atransform_stream_with_config
    chunk = cast(Output, await py_anext(iterator))
  File "/langchain_core/runnables/base.py", line 2853, in _atransform
    async for output in final_pipeline:
  File "/langchain_core/runnables/base.py", line 4734, in atransform
    async for item in self.bound.atransform(
  File "/langchain_core/runnables/base.py", line 2883, in atransform
    async for chunk in self._atransform_stream_with_config(
  File "/langchain_core/runnables/base.py", line 1984, in _atransform_stream_with_config
    chunk = cast(Output, await py_anext(iterator))
  File "/langchain_core/runnables/base.py", line 2853, in _atransform
    async for output in final_pipeline:
  File "/langchain_core/runnables/base.py", line 1333, in atransform
    async for output in self.astream(final, config, **kwargs):
  File "/langchain_core/language_models/llms.py", line 492, in astream
    raise e
  File "/langchain_core/language_models/llms.py", line 475, in astream
    async for chunk in self._astream(
  File "/langchain_community/llms/huggingface_endpoint.py", line 345, in _astream
    async for response in await self.async_client.text_generation(
AttributeError: 'NoneType' object has no attribute 'text_generation'

Reproduce

Set any model from the Hugging Face Hub, then try to use chat or inline completion with streaming enabled.

Expected behavior

Hugging Face Hub models work.

Context

main at 2019571. The issue with streaming in the inline completer is pre-existing, but chat broke only recently, since:

This might be an upstream issue, or maybe we should have a way to detect when a model does not support streaming.
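One way to "detect when a model does not support streaming" is to try the streaming path and fall back to a single-shot call when it fails. A minimal sketch, using a toy stand-in class (the names `NoAsyncClientLLM`, `apredict`, and `stream_with_fallback` are illustrative, not jupyter-ai's API) that reproduces the failure mode from the traceback above:

```python
import asyncio

class NoAsyncClientLLM:
    """Toy stand-in for a provider whose async client was never initialized,
    mirroring the AttributeError in the traceback above."""
    async_client = None  # upstream would set this during validation

    async def apredict(self, prompt: str) -> str:
        # The non-streaming path still works.
        return f"echo: {prompt}"

    async def astream(self, prompt: str):
        # Fails like huggingface_endpoint.py does: the client is None.
        async for chunk in self.async_client.text_generation(prompt):
            yield chunk

async def stream_with_fallback(llm, prompt: str) -> str:
    """Try streaming first; fall back to a one-shot response when the model
    (or its client wiring) does not support streaming."""
    try:
        chunks = []
        async for chunk in llm.astream(prompt):
            chunks.append(chunk)
        return "".join(chunks)
    except (AttributeError, NotImplementedError):
        return await llm.apredict(prompt)

print(asyncio.run(stream_with_fallback(NoAsyncClientLLM(), "hi")))  # echo: hi
```

A real fix would catch the failure once and remember it per provider rather than retrying on every message, but the fallback shape would be the same.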

3coins commented 2 months ago

The problem might be that we are overriding the HuggingFace provider, which is missing updates from the official implementation in LangChain. For example, we are not declaring the async_client in the root_validator here, which causes this error. https://github.com/jupyterlab/jupyter-ai/blob/main/packages/jupyter-ai-magics/jupyter_ai_magics/providers.py#L614
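A minimal sketch of this failure mode, with plain classes standing in for the pydantic root_validator (class and attribute names here are illustrative, except `async_client` and `text_generation`, which come from the traceback):

```python
class UpstreamHuggingFaceEndpoint:
    """Stand-in for LangChain's HuggingFaceEndpoint: its validator sets
    both the sync and the async inference clients."""
    def __init__(self):
        self.client = object()        # stands in for the sync client
        self.async_client = object()  # stands in for the async client

class OverriddenProvider(UpstreamHuggingFaceEndpoint):
    """Stand-in for the jupyter-ai override: it re-implements validation
    but predates the async_client field, so the default None survives."""
    def __init__(self):
        self.client = object()
        self.async_client = None  # never initialized -> streaming dies later

provider = OverriddenProvider()
# The streaming path then does: await provider.async_client.text_generation(...)
try:
    provider.async_client.text_generation
except AttributeError as err:
    print(err)  # 'NoneType' object has no attribute 'text_generation'
```

Keeping the override in sync with every upstream validator change is exactly the maintenance burden that "use the provider as-is" would remove.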

We should remove any custom code from here, and use the provider as-is from LangChain.

pedrogutobjj commented 2 months ago

Excited for the new version, when will we get it? Ollama completion models help me a lot!

krassowski commented 2 months ago

We should remove any custom code from here, and use the provider as-is from LangChain.

This would remove the image generation capability as implemented by @JasonWeill in https://github.com/jupyterlab/jupyter-ai/pull/66, right?

JasonWeill commented 2 months ago

To be fair, the image generator had limited support with HuggingFace models. If there's an alternate way to use the HF models to make images that is more widely supported, I'd love to adopt that.

krassowski commented 2 months ago

I think the clean implementation in the LangChain ecosystem would be one layer higher, using the tools/function-calling strategy. It is possible to force a model to call a certain tool.

There are existing tools that generate images:

Presumably making a HuggingFace tool would not be too difficult.

The limitation is that only some models support function calling (AzureChatOpenAI, ChatAnthropic, ChatCohere, ChatFireworks, ChatGroq, ChatMistralAI, ChatOpenAI, ChatVertexAI) and ChatHuggingFace is not one of them (at least as of v0.1). The full list is in: https://python.langchain.com/v0.1/docs/integrations/chat/
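To sketch the "force the model to call a certain tool" idea: real LangChain chat models express this via bind_tools with a tool_choice argument, but the mock model and the `generate_image` tool below are hypothetical stand-ins (a real tool would call an HF inference endpoint):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ToolCall:
    name: str
    args: dict

def generate_image(prompt: str) -> str:
    """Hypothetical image tool; a real one would hit an HF inference API."""
    return f"[image for: {prompt}]"

TOOLS = {"generate_image": generate_image}

class MockChatModel:
    """Stand-in for a tool-calling chat model; forced_tool mimics a
    tool_choice constraint: the model must emit a call to that tool."""
    def invoke(self, prompt: str, forced_tool: Optional[str] = None) -> ToolCall:
        tool = forced_tool or "generate_image"
        return ToolCall(name=tool, args={"prompt": prompt})

call = MockChatModel().invoke("a sailboat at dusk", forced_tool="generate_image")
print(TOOLS[call.name](**call.args))  # [image for: a sailboat at dusk]
```

The dispatch step (look up the tool by the name in the model's tool call, invoke it with the returned args) is the part jupyter-ai would own; the model side is whatever chat provider supports tool calling.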

krassowski commented 2 months ago

An alternative is to extract the current HuggingFace implementation to a new provider which would be marked as providing no chat and no inline completion models, but only an image generation model. In either case changes would be breaking.
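A hypothetical sketch of such a split-out provider. The shape is loosely modeled on the jupyter-ai-magics provider classes, but the capability flags here are assumptions for illustration, not the real API:

```python
class BaseProvider:
    """Minimal stand-in for the jupyter-ai-magics provider base class."""
    id: str = ""
    name: str = ""
    models: list = []

class HuggingFaceImageProvider(BaseProvider):
    """Hypothetical provider exposing only image generation, so the chat
    and inline-completion UIs never offer (and break on) these models."""
    id = "huggingface_hub_images"
    name = "Hugging Face Hub (image generation)"
    models = ["*"]                      # any hub model id, as today
    provides_chat_models = False        # assumed capability flags
    provides_completion_models = False
    provides_image_models = True

p = HuggingFaceImageProvider()
print(p.provides_chat_models, p.provides_image_models)  # False True
```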

krassowski commented 2 months ago

A tool integration would be very nice, though: imagine asking for something in the chat and also getting an illustration as an image.

krassowski commented 2 months ago

https://github.com/jupyterlab/jupyter-ai/blob/main/packages/jupyter-ai-magics/jupyter_ai_magics/providers.py#L614

We should remove any custom code from here, and use the provider as-is from LangChain.

Actually, on taking a second look, this function has not done anything useful since https://github.com/jupyterlab/jupyter-ai/pull/784, because it no longer overrides the tasks list:

https://github.com/jupyterlab/jupyter-ai/blob/8b6c82d8dd24953ab289ccf05f25d0c5f5b63475/packages/jupyter-ai-magics/jupyter_ai_magics/providers.py#L630-L657

So it can simply be deleted :shrug: