langflow-ai / langflow

Langflow is a low-code app builder for RAG and multi-agent AI applications. It’s Python-based and agnostic to any model, API, or database.
http://www.langflow.org
MIT License
33.01k stars 4.03k forks

langflow and vllm/openai api #3452

Closed: michelgirault closed this issue 2 months ago

michelgirault commented 2 months ago

Bug Description

I'm very new to Langflow and I was interested in using vLLM with Langflow via the OpenAI-compatible API. I tried a few LLaVA models without success: Langflow does not recognize the name of my model (I use llava-hf/llava-1.5-7b-hf), I don't get any output, and this is the log from vLLM:

INFO 08-20 14:53:45 metrics.py:406] Avg prompt throughput: 0.0 tokens/s, Avg generation throughput: 0.0 tokens/s, Running: 0 reqs, Swapped: 0 reqs, Pending: 0 reqs, GPU KV cache usage: 0.0%, CPU KV cache usage: 0.0%.

INFO 08-20 14:53:50 logger.py:36] Received request chat-a64b39d6cbd94a6ea227f98349e1f17d: prompt: '<|begin_of_text|><|start_header_id|>user<|end_header_id|>\n\nAnswer the user as if you were a pirate.\n\nUser: \n\nAnswer:<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n', params: SamplingParams(n=1, best_of=1, presence_penalty=0.0, frequency_penalty=0.0, repetition_penalty=1.0, temperature=0.1, top_p=1.0, top_k=-1, min_p=0.0, seed=1, use_beam_search=False, length_penalty=1.0, early_stopping=False, stop=[], stop_token_ids=[], include_stop_str_in_output=False, ignore_eos=False, max_tokens=8167, min_tokens=0, logprobs=None, prompt_logprobs=None, skip_special_tokens=True, spaces_between_special_tokens=True, truncate_prompt_tokens=None), prompt_token_ids: [128000, 128006, 882, 128007, 271, 16533, 279, 1217, 439, 422, 499, 1051, 264, 55066, 382, 1502, 25, 4815, 16533, 25, 128009, 128006, 78191, 128007, 271], lora_request: None, prompt_adapter_request: None. INFO 08-20 14:53:50 async_llm_engine.py:174] Added request chat-a64b39d6cbd94a6ea227f98349e1f17d.

I tried with llama-cpp-python without any problem; I was wondering if there is a known issue between vLLM and Langflow.

Reproduction

  1. Start the vLLM server: vllm serve xtuner/llava-llama-3-8b-v1_1-transformers --api-key none --served-model-name gpt-4o
  2. Run Langflow and use the OpenAI node with the OpenAI API base URL pointing at the vLLM server and the model set to gpt-4o (a quick check of what the server exposes is sketched below).
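
A quick way to confirm what the vLLM server from step 1 actually exposes is to query its OpenAI-compatible endpoints directly. This is only a sanity-check sketch; it assumes the server listens on the default http://localhost:8000 and that --api-key none means the literal key "none" is expected:

import requests

BASE_URL = "http://localhost:8000/v1"       # assumed default vLLM port
HEADERS = {"Authorization": "Bearer none"}  # matches --api-key none

# With --served-model-name gpt-4o, the server should advertise "gpt-4o"
# here rather than the underlying Hugging Face repo id.
models = requests.get(f"{BASE_URL}/models", headers=HEADERS, timeout=10).json()
print([m["id"] for m in models.get("data", [])])

# Minimal chat completion against the served model name.
payload = {
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Answer the user as if you were a pirate."}],
    "max_tokens": 64,
}
resp = requests.post(f"{BASE_URL}/chat/completions", headers=HEADERS, json=payload, timeout=60)
print(resp.json()["choices"][0]["message"]["content"])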

Expected behavior

I would expect the flow to build successfully and to see the answer in the Playground.

Who can help?

No response

Operating System

ubuntu 22

Langflow Version

v1.0.15

Python Version

3.10

Screenshot

No response

Flow File

No response

dosubot[bot] commented 2 months ago

@michelgirault

Hello there! I'm here to help you with any issues, questions, or contributions you have. Let's work together to solve this bug.

Langflow requires specific configurations to recognize and work with custom model names. The model name you are using, llava-hf/llava-1.5-7b-hf, is not recognized by default. You need to ensure that your custom model names are included in the MODEL_NAMES list or configure Langflow to recognize them.

Here are the steps to configure Langflow to recognize your custom model names:

  1. Add Custom Model Names: Ensure that your custom model names are added to the MODEL_NAMES list. You can modify the openai_constants.py file to include your custom model names:

    MODEL_NAMES = [
        "gpt-4o",
        "gpt-4-turbo",
        "gpt-4-turbo-preview",
        "gpt-3.5-turbo",
        "gpt-3.5-turbo-0125",
        "llama3-8b-8192",
        "llama3-70b-8192",
        "mixtral-8x7b-32768",
        "gemma-7b-it",
        "llava-hf/llava-1.5-7b-hf",  # Add your custom model name here
    ]
  2. Specify Custom Components Path: If you have custom components, specify the path to the directory containing them using the --components-path option or the LANGFLOW_COMPONENTS_PATH environment variable:

    langflow run --components-path /path/to/custom/components

    Or set the environment variable:

    export LANGFLOW_COMPONENTS_PATH=/path/to/custom/components
  3. Check API Connection: Ensure that Langflow can connect to the vllm server. If there is an issue connecting to the API, you might see an error like:

    raise ValueError("Could not connect to AzureOpenAI API.") from e

By following these steps, you should be able to configure Langflow to recognize and work with your custom model names, such as llava-hf/llava-1.5-7b-hf [1][2][3].

To continue talking to Dosu, mention @dosu.

michelgirault commented 2 months ago

Is there not a simple way to update the model list dynamically?

dosubot[bot] commented 2 months ago

Yes, Langflow provides a way to dynamically update the model list without modifying the source code directly. You can use the --components-path command-line option or the LANGFLOW_COMPONENTS_PATH environment variable to specify the path to a directory containing custom components, which can include additional model names.

Here's how you can do it:

  1. Create a Custom Component: Define a custom component that includes your model names. For example, create a Python file with the following content:

    from langflow import CustomComponent

    class CustomModelComponent(CustomComponent):
        display_name: str = "Custom Model"
        description: str = "Custom LLM model."

        def build_config(self):
            return {
                "model_id": {
                    "display_name": "Model Id",
                    "options": [
                        "llava-hf/llava-1.5-7b-hf",  # Add your custom model name here
                        # Add more custom model names if needed
                    ],
                },
                "credentials_profile_name": {"display_name": "Credentials Profile Name"},
            }
  2. Specify the Custom Components Path: Place the custom component file in a directory, and then specify the path to this directory using the --components-path option or the LANGFLOW_COMPONENTS_PATH environment variable.

    Using the command-line option:

    langflow run --components-path /path/to/custom/components

    Or set the environment variable:

    export LANGFLOW_COMPONENTS_PATH=/path/to/custom/components

By following these steps, you can dynamically update the model list in Langflow without modifying the source code directly [1][2].
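
To make the sketch above usable end to end, the component also needs a build method that instantiates a chat model pointed at the vLLM server. The following is a rough sketch only: it assumes the legacy CustomComponent API shown above and langchain_openai's ChatOpenAI client, and the VLLMOpenAIComponent name, field layout, and default base_url are illustrative rather than anything defined by Langflow.

from typing import Optional

from langchain_openai import ChatOpenAI  # any OpenAI-compatible server works via base_url
from langflow import CustomComponent     # same import as in the sketch above


class VLLMOpenAIComponent(CustomComponent):
    display_name: str = "vLLM (OpenAI-compatible)"
    description: str = "Chat model served by vLLM through its OpenAI-compatible API."

    def build_config(self):
        return {
            "model_name": {
                "display_name": "Model Name",
                # Must match the value passed to vLLM's --served-model-name
                "options": ["gpt-4o", "llava-hf/llava-1.5-7b-hf"],
            },
            "base_url": {"display_name": "Base URL", "value": "http://localhost:8000/v1"},
            "api_key": {"display_name": "API Key", "password": True},
            "temperature": {"display_name": "Temperature", "value": 0.1},
        }

    def build(
        self,
        model_name: str,
        base_url: str = "http://localhost:8000/v1",
        api_key: Optional[str] = "none",
        temperature: float = 0.1,
    ) -> ChatOpenAI:
        # ChatOpenAI talks to any OpenAI-compatible endpoint when base_url points
        # at it; a vLLM server started with --api-key none expects the key "none".
        return ChatOpenAI(
            model=model_name,
            base_url=base_url,
            api_key=api_key,
            temperature=temperature,
        )

Dropping a file like this into the directory passed via --components-path (or LANGFLOW_COMPONENTS_PATH) is what the steps above are meant to pick up.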

To continue talking to Dosu, mention @dosu.

michelgirault commented 2 months ago

I tried without much success. (Screenshot from 2024-08-21 10-10-05 shows where I saved the custom component.) The list appears the same as before.

In my Docker container the app runs like this: "python3 -m langflow run --components-path /app/components". I still don't understand how the model list would update, and the documentation is very limited.

dosubot[bot] commented 2 months ago

To dynamically update the model list in Langflow when running the app in a Docker container, you need to ensure that the --components-path argument points to a directory that can be updated dynamically. This can be achieved by mounting a volume to the Docker container that maps to the /app/components directory.

Here is an example of how you can run the Docker container with a volume mount:

docker run -v /path/to/your/components:/app/components langflowai/langflow:latest python3 -m langflow run --components-path /app/components

In this command, the -v flag mounts /path/to/your/components on your host machine to /app/components inside the container. Any changes you make to the components on the host are therefore reflected inside the Docker container at /app/components, allowing Langflow to pick up an updated model list.

Additionally, if you are using the OllamaModel component, ensure that the base_url is correctly set to point to the Ollama API endpoint. The model field in the component configuration can be dynamically updated by fetching the available models from the Ollama API. This is handled in the update_build_config method of the ChatOllamaComponent class, which updates the model options based on the response from the API.

# Methods of the ChatOllamaComponent class; in the Langflow source, httpx and
# typing.Any are imported at module level for these to work.
def update_build_config(self, build_config: dict, field_value: Any, field_name: str | None = None):
    if field_name == "model":
        base_url_dict = build_config.get("base_url", {})
        base_url_load_from_db = base_url_dict.get("load_from_db", False)
        base_url_value = base_url_dict.get("value")
        if base_url_load_from_db:
            base_url_value = self.variables(base_url_value)
        elif not base_url_value:
            base_url_value = "http://localhost:11434"
        build_config["model"]["options"] = self.get_model(base_url_value + "/api/tags")
    return build_config

def get_model(self, url: str) -> list[str]:
    try:
        with httpx.Client() as client:
            response = client.get(url)
            response.raise_for_status()
            data = response.json()
            model_names = [model["name"] for model in data.get("models", [])]
            return model_names
    except Exception as e:
        raise ValueError("Could not retrieve models. Please, make sure Ollama is running.") from e

This ensures that the model list is dynamically updated based on the available models from the Ollama API [1][2][3].

To continue talking to Dosu, mention @dosu.

michelgirault commented 2 months ago

What about the OpenAI API node? As mentioned before, I'm trying to use the OpenAI API only.

dosubot[bot] commented 2 months ago

To dynamically update the model list for the OpenAI API node in Langflow, you need to modify the MODEL_NAMES list in the OpenAIModelComponent class. This list is used to populate the options for the model_name dropdown input.

Here is the relevant part of the code where MODEL_NAMES is used:

from langflow.base.models.openai_constants import MODEL_NAMES

class OpenAIModelComponent(LCModelComponent):
    display_name = "OpenAI"
    description = "Generates text using OpenAI LLMs."
    icon = "OpenAI"

    inputs = [
        # Other inputs...
        DropdownInput(
            name="model_name", 
            display_name="Model Name", 
            advanced=False, 
            options=MODEL_NAMES, 
            value=MODEL_NAMES[0]
        ),
        # Other inputs...
    ]
    # Other class methods and attributes...

To dynamically update the MODEL_NAMES list, you can modify the MODEL_NAMES variable in the langflow.base.models.openai_constants module. This can be done by fetching the latest model names from the OpenAI API and updating the list accordingly.

Here is an example of how you might achieve this:

  1. Fetch the latest model names from the OpenAI API.
  2. Update the MODEL_NAMES list with the fetched model names.

Example code to fetch and update MODEL_NAMES:

import requests
from langflow.base.models.openai_constants import MODEL_NAMES

def update_model_names(api_key):
    headers = {
        "Authorization": f"Bearer {api_key}"
    }
    response = requests.get("https://api.openai.com/v1/models", headers=headers)
    if response.status_code == 200:
        models = response.json().get("data", [])
        model_names = [model["id"] for model in models]
        MODEL_NAMES.clear()
        MODEL_NAMES.extend(model_names)
    else:
        raise Exception("Failed to fetch model names from OpenAI API")

# Call this function with your OpenAI API key to update the model names
update_model_names("your_openai_api_key")

After running this function, the MODEL_NAMES list will be updated with the latest model names from the OpenAI API, and the dropdown in the OpenAIModelComponent will reflect these changes [1][[2]](https://github.com/logspace-ai/langflow/blob/dev/src/backend/base/langflow/initial_setup/starter_projects/Langflow%20Blog%20Writter.json).

To continue talking to Dosu, mention @dosu.

michelgirault commented 2 months ago

In my case this is not my question. As I mentioned before, I use vLLM as an OpenAI API replacement, so the list of models does not come from OpenAI.

dosubot[bot] commented 2 months ago

To dynamically update the model list in Langflow when using vllm as an OpenAI API replacement, you need to modify the options field in the configuration for model_name. Here is an example of how you can update the model list:

{
    "model": {
        "display_name": "Model name",
        "field_type": "str",
        "advanced": False,
        "required": True,
        "info": "The name of the model to use. For example, `gpt-3.5-turbo`.",
        "options": [
            "vllm-model-1",
            "vllm-model-2",
            "vllm-model-3"
        ],
    },
    "api_key": {
        "display_name": "API key",
        "field_type": "str",
        "advanced": False,
        "required": False,
        "password": True,
    },
    "provider": {
        "display_name": "Provider",
        "info": "The provider of the API key.",
        "options": [
            "OpenAI",
            "Azure",
            "Anthropic",
            "Replicate",
            "Cohere",
            "OpenRouter",
        ],
    },
    "streaming": {
        "display_name": "Streaming",
        "field_type": "bool",
        "advanced": True,
        "required": False,
        "default": True,
    },
    "temperature": {
        "display_name": "Temperature",
        "field_type": "float",
        "advanced": False,
        "required": False,
        "default": 0.7,
    },
    "model_kwargs": {
        "display_name": "Model kwargs",
        "field_type": "dict",
        "advanced": True,
        "required": False,
        "default": {},
    },
    "top_p": {
        "display_name": "Top p",
        "field_type": "float",
        "advanced": True,
        "required": False,
    },
    "top_k": {
        "display_name": "Top k",
        "field_type": "int",
        "advanced": True,
        "required": False,
    },
    "n": {
        "display_name": "N",
        "field_type": "int",
        "advanced": True,
        "required": False,
        "info": "Number of chat completions to generate for each prompt. Note that the API may not return the full n completions if duplicates are generated.",
        "default": 1,
    },
    "max_tokens": {
        "display_name": "Max Tokens",
        "advanced": True,
        "info": "The maximum number of tokens to generate. Set to 0 for unlimited tokens.",
    },
    "max_retries": {
        "display_name": "Max retries",
        "field_type": "int",
        "advanced": True,
        "required": False,
        "default": 6,
    },
    "verbose": {
        "display_name": "Verbose",
        "field_type": "bool",
        "advanced": True,
        "required": False,
        "default": False,
    },
}

In this example, the options array under model has been updated to include vllm-model-1, vllm-model-2, and vllm-model-3, so the dropdown offers these vLLM model names instead of the default OpenAI models.

Additionally, you can refer to the ChatLiteLLMModelComponent class in Langflow, which shows how a model component declares its configurable fields and builds the underlying chat model. Its build_config method is where the model field and the provider options are defined, and its build method constructs the model from those values:

# Excerpt from Langflow's LiteLLM component; the full source also imports what this
# excerpt relies on, e.g. Optional/Dict/Any from typing, the ChatLiteLLM and
# ChatLiteLLMException classes from the LangChain LiteLLM integration, and Langflow's
# LCModelComponent and STREAM_INFO_TEXT.
class ChatLiteLLMModelComponent(LCModelComponent):
    display_name = "LiteLLM"
    description = "`LiteLLM` collection of large language models."
    documentation = "https://python.langchain.com/docs/integrations/chat/litellm"
    field_order = [
        "model",
        "api_key",
        "provider",
        "temperature",
        "model_kwargs",
        "top_p",
        "top_k",
        "n",
        "max_tokens",
        "max_retries",
        "verbose",
        "stream",
        "input_value",
        "system_message",
    ]

    def build_config(self):
        return {
            "model": {
                "display_name": "Model name",
                "field_type": "str",
                "advanced": False,
                "required": True,
                "info": "The name of the model to use. For example, `gpt-3.5-turbo`.",
            },
            "api_key": {
                "display_name": "API key",
                "field_type": "str",
                "advanced": False,
                "required": False,
                "password": True,
            },
            "provider": {
                "display_name": "Provider",
                "info": "The provider of the API key.",
                "options": [
                    "OpenAI",
                    "Azure",
                    "Anthropic",
                    "Replicate",
                    "Cohere",
                    "OpenRouter",
                ],
            },
            "temperature": {
                "display_name": "Temperature",
                "field_type": "float",
                "advanced": False,
                "required": False,
                "default": 0.7,
            },
            "model_kwargs": {
                "display_name": "Model kwargs",
                "field_type": "dict",
                "advanced": True,
                "required": False,
                "default": {},
            },
            "top_p": {
                "display_name": "Top p",
                "field_type": "float",
                "advanced": True,
                "required": False,
            },
            "top_k": {
                "display_name": "Top k",
                "field_type": "int",
                "advanced": True,
                "required": False,
            },
            "n": {
                "display_name": "N",
                "field_type": "int",
                "advanced": True,
                "required": False,
                "info": "Number of chat completions to generate for each prompt. Note that the API may not return the full n completions if duplicates are generated.",
                "default": 1,
            },
            "max_tokens": {
                "display_name": "Max tokens",
                "field_type": "int",
                "advanced": False,
                "required": False,
                "default": 256,
                "info": "The maximum number of tokens to generate for each chat completion.",
            },
            "max_retries": {
                "display_name": "Max retries",
                "field_type": "int",
                "advanced": True,
                "required": False,
                "default": 6,
            },
            "verbose": {
                "display_name": "Verbose",
                "field_type": "bool",
                "advanced": True,
                "required": False,
                "default": False,
            },
            "input_value": {"display_name": "Input"},
            "stream": {
                "display_name": "Stream",
                "info": STREAM_INFO_TEXT,
                "advanced": True,
            },
            "system_message": {
                "display_name": "System Message",
                "info": "System message to pass to the model.",
                "advanced": True,
            },
        }

    def build(
        self,
        input_value: Text,
        model: str,
        provider: str,
        api_key: Optional[str] = None,
        stream: bool = False,
        temperature: Optional[float] = 0.7,
        model_kwargs: Optional[Dict[str, Any]] = {},
        top_p: Optional[float] = None,
        top_k: Optional[int] = None,
        n: int = 1,
        max_tokens: int = 256,
        max_retries: int = 6,
        verbose: bool = False,
        system_message: Optional[str] = None,
    ) -> Text:
        try:
            import litellm  # type: ignore

            litellm.drop_params = True
            litellm.set_verbose = verbose
        except ImportError:
            raise ChatLiteLLMException(
                "Could not import litellm python package. " "Please install it with `pip install litellm`"
            )
        provider_map = {
            "OpenAI": "openai_api_key",
            "Azure": "azure_api_key",
            "Anthropic": "anthropic_api_key",
            "Replicate": "replicate_api_key",
            "Cohere": "cohere_api_key",
            "OpenRouter": "openrouter_api_key",
        }
        # Set the API key based on the provider
        api_keys: dict[str, Optional[str]] = {v: None for v in provider_map.values()}

        if variable_name := provider_map.get(provider):
            api_keys[variable_name] = api_key
        else:
            raise ChatLiteLLMException(
                f"Provider {provider} is not supported. Supported providers are: {', '.join(provider_map.keys())}"
            )
        output = ChatLiteLLM(
            model=model,
            client=None,
            streaming=stream,
            temperature=temperature,
            model_kwargs=model_kwargs if model_kwargs is not None else {},
            top_p=top_p,
            top_k=top_k,
            n=n,
            max_tokens=max_tokens,
            max_retries=max_retries,
            openai_api_key=api_keys["openai_api_key"],
            azure_api_key=api_keys["azure_api_key"],
            anthropic_api_key=api_keys["anthropic_api_key"],
            replicate_api_key=api_keys["replicate_api_key"],
            cohere_api_key=api_keys["cohere_api_key"],
            openrouter_api_key=api_keys["openrouter_api_key"],
        )
        return self.get_chat_result(output, stream, input_value, system_message)

You can adapt this approach to dynamically update the model list for vLLM models by fetching the available models from the vLLM API and updating the options field accordingly [1][2][3][[4]](https://github.com/langflow-ai/langflow/blob/dev/src/backend/base/langflow/initial_setup/starter_projects/Langflow%20Prompt%20Chaining.json)[5].
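
A sketch of that adaptation, modeled on the Ollama update_build_config pattern shown earlier. The class and method names, the default base URL, and the assumption that the vLLM server lists its models at the OpenAI-compatible /v1/models route are illustrative, not code from the Langflow repository:

from typing import Any

import httpx


class VLLMModelOptionsMixin:
    """Refreshes a component's model dropdown from a vLLM OpenAI-compatible server."""

    def update_build_config(self, build_config: dict, field_value: Any, field_name: str | None = None):
        # Mirror the Ollama component: when the model field changes, re-fetch the options.
        if field_name == "model":
            base_url = build_config.get("base_url", {}).get("value") or "http://localhost:8000/v1"
            api_key = build_config.get("api_key", {}).get("value") or "none"
            build_config["model"]["options"] = self.get_vllm_models(base_url, api_key)
        return build_config

    def get_vllm_models(self, base_url: str, api_key: str) -> list[str]:
        # vLLM's OpenAI-compatible server advertises its served model names at /v1/models.
        try:
            with httpx.Client() as client:
                response = client.get(
                    f"{base_url}/models",
                    headers={"Authorization": f"Bearer {api_key}"},
                )
                response.raise_for_status()
                return [model["id"] for model in response.json().get("data", [])]
        except Exception as e:
            raise ValueError("Could not retrieve models. Please make sure the vLLM server is running.") from e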

To continue talking to Dosu, mention @dosu.

dmentx commented 2 months ago

@michelgirault Did you try to build your own custom component for that?

michelgirault commented 2 months ago

Not yet, but I will try. I think it should work like that.

carlosrcoelho commented 2 months ago

@michelgirault

The vLLM OpenAI component is not native to Langflow; it was created by the community. Thank you for sharing your question. The way you could make it work is by creating your own custom component.