Azure-Samples / contoso-chat

This sample demonstrates the full end-to-end process of creating a RAG application with Prompt Flow and AI Studio. It includes GPT-3.5 Turbo LLM application code, evaluations, deployment automation with the AZD CLI, GitHub Actions for evaluation and deployment, and intent mapping for multiple LLM task routing.
MIT License

Prompt Flow Hugging Face API swapping encounters template error for Mixtral 7B #136

Open JimyhZhu opened 2 weeks ago

JimyhZhu commented 2 weeks ago

Hi, I am using the Prompt Flow workshop branches. When I tried to integrate a Hugging Face serverless API into the Prompt Flow, I encountered the issue described below.

I added connections to the pfClient and swapped the model in flow.dag.yaml. When running the flow.dag.yaml file, the error below occurs:

*(screenshot: error output, 2024-06-12 12:30:35)*

First I added the connections in /connection/create-connections.ipynb:

*(screenshot: connection creation, 2024-06-12 12:36:27)*

```python
from promptflow import PFClient
from promptflow._sdk.entities._connection import ServerlessConnection

pf = PFClient()
HF_KEY = ""  # Hugging Face API key (redacted in the original post)

# {name: api_base}
HF_endpoints = {
    "meta_llama3_instruct_8B": "https://api-inference.huggingface.co/models/meta-llama/Meta-Llama-3-8B-Instruct",
    "meta_llama3_instruct_70B": "https://api-inference.huggingface.co/models/meta-llama/Meta-Llama-3-70B-Instruct",
    "meta_llama3_8B": "https://api-inference.huggingface.co/models/meta-llama/Meta-Llama-3-8B",
    "meta_llama3_70B": "https://api-inference.huggingface.co/models/meta-llama/Meta-Llama-3-70B",
    "gpt2": "https://api-inference.huggingface.co/models/openai-community/gpt2",
    "Phi_3_mini_4k_instruct": "https://api-inference.huggingface.co/models/microsoft/Phi-3-mini-4k-instruct",
    "Phi_3_mini_128k_instruct": "https://api-inference.huggingface.co/models/microsoft/Phi-3-mini-128k-instruct",
    "google_gemma": "https://api-inference.huggingface.co/models/google/gemma-1.1-7b-it",
    "Mixtral": "https://api-inference.huggingface.co/models/mistralai/Mixtral-8x7B-Instruct-v0.1",
    "Mixtral7B": "https://api-inference.huggingface.co/models/mistralai/Mistral-7B-v0.1",
    "bge-small": "https://api-inference.huggingface.co/models/BAAI/bge-small-en",
    "bge-large": "https://api-inference.huggingface.co/models/BAAI/bge-large-en-v1.5",
}

for name, end_point in HF_endpoints.items():
    connection = ServerlessConnection(name=name, api_key=HF_KEY, api_base=end_point)
    print(f"Creating connection {connection.name}...")
    result = pf.connections.create_or_update(connection)
    print(result)
```
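As a quick sanity check before creating the connections, one could validate that every entry in the dictionary actually points at the Hugging Face serverless inference host. This helper is not part of the notebook; the function name and checks are my own sketch:

```python
from urllib.parse import urlparse

def looks_like_hf_endpoint(url: str) -> bool:
    """Rough check that a URL points at the HF serverless inference API."""
    parsed = urlparse(url)
    return (
        parsed.scheme == "https"
        and parsed.netloc == "api-inference.huggingface.co"
        and parsed.path.startswith("/models/")
    )

# Example: flag any endpoint that does not match the expected shape.
HF_endpoints = {
    "gpt2": "https://api-inference.huggingface.co/models/openai-community/gpt2",
    "Mixtral7B": "https://api-inference.huggingface.co/models/mistralai/Mistral-7B-v0.1",
}
bad = [name for name, url in HF_endpoints.items() if not looks_like_hf_endpoint(url)]
print(bad)  # → []
```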

Then I run the YAML file below:

```yaml
environment:
  python_requirements_txt: requirements.txt
inputs:
  chat_history:
    type: list
    default: []
    is_chat_input: false
    is_chat_history: true
  question:
    type: string
    default: What can you tell me about your jackets?
    is_chat_input: true
    is_chat_history: false
  customerId:
    type: string
    default: "2"
    is_chat_input: false
    is_chat_history: false
outputs:
  answer:
    type: string
    reference: ${llm_response.output}
    is_chat_output: true
  context:
    type: string
    reference: ${retrieve_documentation.output}
nodes:
- name: question_embedding
  type: python
  source:
    type: package
    tool: promptflow.tools.embedding.embedding
  inputs:
    connection: aoai-connection
    input: ${inputs.question}
    deployment_name: text-embedding-ada-002
  aggregation: false
- name: retrieve_documentation
  type: python
  source:
    type: code
    path: retrieve_documentation.py
  inputs:
    question: ${inputs.question}
    index_name: contoso-products
    embedding: ${question_embedding.output}
    search: contoso-search
- name: customer_lookup
  type: python
  source:
    type: code
    path: customer_lookup.py
  inputs:
    customerId: ${inputs.customerId}
    conn: contoso-cosmos
- name: customer_prompt
  type: prompt
  source:
    type: code
    path: customer_prompt.jinja2
  inputs:
    documentation: ${retrieve_documentation.output}
    customer: ${customer_lookup.output}
    history: ${inputs.chat_history}
- name: llm_response
  type: llm
  source:
    type: code
    path: llm_response.jinja2
  inputs:
    deployment_name: gpt-35-turbo
    prompt_text: ${customer_prompt.output}
    question: ${inputs.question}
  connection: Mixtral7B
  api: chat
```

```
2024-06-12 12:30:48 +0100 74066 execution WARNING [llm_response in line 0 (index starts from 0)] stderr> Exception occurs: UnprocessableEntityError: Error code: 422 - {'error': 'Template error: template not found', 'error_type': 'template_error'}
2024-06-12 12:30:48 +0100 74066 execution WARNING [llm_response in line 0 (index starts from 0)] stderr> UnprocessableEntityError #4, but no Retry-After header, Back off 18 seconds for retry.
```
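One plausible reading of the 422 (my interpretation, not confirmed by the maintainers): the Hugging Face serverless chat API renders the message list with the model's tokenizer chat template, and `mistralai/Mistral-7B-v0.1` is a base model whose tokenizer ships without a `chat_template`, hence "Template error: template not found". The instruct variants (e.g. `Mixtral-8x7B-Instruct-v0.1`) do ship one. If the node were switched to a completion-style call instead of `api: chat`, the chat history would have to be flattened into plain text by hand; a minimal sketch of such a helper (the name and prompt format are assumptions, not part of the repo):

```python
def flatten_chat(messages):
    """Render a chat message list as one plain-text prompt for base models
    that lack a chat template. Format is an arbitrary Role: content layout."""
    parts = [f"{m['role'].capitalize()}: {m['content']}" for m in messages]
    parts.append("Assistant:")  # cue the model to continue as the assistant
    return "\n".join(parts)

prompt = flatten_chat([
    {"role": "user", "content": "What can you tell me about your jackets?"},
])
print(prompt)  # → User: What can you tell me about your jackets?\nAssistant:
```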