deepset-ai / hayhooks

Deploy Haystack pipelines behind a REST API.
https://haystack.deepset.ai
Apache License 2.0

Multiplexer value cannot be left empty in API call #18

Open IronD7 opened 1 month ago

IronD7 commented 1 month ago

Describe the bug

I've serialized a pipeline that includes a Multiplexer component. When I try to run the pipeline via hayhooks, I get an error message saying that the multiplexer value is missing. Minimal pipeline to reproduce the error:

from haystack import Pipeline
from haystack.components.others import Multiplexer
from haystack.testing.sample_components.add_value import AddFixedValue

pipe = Pipeline()
pipe.add_component("add", AddFixedValue(add=11))
pipe.add_component("multiplexer", Multiplexer(int))
# Feed the output of "add" into the multiplexer's "value" input.
pipe.connect("add", "multiplexer.value")

# Serialize the pipeline to YAML so that hayhooks can deploy it.
with open("./test_multiplexer.yaml", "w") as f:
    pipe.dump(f)

Running the pipeline in Python works well:

pipe.run({"add": {"value": 11}})

> {'multiplexer': {'value': 22}}

Now when calling hayhooks like this:

curl -X 'POST' 'http://localhost:1416/test_multiplexer' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
  "add": {
    "value": 11
  },
  "multiplexer": {}
}'

I get the error message below, which says that the multiplexer value is required. Setting the value in the request does not work either, because the multiplexer would then receive multiple inputs and raise a ValueError.

Error message

{"detail":[{"type":"missing","loc":["body","multiplexer","value"],"msg":"Field required","input":{}}]}
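To see at a glance which inputs the API considers missing, the error body can be parsed with the standard library. This is a minimal sketch that assumes the response keeps the shape shown above; `missing_fields` is a hypothetical helper, not part of hayhooks or FastAPI.

```python
import json

# Error body copied from the hayhooks response above.
response_text = (
    '{"detail":[{"type":"missing","loc":["body","multiplexer","value"],'
    '"msg":"Field required","input":{}}]}'
)


def missing_fields(body: str) -> list:
    """Return dotted paths of the fields reported as missing (hypothetical helper)."""
    detail = json.loads(body).get("detail", [])
    return [
        ".".join(str(part) for part in err["loc"][1:])  # drop the leading "body"
        for err in detail
        if err.get("type") == "missing"
    ]


print(missing_fields(response_text))  # ['multiplexer.value']
```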

Expected behavior

The multiplexer value should not have to be set in the request, because it is supplied by the connected "add" component.
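The expected behavior can be sketched as a filter over the pipeline's input sockets: any socket that already receives data through a connection is left out of the request schema. The dictionaries below are illustrative data built from the names in this issue, and `request_inputs` is a hypothetical helper. This is not how hayhooks builds its request model.

```python
# Input sockets per component. Only names that appear in this issue are listed;
# this is illustrative data, not read from a real Pipeline object.
input_sockets = {
    "add": ["value"],
    "multiplexer": ["value"],
}

# Receiver sockets already fed by pipe.connect("add", "multiplexer.value").
connected_receivers = {"multiplexer.value"}


def request_inputs(sockets, connected):
    """Keep only the sockets that no upstream component feeds (hypothetical helper)."""
    result = {}
    for component, names in sockets.items():
        free = [name for name in names if f"{component}.{name}" not in connected]
        if free:  # components without free sockets drop out of the request body
            result[component] = free
    return result


print(request_inputs(input_sockets, connected_receivers))  # {'add': ['value']}
```

With this filter, the request body for the minimal pipeline would only need the "add" entry.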

To Reproduce

See above.

zabeelbashir-oi commented 1 month ago

I have the same issue, in my case with DocumentJoiner. The FastAPI Pydantic request model should not expect an input for DocumentJoiner at all.

My FastAPI Request Body:

{
  "bm25_retriever": {
    "query": "string",
    "filters": {},
    "top_k": 0
  },
  "document_joiner": {
    "documents": [
      {
        "id": "string",
        "content": "string",
        "dataframe": {},
        "blob": {
          "data": "string",
          "meta": {},
          "mime_type": "string"
        },
        "meta": {},
        "score": 0,
        "embedding": [
          0
        ],
        "sparse_embedding": {
          "indices": [
            0
          ],
          "values": [
            0
          ]
        }
      }
    ]
  },
  "embedding_retriever": {
    "filters": {},
    "top_k": 0
  },
  "ranker": {
    "query": "string",
    "top_k": 0,
    "scale_score": true,
    "calibration_factor": 0,
    "score_threshold": 0
  },
  "text_embedder": {
    "text": "string"
  }
}

My pipeline code for reference:

from haystack import Pipeline
from haystack.components.embedders import OpenAITextEmbedder
from haystack.utils import Secret
from haystack_integrations.components.connectors.langfuse import LangfuseConnector

# embedding_retriever, bm25_retriever, document_joiner and ranker are defined earlier (not shown).
hybrid_retrieval = Pipeline()
hybrid_retrieval.add_component(
    "text_embedder",
    OpenAITextEmbedder(
        api_key=Secret.from_token("sk-proj-XXX"),
        model="text-embedding-3-small",
    ),
)
hybrid_retrieval.add_component("tracer", LangfuseConnector("Search Pipeline"))
hybrid_retrieval.add_component("embedding_retriever", embedding_retriever)
hybrid_retrieval.add_component("bm25_retriever", bm25_retriever)
hybrid_retrieval.add_component("document_joiner", document_joiner)
hybrid_retrieval.add_component("ranker", ranker)

hybrid_retrieval.connect("text_embedder.embedding", "embedding_retriever.query_embedding")
hybrid_retrieval.connect("bm25_retriever", "document_joiner")
hybrid_retrieval.connect("embedding_retriever", "document_joiner")
hybrid_retrieval.connect("document_joiner", "ranker")

Pipeline YAML

components:
  bm25_retriever:
    init_parameters:
      document_store:
        init_parameters:
          custom_mapping: null
          embedding_similarity_function: cosine
          hosts: XXXX
          http_auth: &id001 !!python/tuple
          - XXXX
          - XXXX
          index: nomia_matching_index
        type: haystack_integrations.document_stores.elasticsearch.document_store.ElasticsearchDocumentStore
      filters: {}
      fuzziness: AUTO
      scale_score: false
      top_k: 10
    type: haystack_integrations.components.retrievers.elasticsearch.bm25_retriever.ElasticsearchBM25Retriever
  document_joiner:
    init_parameters:
      join_mode: concatenate
      sort_by_score: true
      top_k: null
      weights: null
    type: haystack.components.joiners.document_joiner.DocumentJoiner
  embedding_retriever:
    init_parameters:
      document_store:
        init_parameters:
          custom_mapping: null
          embedding_similarity_function: cosine
          hosts: XXXX
          http_auth: *id001
          index: nomia_matching_index
        type: haystack_integrations.document_stores.elasticsearch.document_store.ElasticsearchDocumentStore
      filters: {}
      num_candidates: null
      top_k: 10
    type: haystack_integrations.components.retrievers.elasticsearch.embedding_retriever.ElasticsearchEmbeddingRetriever
  ranker:
    init_parameters:
      calibration_factor: 1.0
      device: null
      document_prefix: ''
      embedding_separator: '

        '
      meta_fields_to_embed: []
      model: cross-encoder/ms-marco-MiniLM-L-6-v2
      model_kwargs:
        device_map: mps
      query_prefix: ''
      scale_score: true
      score_threshold: null
      token:
        env_vars:
        - HF_API_TOKEN
        strict: false
        type: env_var
      top_k: 10
    type: haystack.components.rankers.transformers_similarity.TransformersSimilarityRanker
  text_embedder:
    init_parameters:
      api_base_url: null
      api_key:
        env_vars:
        - OPENAI_API_KEY
        strict: true
        type: env_var
      dimensions: null
      model: text-embedding-3-small
      organization: null
      prefix: ''
      suffix: ''
    type: haystack.components.embedders.openai_text_embedder.OpenAITextEmbedder
connections:
- receiver: embedding_retriever.query_embedding
  sender: text_embedder.embedding
- receiver: document_joiner.documents
  sender: embedding_retriever.documents
- receiver: document_joiner.documents
  sender: bm25_retriever.documents
- receiver: ranker.documents
  sender: document_joiner.documents
max_loops_allowed: 100
metadata: {}
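The connections section of the YAML above already shows which sockets are fed inside the pipeline. As a rough sketch, grouping the senders by receiver makes it visible that `document_joiner.documents` has two upstream senders, so a caller should not need to supply it. The list is copied from the YAML by hand, and `senders_by_receiver` is a hypothetical helper.

```python
from collections import defaultdict

# Connections copied from the pipeline YAML above.
connections = [
    {"receiver": "embedding_retriever.query_embedding", "sender": "text_embedder.embedding"},
    {"receiver": "document_joiner.documents", "sender": "embedding_retriever.documents"},
    {"receiver": "document_joiner.documents", "sender": "bm25_retriever.documents"},
    {"receiver": "ranker.documents", "sender": "document_joiner.documents"},
]


def senders_by_receiver(conns):
    """Map each receiver socket to the list of sockets that feed it (hypothetical helper)."""
    grouped = defaultdict(list)
    for conn in conns:
        grouped[conn["receiver"]].append(conn["sender"])
    return dict(grouped)


fed = senders_by_receiver(connections)
for receiver, senders in fed.items():
    print(f"{receiver} <- {', '.join(senders)}")
```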