Open IronD7 opened 1 month ago
I have the same issue, in my case with the DocumentJoiner. The FastAPI Pydantic model should not expect any document_joiner input at all.
My FastAPI request body:
{
  "bm25_retriever": {
    "query": "string",
    "filters": {},
    "top_k": 0
  },
  "document_joiner": {
    "documents": [
      {
        "id": "string",
        "content": "string",
        "dataframe": {},
        "blob": {
          "data": "string",
          "meta": {},
          "mime_type": "string"
        },
        "meta": {},
        "score": 0,
        "embedding": [
          0
        ],
        "sparse_embedding": {
          "indices": [
            0
          ],
          "values": [
            0
          ]
        }
      }
    ]
  },
  "embedding_retriever": {
    "filters": {},
    "top_k": 0
  },
  "ranker": {
    "query": "string",
    "top_k": 0,
    "scale_score": true,
    "calibration_factor": 0,
    "score_threshold": 0
  },
  "text_embedder": {
    "text": "string"
  }
}
My pipeline code for reference:
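from haystack import Pipeline
from haystack.components.embedders import OpenAITextEmbedder
from haystack.utils import Secret
from haystack_integrations.components.connectors.langfuse import LangfuseConnector

# embedding_retriever, bm25_retriever, document_joiner and ranker are
# created elsewhere; their settings are shown in the pipeline YAML.
hybrid_retrieval = Pipeline()
hybrid_retrieval.add_component(
    "text_embedder",
    OpenAITextEmbedder(
        api_key=Secret.from_token(
            "sk-proj-XXX"
        ),
        model="text-embedding-3-small",
    ),
)
hybrid_retrieval.add_component("tracer", LangfuseConnector("Search Pipeline"))
hybrid_retrieval.add_component("embedding_retriever", embedding_retriever)
hybrid_retrieval.add_component("bm25_retriever", bm25_retriever)
hybrid_retrieval.add_component("document_joiner", document_joiner)
hybrid_retrieval.add_component("ranker", ranker)

# Wire the query embedding into the embedding retriever, merge both
# retrievers' results in the joiner, then rerank the merged list.
hybrid_retrieval.connect(
    "text_embedder.embedding", "embedding_retriever.query_embedding"
)
hybrid_retrieval.connect("bm25_retriever", "document_joiner")
hybrid_retrieval.connect("embedding_retriever", "document_joiner")
hybrid_retrieval.connect("document_joiner", "ranker")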
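For comparison, a direct run in Python only needs the query text for the components whose inputs are not wired to another component. The following is a minimal sketch under that assumption; build_run_inputs is a hypothetical helper name used for illustration, not part of Haystack.

```python
def build_run_inputs(query: str) -> dict:
    """Return the inputs a direct run of the pipeline above would need.

    document_joiner is left out on purpose: its documents arrive through
    the connections from the two retrievers. embedding_retriever is left
    out because its query_embedding comes from text_embedder.
    """
    return {
        "text_embedder": {"text": query},
        "bm25_retriever": {"query": query},
        "ranker": {"query": query},
    }


inputs = build_run_inputs("example query")

# With the pipeline object from above in scope, the dictionary would be
# passed to the pipeline like this:
# result = hybrid_retrieval.run(inputs)
```

A request body for the served pipeline would ideally have the same shape, with no document_joiner entry.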
Pipeline YAML:
components:
  bm25_retriever:
    init_parameters:
      document_store:
        init_parameters:
          custom_mapping: null
          embedding_similarity_function: cosine
          hosts: XXXX
          http_auth: &id001 !!python/tuple
          - XXXX
          - XXXX
          index: nomia_matching_index
        type: haystack_integrations.document_stores.elasticsearch.document_store.ElasticsearchDocumentStore
      filters: {}
      fuzziness: AUTO
      scale_score: false
      top_k: 10
    type: haystack_integrations.components.retrievers.elasticsearch.bm25_retriever.ElasticsearchBM25Retriever
  document_joiner:
    init_parameters:
      join_mode: concatenate
      sort_by_score: true
      top_k: null
      weights: null
    type: haystack.components.joiners.document_joiner.DocumentJoiner
  embedding_retriever:
    init_parameters:
      document_store:
        init_parameters:
          custom_mapping: null
          embedding_similarity_function: cosine
          hosts: XXXX
          http_auth: *id001
          index: nomia_matching_index
        type: haystack_integrations.document_stores.elasticsearch.document_store.ElasticsearchDocumentStore
      filters: {}
      num_candidates: null
      top_k: 10
    type: haystack_integrations.components.retrievers.elasticsearch.embedding_retriever.ElasticsearchEmbeddingRetriever
  ranker:
    init_parameters:
      calibration_factor: 1.0
      device: null
      document_prefix: ''
      embedding_separator: "\n"
      meta_fields_to_embed: []
      model: cross-encoder/ms-marco-MiniLM-L-6-v2
      model_kwargs:
        device_map: mps
      query_prefix: ''
      scale_score: true
      score_threshold: null
      token:
        env_vars:
        - HF_API_TOKEN
        strict: false
        type: env_var
      top_k: 10
    type: haystack.components.rankers.transformers_similarity.TransformersSimilarityRanker
  text_embedder:
    init_parameters:
      api_base_url: null
      api_key:
        env_vars:
        - OPENAI_API_KEY
        strict: true
        type: env_var
      dimensions: null
      model: text-embedding-3-small
      organization: null
      prefix: ''
      suffix: ''
    type: haystack.components.embedders.openai_text_embedder.OpenAITextEmbedder
connections:
- receiver: embedding_retriever.query_embedding
  sender: text_embedder.embedding
- receiver: document_joiner.documents
  sender: embedding_retriever.documents
- receiver: document_joiner.documents
  sender: bm25_retriever.documents
- receiver: ranker.documents
  sender: document_joiner.documents
max_loops_allowed: 100
metadata: {}
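In the request body above, connected inputs such as embedding_retriever.query_embedding and ranker.documents are already absent, while document_joiner.documents is still listed. One way to derive the fields a request model would need is sketched below, under the assumption that an input fed by a connection never has to come from the caller. The function name unconnected_inputs and the component_inputs table are illustrative and not part of hayhooks or Haystack. The socket names are taken from the request body and the connections list in this comment.

```python
# Connections copied from the pipeline YAML above.
connections = [
    {"sender": "text_embedder.embedding", "receiver": "embedding_retriever.query_embedding"},
    {"sender": "embedding_retriever.documents", "receiver": "document_joiner.documents"},
    {"sender": "bm25_retriever.documents", "receiver": "document_joiner.documents"},
    {"sender": "document_joiner.documents", "receiver": "ranker.documents"},
]

# Input sockets per component: the request body fields plus the sockets
# that only show up as connection receivers.
component_inputs = {
    "bm25_retriever": ["query", "filters", "top_k"],
    "document_joiner": ["documents"],
    "embedding_retriever": ["query_embedding", "filters", "top_k"],
    "ranker": ["query", "documents", "top_k", "scale_score",
               "calibration_factor", "score_threshold"],
    "text_embedder": ["text"],
}


def unconnected_inputs(component_inputs, connections):
    """Keep only the input sockets that no connection delivers to."""
    connected = {tuple(c["receiver"].split(".", 1)) for c in connections}
    result = {}
    for component, sockets in component_inputs.items():
        free = [s for s in sockets if (component, s) not in connected]
        if free:  # components with no free sockets drop out entirely
            result[component] = free
    return result


expected_fields = unconnected_inputs(component_inputs, connections)
```

Under this assumption, expected_fields has no document_joiner key, which matches the request body I would expect.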
Describe the bug
I've serialized a pipeline that includes a Multiplexer partway through the pipeline. When I try to run the pipeline via hayhooks, I get an error message saying that the multiplexer value is missing. Minimal pipeline to reproduce the error:
Running the pipeline in Python works well:
Now when calling hayhooks like this:
I get the error message below, saying that the multiplexer value is required. Setting the value for the multiplexer in the request does not work either, because the multiplexer would then receive multiple inputs and raise a ValueError.
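The conflict can be illustrated with a simplified stand-in for a component that forwards exactly one input. This is a sketch of the general idea only. It is not Haystack's Multiplexer implementation, and the function name multiplexer_run and the error text are assumptions made for illustration.

```python
from typing import Any, List


def multiplexer_run(value: List[Any]) -> dict:
    """Forward a single input; reject anything other than exactly one value."""
    if len(value) != 1:
        raise ValueError(
            f"Expected exactly one input, but {len(value)} were received."
        )
    return {"value": value[0]}


# Only the connected upstream component delivers a value: accepted.
ok = multiplexer_run(["from upstream component"])

# The request body sets the value as well, so two inputs arrive: rejected.
try:
    multiplexer_run(["from upstream component", "from request body"])
    error = None
except ValueError as err:
    error = str(err)
```

This is why the value cannot simply be added to the request as a workaround: the connected component already supplies it.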
Error message
Expected behavior
The multiplexer value should not need to be set initially.
To Reproduce
See above.