Hi @dmille, thanks for reaching out. I ran the experiment, and yes, the way you are defining the nested field in the pipeline won't work. The pipeline does support nested fields, though. To use them, please try creating the pipeline like this:
PUT /_ingest/pipeline/neural-test-index-nested
{
  "description": "Neural Search Pipeline for message content",
  "processors": [
    {
      "text_embedding": {
        "model_id": "SXXx8YUBR2ZWhVQIkghB",
        "field_map": {
          "message": {
            "text": "message_embedding"
          }
        }
      }
    }
  ]
}
The thing is, right now the TextEmbedding processor doesn't treat the "." character as a nested-field operator. I ran some tests on my side, and creating the processor as shown above works and handles the nested fields.
I think dotted paths are something the plugin could support. I will create a feature request for it.
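Until dotted paths are supported, the nested form above can be generated from dotted keys on the client side. The following is a minimal Python sketch of that conversion, not part of the plugin: the helper name `nest_field_map` and the dotted input `"message.text"` are hypothetical, chosen only to illustrate the reply above.

```python
import json


def nest_field_map(flat_map):
    """Expand dotted keys such as "message.text" into nested dicts.

    Hypothetical client-side helper; the plugin itself does not provide it.
    """
    nested = {}
    for dotted_key, target in flat_map.items():
        *parents, leaf = dotted_key.split(".")
        node = nested
        for part in parents:
            node = node.setdefault(part, {})
            if not isinstance(node, dict):
                # e.g. "a" already maps to an embedding field and "a.b" follows
                raise ValueError(f"{dotted_key!r} conflicts with an existing mapping")
        if isinstance(node.get(leaf), dict):
            # e.g. "a.b" was expanded first and a plain "a" follows
            raise ValueError(f"{dotted_key!r} conflicts with an existing mapping")
        node[leaf] = target
    return nested


# Assumed dotted form, which the processor currently does not interpret as nested.
flat = {"message.text": "message_embedding"}

pipeline_body = {
    "description": "Neural Search Pipeline for message content",
    "processors": [
        {
            "text_embedding": {
                "model_id": "SXXx8YUBR2ZWhVQIkghB",
                "field_map": nest_field_map(flat),
            }
        }
    ],
}

# Print the request body to send when creating the ingest pipeline.
print(json.dumps(pipeline_body, indent=2))
```

Non-dotted keys pass through unchanged, so the same helper can be applied to a field_map that mixes flat and nested source fields.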
@navneet1v Thanks for the prompt reply! This fixed my problem.
I am closing this issue and have created a new GitHub issue for tracking: https://github.com/opensearch-project/neural-search/issues/110
What is the bug?
When defining a field_map containing nested fields, the pipeline fails to compute embeddings.
How can one reproduce the bug?
With the following configuration, using non-nested field types, embeddings are computed:
With the following configuration using a nested source field, embeddings are not computed:
What is the expected behavior?
The neural ingestion pipeline should be able to handle nested fields.
What is your host/environment?
docker image: opensearchproject/opensearch:2.5.0
Do you have any additional context?
The models referenced above were uploaded with the following configuration: