opensearch-project / neural-search

Plugin that adds dense neural retrieval into the OpenSearch ecosystem
Apache License 2.0
57 stars 58 forks

Hybrid search and collapse compatibility #665

Open qmauret opened 3 months ago

qmauret commented 3 months ago

Describe the bug

Using the collapse feature in a hybrid search does not collapse documents.

Related component

Search

To Reproduce

I’m trying to combine hybrid search (semantic + keyword) with the collapse feature to deduplicate products that share the same visual.

I have tried collapse on a basic search, and it works as expected.

With hybrid search, the behaviour is different: products from the same visual are placed in the inner_hits field but are not collapsed. They are still present at the root level of the search results, which is not the expected behaviour.

Is anyone aware of a compatibility problem between hybrid search and collapse?

Expected behavior

I expect the same behaviour as when performing a collapse on a non-hybrid search.
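
To make the expectation concrete, here is a minimal Python sketch of what collapsing should do to a ranked hit list, applied on the client side. It only illustrates the expected result; it is not how OpenSearch implements collapse. The helper names `get_path` and `collapse_hits` and the sample hits are hypothetical. The sketch assumes the hits are already sorted by descending `_score` and that the collapse field holds a single hashable value per document.

```python
from typing import Any, Dict, List, Optional


def get_path(source: Dict[str, Any], dotted: str) -> Optional[Any]:
    """Read a dotted field path such as 'visual.id_visual' from a _source dict."""
    current: Any = source
    for part in dotted.split("."):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current


def collapse_hits(hits: List[Dict[str, Any]], field: str) -> List[Dict[str, Any]]:
    """Keep only the first (best-ranked) hit for each distinct value of `field`.

    Assumption: hits with a missing field all share the key None and are
    therefore grouped together in this sketch.
    """
    seen = set()
    collapsed = []
    for hit in hits:  # assumed to be sorted by descending _score
        key = get_path(hit.get("_source", {}), field)
        if key in seen:
            continue
        seen.add(key)
        collapsed.append(hit)
    return collapsed


# Hypothetical sample: two products share visual 1, one product has visual 2.
hits = [
    {"_id": "a", "_score": 0.9, "_source": {"name": "Ski A", "visual": {"id_visual": 1}}},
    {"_id": "b", "_score": 0.8, "_source": {"name": "Ski B", "visual": {"id_visual": 1}}},
    {"_id": "c", "_score": 0.7, "_source": {"name": "Ski C", "visual": {"id_visual": 2}}},
]

print([h["_id"] for h in collapse_hits(hits, "visual.id_visual")])  # ['a', 'c']
```

With the hybrid query, the root-level hits currently look like the uncollapsed `hits` list above (both `a` and `b` appear), whereas the basic search returns the equivalent of the collapsed list.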

Additional Details

Additional context

Basic search with collapse (working as expected):

GET /product_1/_search
{
  "_source": {
    "includes": ["_id", "name", "category_name", "visual.id_visual"]
  },
  "query": {
    "match": {
      "name": {
        "query": "Ski"
      }
    }
  },
  "collapse": {
    "field": "visual.id_visual",
    "inner_hits": {
      "size": 1,
      "name": "from_same_visual",
      "sort": [
        {
          "_score": "desc"
        }
      ]
    }
  }
}

Hybrid search with collapse (not working):

GET /product_1/_search?search_pipeline=search_pipeline
{
  "_source": {
    "includes": ["_id", "name", "category_name", "visual.id_visual"]
  },
  "query": {
    "hybrid": {
      "queries": [
        {
          "neural": {
            "fullname_v": {
              "query_text": "Ski",
              "model_id": "xxx",
              "k": 200
            }
          }
        },
        {
          "multi_match": {
            "query": "Ski",
            "type": "most_fields",
            "fields": ["category.name^2", "name^4", "tags.name^3"],
            "fuzziness": "AUTO",
            "prefix_length": 0,
            "max_expansions": 10
          }
        }
      ]
    }
  },
  "collapse": {
    "field": "visual.id_visual",
    "inner_hits": {
      "size": 1,
      "name": "from_same_visual",
      "sort": [
        {
          "_score": "desc"
        }
      ]
    }
  }
}

peternied commented 3 months ago

[Triage - attendees 1 2 3 4 5 6 7 8] @opensearch-project/admin Could you transfer this to the neural-search repository? It seems related to that plugin's functionality.

martin-gaievski commented 2 months ago

@qmauret the collapse functionality is not supported by the hybrid query. The team will look into the feasibility of adding it.