vespa-engine / vespa

AI + Data, online. https://vespa.ai
Apache License 2.0

Vespa as autocomplete: "Slow" responses for single character queries #25333

Open tkaessmann opened 1 year ago

tkaessmann commented 1 year ago

Is your feature request related to a problem? Please describe. We're using Vespa as the autosuggest engine in one of Germany's largest e-commerce online shops. It performs very well in most cases, but every user of an autosuggest starts formulating their query with a single character ("s", "a", "c", ...). On queries with only one character the response times are often > 30 ms (indexing: summary | attribute, attribute: fast-search, and rank: filter are set). That is OK, but it could be better. Our old implementation was based on a Pruning Radix Trie and was faster in this case because of early termination. Our query is simple: select foo from bar where query contains ({prefix:true}"s")

Describe the solution you'd like Maybe it's an option to add a trie as an attribute data structure; in an autocomplete context it might perform better. Or do you have other ideas to fix this problem?
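To illustrate why a trie helps here: prefix lookup can stop as soon as enough completions are collected, instead of visiting every posting that shares the prefix. A minimal sketch (illustrative only, not Vespa code; class and method names are invented):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Minimal trie with early-terminating prefix completion.
public class PrefixTrie {
    private static final class Node {
        final TreeMap<Character, Node> children = new TreeMap<>(); // sorted for lexicographic output
        boolean isWord;
    }

    private final Node root = new Node();

    public void insert(String word) {
        Node n = root;
        for (char c : word.toCharArray()) {
            n = n.children.computeIfAbsent(c, k -> new Node());
        }
        n.isWord = true;
    }

    /** Return up to maxHits completions of the given prefix, in lexicographic order. */
    public List<String> complete(String prefix, int maxHits) {
        Node n = root;
        for (char c : prefix.toCharArray()) {
            n = n.children.get(c);
            if (n == null) return List.of(); // prefix not present at all
        }
        List<String> out = new ArrayList<>();
        collect(n, new StringBuilder(prefix), maxHits, out);
        return out;
    }

    private void collect(Node n, StringBuilder sb, int maxHits, List<String> out) {
        if (out.size() >= maxHits) return; // early termination: stop descending once we have enough
        if (n.isWord) out.add(sb.toString());
        for (var e : n.children.entrySet()) {
            if (out.size() >= maxHits) return;
            sb.append(e.getKey());
            collect(e.getValue(), sb, maxHits, out);
            sb.setLength(sb.length() - 1); // backtrack
        }
    }
}
```

For a single-character prefix like "s" over millions of terms, this visits only as many nodes as needed to produce maxHits completions, which is the early-termination property the old Pruning Radix Trie implementation exploited.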

Describe alternatives you've considered I know that Lucene uses FSTs (or weighted FSTs) as its autocomplete data structure. Maybe that would be the better option, because from my understanding they are smaller than tries. And I saw that there is already an FSA available in Vespa.

Additional context If the implementation of a new data structure is possible in Java / Kotlin, we could also collaborate on this!

bjormel commented 1 year ago

Thanks for the kind words, we really appreciate getting this kind of feedback.

Can you test https://docs.vespa.ai/en/graceful-degradation.html, with the timeout set to a lower value?

Since there are many matching documents, I assume you are OK with not considering them all.

tkaessmann commented 1 year ago

That sounds like it's mainly relevant in a cloud setup, correct? We only have a single-node setup.

bjormel commented 1 year ago

Then you could give https://docs.vespa.ai/en/graceful-degradation.html#match-phase-degradation a try (see https://docs.vespa.ai/en/reference/schema-reference.html#match-phase), which limits the maximum number of hits a content node will attempt to produce in the match phase.
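For reference, match-phase degradation is configured per rank profile in the schema, as documented on the pages above. A minimal sketch, assuming a numeric fast-search attribute named popularity (the attribute name is a placeholder for whatever signal drives your ranking):

```
rank-profile default {
    match-phase {
        attribute: popularity
        max-hits: 1000
    }
}
```

When more documents match than max-hits allows, the content node only fully evaluates the documents scoring highest on that attribute.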

tkaessmann commented 1 year ago

The solution mentioned there now leads to timings between 25-28 ms in our setup (after playing around a bit with the config parameters). Thanks! And it's already deployed in production. With two characters the timings drop to 5-8 ms. It's getting better :)

tkaessmann commented 1 year ago

I also had a chat with @jobergum and he suggested the solution below. With that solution, the timings for single-character queries drop to ~9 ms. So this topic is now done from my side ✅

bjormel commented 1 year ago

That is great, thanks!

Happy holidays

jobergum commented 1 year ago

Keeping this as a feature request for implementing trie data structures for faster prefix search. The current prefix implementation finds all matching word postings through dictionary traversal.

For short inputs, a trick is to use range queries with hitLimit on a fast-search attribute. This changes the semantics of the prefix query to only match against the documents in the top 1K by that attribute, which is usually what one wants for short prefix lengths.

select foo from bar where (query contains ({prefix:true}"s")) and ({hitLimit:1000,descending:true}range(most_dominant_field_in_the_ranking_expression,0,Infinity))
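For the hitLimit trick above to work, the attribute used in the range term must be a single-value numeric attribute with fast-search enabled. A minimal schema sketch, reusing the placeholder field name from the query:

```
field most_dominant_field_in_the_ranking_expression type int {
    indexing: attribute | summary
    attribute: fast-search
}
```

The descending hitLimit range then selects roughly the 1000 documents with the highest values of this attribute, and the prefix term is only evaluated against those.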