apache / pinot

Apache Pinot - A realtime distributed OLAP datastore
https://pinot.apache.org/
Apache License 2.0

Overview for "text search" topic #12427

Open hpvd opened 4 months ago

hpvd commented 4 months ago

At this time, it's a little hard to build a complete picture of the "text search" topic in Pinot with

even after having checked the docs

and having gone through many issues and PRs...

=> Would be great to keep and maintain a good overview on this "text search" topic by e.g.

One can find several ones when searching, e.g. for

There are really many issues/pull requests and some are

A great full-text search is of course complex, but imho pretty rewarding, and it could be a huge step for Pinot. Having a good overview could be a valuable first step...

hpvd commented 4 months ago

just a twist in POV:

Why not have, in a stunning OLAP system like Pinot,

like every good Ecommerce store has for its product search??

with

  1. fast results
  2. well ranked
  3. find exact matches
  4. even finding
     a. incomplete words
     b. other spellings of words (mistakes and grammar)
     c. other words for the same meaning (similarity)
  5. working easily for different languages
  6. don't limit search exclusively to one column (ecommerce: search in category name, article name, article description with applied weight factors to each column)
  7. allowing the number of results to be set (if the first search returns too few results, run a second one with looser quality criteria)
  8. re-sorting the results afterwards (based on values in other columns)
  9. slice and dice afterwards (narrow the filter/faceting, based on values in other columns)
  10. ...

    see example of a search config setup of an ecommerce backend: https://github.com/apache/pinot/issues/7218#issuecomment-1943661027
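
To make items 4, 6 and 7 a bit more concrete, here is a minimal sketch in plain Python. It is not Pinot code and not the config from the linked comment: the column names, the weights, the 0.8 similarity threshold and the helper names (`token_score`, `score_row`, `search`) are all assumptions made up for illustration. It scores each row over several weighted columns, tries exact matching first, and only falls back to looser (fuzzy) matching when the strict pass returns too few results.

```python
from difflib import SequenceMatcher

# Hypothetical column weights (assumption, not taken from Pinot or the linked config).
WEIGHTS = {"category": 1.0, "name": 3.0, "description": 0.5}


def token_score(query_token, field_token, min_ratio):
    """Return 1.0 for an exact match, the similarity ratio if it reaches min_ratio, else 0.0."""
    if query_token == field_token:
        return 1.0
    ratio = SequenceMatcher(None, query_token, field_token).ratio()
    return ratio if ratio >= min_ratio else 0.0


def score_row(query, row, min_ratio):
    """Sum, over the weighted columns, the best match of each query token in that column."""
    total = 0.0
    for column, weight in WEIGHTS.items():
        field_tokens = row.get(column, "").lower().split()
        for q in query.lower().split():
            best = max((token_score(q, f, min_ratio) for f in field_tokens), default=0.0)
            total += weight * best
    return total


def search(query, rows, wanted=2, ratios=(1.0, 0.8)):
    """Try strict matching first; loosen the threshold only if too few rows match."""
    hits = []
    for min_ratio in ratios:
        scored = [(score_row(query, r, min_ratio), r) for r in rows]
        hits = [(s, r) for s, r in scored if s > 0]
        if len(hits) >= wanted:
            break
    hits.sort(key=lambda pair: pair[0], reverse=True)  # best-ranked rows first
    return [r for _, r in hits[:wanted]]


# Made-up sample data, for illustration only.
rows = [
    {"id": 1, "category": "shoes", "name": "running shoes", "description": "light trainers"},
    {"id": 2, "category": "shoes", "name": "hiking boots", "description": "sturdy boots"},
    {"id": 3, "category": "bags", "name": "shoe bag", "description": "bag for shoes"},
]

print([r["id"] for r in search("shoes", rows, wanted=2)])
print([r["id"] for r in search("runing", rows, wanted=1)])
```

The first call is satisfied by the strict pass alone; the misspelled second query finds nothing exactly, so the looser pass kicks in. Re-sorting and faceting (items 8 and 9) would then just be ordinary sorting and filtering of the returned rows on other columns.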

hpvd commented 4 months ago

maybe these named items of a great ecommerce search