vectara / vectara-docs

Documentation for Vectara's GenAI Platform
https://docs.vectara.com
Apache License 2.0

Add lexical/lambda default configuration for docs search #192

Open eskibars opened 11 months ago

eskibars commented 11 months ago

Idea

Technical documentation tends to contain many terms that may never have appeared in the neural net's training data. So a non-zero value for lambda is not just important here: it typically needs to be higher than average for the best effect. I recommend we start with a value of 0.1 for now.

cc @pwoznic -- we should also improve the documentation around lambda. Right now it's bundled into hybrid, and it's not clear from the docs how to actually use it without tab-switching to the playground.
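
To make the role of lambda concrete, here is a minimal sketch. It assumes lambda linearly interpolates between a neural score and a lexical score, with 0 meaning purely neural; that blending formula, the function names, the page names, and all the scores below are illustrative assumptions, not the platform's actual implementation.

```python
from typing import List, Tuple


def hybrid_score(neural: float, lexical: float, lam: float) -> float:
    """Blend two scores; lam=0 is purely neural, lam=1 is purely lexical (assumed)."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must be between 0 and 1")
    return (1.0 - lam) * neural + lam * lexical


def rank(candidates: List[Tuple[str, float, float]], lam: float) -> List[Tuple[str, float, float]]:
    """Sort (name, neural, lexical) tuples by blended score, best first."""
    return sorted(candidates, key=lambda c: hybrid_score(c[1], c[2], lam), reverse=True)


# Made-up scores for a rare-term query: the page that contains the exact
# term has a perfect lexical score but a slightly lower neural score.
candidates = [
    ("general-overview", 0.80, 0.00),
    ("file-upload-filetypes", 0.72, 1.00),
]

top_neural_only = rank(candidates, 0.0)[0][0]
top_with_lexical = rank(candidates, 0.1)[0][0]
print(top_neural_only, top_with_lexical)
```

With these made-up numbers, a lambda of 0.1 is enough to move the page containing the exact term above the page the neural score alone prefers.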

cjcenizal commented 11 months ago

When we experiment with various lambda values, we can try searching for “textless”, “custom dimensions”, and “query” to see how the quality of the results changes.

eskibars commented 11 months ago

@cjcenizal yes, we can test with these, but there are more extreme examples with even rarer terms. For example, searching for lxml or epub should find https://docs.vectara.com/docs/api-reference/indexing-apis/file-upload/file-upload-filetypes, 272725718 should find the MMR reranker, and AdminService should find https://docs.vectara.com/docs/api-reference/admin-apis/admin.
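
A rough sketch of how these checks could be turned into a repeatable lambda sweep. The `search` callable, its `(query, lam)` signature, and `fake_search` are hypothetical placeholders for however the docs search is actually queried. The 272725718 case is left out because no target URL is given for the MMR reranker page here.

```python
from typing import Callable, Dict, List

# Query -> page that should appear in the results (taken from the comment above).
FILETYPES_URL = "https://docs.vectara.com/docs/api-reference/indexing-apis/file-upload/file-upload-filetypes"
EXPECTED: Dict[str, str] = {
    "lxml": FILETYPES_URL,
    "epub": FILETYPES_URL,
    "AdminService": "https://docs.vectara.com/docs/api-reference/admin-apis/admin",
}

# Hypothetical search interface: takes a query and a lambda, returns ranked URLs.
SearchFn = Callable[[str, float], List[str]]


def hit_rate(search: SearchFn, lam: float, k: int = 3) -> float:
    """Fraction of test queries whose expected URL is in the top k results."""
    hits = sum(1 for query, url in EXPECTED.items() if url in search(query, lam)[:k])
    return hits / len(EXPECTED)


def sweep(search: SearchFn, lambdas: List[float], k: int = 3) -> Dict[float, float]:
    """Hit rate for each candidate lambda value."""
    return {lam: hit_rate(search, lam, k) for lam in lambdas}


def fake_search(query: str, lam: float) -> List[str]:
    """Stand-in for the real search: finds the rare-term page only when lam > 0."""
    if lam > 0 and query in EXPECTED:
        return [EXPECTED[query]]
    return ["https://docs.vectara.com"]


results = sweep(fake_search, [0.0, 0.025, 0.1])
print(results)
```

Swapping `fake_search` for a real client would let us compare candidate values such as 0.1 against the same fixed set of queries.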