dominictarr / mynosql

MIT License

(artificially) intelligent index selection #2

Open dominictarr opened 9 years ago

dominictarr commented 9 years ago

Given a set of indexes and a query, there may be multiple ways to execute the query. It would be hard to tell a priori which query plan is best, but you could apply a K-armed bandit: treat each candidate plan as an arm and use its measured query time as the feedback.

If you hashed each query together with its parameters, nearly every hash would be unique. But if you removed the parameters and kept only the operators and field names, you could hash that instead, and then track the performance of each strategy for queries with that shape.
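A rough sketch of the shape-hashing idea, in Python for brevity. The query format here (nested dicts with `$`-prefixed operators) and the names `query_shape` and `shape_hash` are assumptions for illustration, not mynosql's actual query syntax or API.

```python
import hashlib
import json


def query_shape(query):
    """Strip parameter values, keeping only field names and operators."""
    if isinstance(query, dict):
        return {key: query_shape(value) for key, value in query.items()}
    if isinstance(query, list):
        shapes = [query_shape(item) for item in query]
        # A list of plain values (e.g. the argument of "$in") is one parameter,
        # so lists of different lengths still map to the same shape.
        if all(shape == "?" for shape in shapes):
            return "?"
        return shapes
    return "?"  # any scalar parameter becomes a placeholder


def shape_hash(query):
    """Hash the canonical JSON form of the shape; key order does not matter."""
    canonical = json.dumps(query_shape(query), sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

With this, `{"age": {"$gt": 30}, "name": "bob"}` and `{"name": "alice", "age": {"$gt": 5}}` hash to the same value, while swapping `$gt` for `$lt` gives a different one.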

This way, given a set of indexes, you could make good decisions about which one to use... without the user actually having to configure anything.
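One way the per-shape bandit could look is an epsilon-greedy selector, sketched below in Python. Epsilon-greedy is just one simple bandit strategy picked for illustration; the class name `PlanSelector`, the plan labels, and the use of mean latency as the score are all assumptions, not anything mynosql implements.

```python
import random
from collections import defaultdict


class PlanSelector:
    """Epsilon-greedy choice of a query plan, tracked separately per query shape."""

    def __init__(self, epsilon=0.1, rng=None):
        self.epsilon = epsilon
        self.rng = rng if rng is not None else random.Random()
        # stats[shape][plan] = [number of runs, total seconds]
        self.stats = defaultdict(dict)

    def choose(self, shape, plans):
        arms = self.stats[shape]
        # Try every plan at least once before trusting the averages.
        untried = [plan for plan in plans if plan not in arms]
        if untried:
            return untried[0]
        # Explore occasionally, so a plan that got faster (e.g. after a new
        # index is built) has a chance to be noticed.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(plans)
        # Otherwise exploit: pick the plan with the lowest mean latency.
        return min(plans, key=lambda plan: arms[plan][1] / arms[plan][0])

    def record(self, shape, plan, seconds):
        entry = self.stats[shape].setdefault(plan, [0, 0.0])
        entry[0] += 1
        entry[1] += seconds
```

The caller would time each query and pass the result to `record`, keyed by whatever identifies the query's shape (for example a hash of its operators and field names).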

dominictarr commented 9 years ago

The thing here is to trade off disk space (and write bandwidth) to improve query time. Not all indexes are created equal: some won't help much, some will help a lot. You need to index fields that a) are used in queries, and b) take on many distinct values. Fields that take on only a small number of values are not so helpful.

dominictarr commented 9 years ago

Also, fields that are uncommon might not be worth indexing. The best fields to index would be paths that are nearly always present and usually have different values.
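To make "nearly always there, and usually different" concrete, here is a small Python sketch that scores each path by presence times distinctness. The scoring formula and the names `walk_paths` and `score_paths` are assumptions for illustration. It also keeps every distinct value in memory, so a real implementation would probably sample documents or use a cardinality estimator instead.

```python
def walk_paths(doc, prefix=()):
    """Yield (dotted path, value) for every leaf in a nested dict."""
    for key, value in doc.items():
        path = prefix + (key,)
        if isinstance(value, dict):
            yield from walk_paths(value, path)
        else:
            yield ".".join(path), value


def score_paths(docs):
    """Return {path: (presence, distinctness, score)} over a collection of documents."""
    seen = {}  # path -> [documents containing it, set of distinct values]
    total = 0
    for doc in docs:
        total += 1
        for path, value in walk_paths(doc):
            entry = seen.setdefault(path, [0, set()])
            entry[0] += 1
            entry[1].add(repr(value))  # repr() so unhashable values can be counted
    scores = {}
    for path, (count, values) in seen.items():
        presence = count / total            # how often the path is there
        distinctness = len(values) / count  # how often its value differs
        scores[path] = (presence, distinctness, presence * distinctness)
    return scores
```

A unique `id` field present in every document scores 1.0, a `type` field with two values across four documents scores 0.5, and a rare nested field scores low however varied it is. In practice this score would still need to be combined with how often each path actually appears in queries.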