This PR implements a search based algorithm for suggesting the best set of indexes for a given workload based on the AutoAdmin paper from Microsoft.
The main workflow and classes are given below:
The helper classes and methods for our algorithm are defined in brain/index_selection_util[.h/.c]. This includes the IndexSelectionKnobs (tunable knobs of the algorithm), HypotheticalIndexObject (a hypothetical index class), IndexConfiguration (the set of hypothetical indexes). Multiple helper methods and overloaded operators for IndexConfiguration are defined here
IndexSelection is the main wrapper around our tool which takes in a workload (set of queries) and the tunable parameters of the algorithm. It returns the best index configuration through the main external API GetBestIndexes
The methods for running the search based algorithm are present in brain/index_selection[.h/.c]. The three modules which are used for each search iteration are:
GenerateCandidateIndexes generates the admissible indexes (indexes that benefit at least one query in the workload) from the provided queries and prune the useless ones
Enumerate that gets the top k indexes for the workload which would reduce the cost of executing them through a combination of exhaustive search (ExhaustiveEnumeration) and greedy search (GreedySearch)
GenerateMultiColumnIndexes generates multi-column indexes from the single column indexes by doing a cross product and adds it into the result which will be used for the next iteration.
The WhatIfIndex in brain/what_if_index[.h/.c] returns best physical plan tree and the cost associated with a hypothetical index configuration. This is possible with the set of appropriate changes made to optimizer/optimizer[.h/.c]
The IndexSelectionContext in brain/index_selection_context[.h/.c] memoizes the cost of the query for a given configuration to reduce the number of calls sent to WhatIfAPI
Integrating this module to work with the self-driving infrastructure of Peloton (Brain) is a work in progress
This PR implements a search based algorithm for suggesting the best set of indexes for a given workload based on the AutoAdmin paper from Microsoft.
The main workflow and classes are given below:
The helper classes and methods for our algorithm are defined in brain/index_selection_util[.h/.c]. This includes the
IndexSelectionKnobs
(tunable knobs of the algorithm),HypotheticalIndexObject
(a hypothetical index class),IndexConfiguration
(the set of hypothetical indexes). Multiple helper methods and overloaded operators forIndexConfiguration
are defined hereIndexSelection
is the main wrapper around our tool which takes in a workload (set of queries) and the tunable parameters of the algorithm. It returns the best index configuration through the main external APIGetBestIndexes
The methods for running the search based algorithm are present in brain/index_selection[.h/.c]. The three modules which are used for each search iteration are:
GenerateCandidateIndexes
generates the admissible indexes (indexes that benefit at least one query in the workload) from the provided queries and prune the useless onesEnumerate
that gets the top k indexes for the workload which would reduce the cost of executing them through a combination of exhaustive search (ExhaustiveEnumeration
) and greedy search (GreedySearch
)GenerateMultiColumnIndexes
generates multi-column indexes from the single column indexes by doing a cross product and adds it into the result which will be used for the next iteration.The
WhatIfIndex
in brain/what_if_index[.h/.c] returns best physical plan tree and the cost associated with a hypothetical index configuration. This is possible with the set of appropriate changes made to optimizer/optimizer[.h/.c]The
IndexSelectionContext
in brain/index_selection_context[.h/.c] memoizes the cost of the query for a given configuration to reduce the number of calls sent to WhatIfAPIIntegrating this module to work with the self-driving infrastructure of Peloton (Brain) is a work in progress