The model is an initial implementation for predicting the selectivity of range predicates.
It can be applied to queries like:
SELECT * FROM table WHERE c >= l AND c <= u
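For reference, the quantity the model is meant to predict can be computed exactly when the column values are available. The sketch below is only an illustration of that target value: the helper name range_selectivity is hypothetical and is not part of this PR.

```python
def range_selectivity(values, lower, upper):
    """Fraction of rows whose value c satisfies lower <= c <= upper.

    Hypothetical helper for illustration; not taken from augmentedNN.py.
    """
    if not values:
        return 0.0
    matched = sum(1 for v in values if lower <= v <= upper)
    return matched / len(values)


# A column holding the integers 1..100; the predicate c >= 10 AND c <= 29
# matches 20 of the 100 rows.
column = list(range(1, 101))
selectivity = range_selectivity(column, 10, 29)
```

A learned model approximates this fraction from the bounds l and u alone, without scanning the column.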
The model is implemented in augmentedNN.py, with the C++ wrapper code in augmentedNN.cpp; both use LSTM.py and LSTM.cpp as a reference.
The hyperparameters, especially the number of training epochs, still need to be discussed based on experiments on a real system.
Test cases for the model are also added. They cover a uniformly distributed dataset and a skewed dataset.
There are two classes defined.
class AugmentedNN (in augmentedNN.cpp). This class closely follows class TimeSeriesLSTM.
Fit(): applies backpropagation.
Predict(): returns the predictions for the input.
TrainEpoch(): trains for one epoch.
ValidateEpoch(): runs one epoch of validation.
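To show how these four methods could relate to each other, here is a minimal Python sketch. It is not the AugmentedNN from this PR: it swaps the neural network for a one-variable linear model trained by gradient descent on mean squared error, and it assumes that Fit() performs a single gradient update on one batch. The class name TinyRegressor and the snake_case method names are hypothetical.

```python
class TinyRegressor:
    """Hypothetical stand-in with the same four entry points.

    Assumption: a linear model y = w * x + b replaces the neural network.
    """

    def __init__(self, lr=0.1):
        self.w = 0.0
        self.b = 0.0
        self.lr = lr

    def predict(self, xs):
        """Counterpart of Predict(): one prediction per input."""
        return [self.w * x + self.b for x in xs]

    def fit(self, xs, ys):
        """Counterpart of Fit(): one gradient step on a batch.

        Returns the mean squared error measured before the update.
        """
        n = len(xs)
        errors = [p - y for p, y in zip(self.predict(xs), ys)]
        loss = sum(e * e for e in errors) / n
        grad_w = 2.0 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2.0 * sum(errors) / n
        self.w -= self.lr * grad_w
        self.b -= self.lr * grad_b
        return loss

    def train_epoch(self, batches):
        """Counterpart of TrainEpoch(): fit on every batch once."""
        losses = [self.fit(xs, ys) for xs, ys in batches]
        return sum(losses) / len(losses)

    def validate_epoch(self, batches):
        """Counterpart of ValidateEpoch(): measure loss, no updates."""
        total, count = 0.0, 0
        for xs, ys in batches:
            for p, y in zip(self.predict(xs), ys):
                total += (p - y) ** 2
                count += 1
        return total / count


# Usage: train on y = 0.5 * x and compare the loss before and after.
xs = [0.0, 0.25, 0.5, 0.75, 1.0]
batches = [(xs, [0.5 * x for x in xs])]
model = TinyRegressor(lr=0.1)
loss_before = model.validate_epoch(batches)
for _ in range(200):
    model.train_epoch(batches)
loss_after = model.validate_epoch(batches)
```

The point of the split is that only the training path changes the parameters, so the validation loss can be tracked per epoch when choosing the number of training epochs.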
class TestingAugmentedNNUtil (in testing_forecast_util.cpp)
GetData(): generates data for training and testing. The dataset follows either a uniform or a skewed distribution.
Test(): calls the APIs mentioned above to train and test the model.
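As an illustration of the kind of data GetData() could produce, the following Python sketch builds labelled range queries over a uniform or a skewed column. Everything in it is an assumption made for the example: the function name get_data, the use of values in [0, 1], cubing a uniform sample to obtain skew, and the (lower, upper, selectivity) sample layout. The actual generator in testing_forecast_util.cpp may differ.

```python
import random


def get_data(n_rows, n_queries, skewed, seed=0):
    """Return a list of (lower, upper, selectivity) samples.

    Hypothetical sketch; the distributions are chosen for illustration.
    """
    rng = random.Random(seed)
    if skewed:
        # Cubing a uniform sample concentrates the values near 0.
        column = [rng.random() ** 3 for _ in range(n_rows)]
    else:
        column = [rng.random() for _ in range(n_rows)]

    samples = []
    for _ in range(n_queries):
        lower, upper = sorted((rng.random(), rng.random()))
        matched = sum(1 for v in column if lower <= v <= upper)
        samples.append((lower, upper, matched / n_rows))
    return samples


uniform_samples = get_data(n_rows=2000, n_queries=50, skewed=False)
skewed_samples = get_data(n_rows=2000, n_queries=50, skewed=True)
```

On the uniform column the label stays close to upper - lower, while on the skewed column it depends strongly on where the range sits, which is what makes the skewed case the harder test.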
Also, in testing_forecast_util.cpp, the argument passed to matrix_eig::bottomRows was wrong: it should be the number of rows to take, counted from the bottom of the matrix_eig. I've fixed it; please check that the change is correct.
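The row-count semantics can be illustrated with a small Python analogue using plain lists. This is not the code from testing_forecast_util.cpp: the helpers top_rows and bottom_rows and the 8/2 split are hypothetical, and they only mirror the behaviour of Eigen's topRows(n) and bottomRows(n), which return the first and last n rows respectively.

```python
def top_rows(matrix, n):
    """First n rows, analogous to Eigen's topRows(n)."""
    return matrix[:n]


def bottom_rows(matrix, n):
    """Last n rows, analogous to Eigen's bottomRows(n).

    n is a row count, not the index of the first row to keep.
    """
    return matrix[len(matrix) - n:]


# Hypothetical split of 10 rows into 8 training rows and 2 test rows.
data = [[i] for i in range(10)]
train_size = 8
train = top_rows(data, train_size)
test = bottom_rows(data, len(data) - train_size)   # 2 rows: [[8], [9]]
overlapping = bottom_rows(data, train_size)        # 8 rows, overlaps train
```

Passing the split index instead of the remaining row count, as in the last line, yields a test set that overlaps the training rows.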
Coverage decreased (-0.2%) to 76.528% when pulling dc1a0753b4d1e4734e4033aea8ef87657d45f4d7 on yetiancn:master into 1fc8b5586162afb7a2f5607256abed149f74a665 on cmu-db:master.