that learns a table’s
data distribution while fully removing heuristic modeling assumptions for the first time, by
applying and enhancing a new statistical model from recent advances in self-supervised learning. Like classical synopses, Naru directly summarizes the data and then uses the summary
to estimate the cardinalities of incoming queries or predicates. Unlike previous estimators,
Naru approximates the joint data distribution of a table without any independence assumptions, thereby achieving a new level of accuracy in base table cardinality estimation.
Summary
Using ML to enhance the table cardinality.
the Naru poejction at https://github.com/naru-project/naru
ref: Pandas C++: https://github.com/hosseinmoein/DataFrame
ref to :
EECS-2022-194 Machine Learning for Query Optimization
.