rheem-ecosystem / rheem

Rheem - a cross-platform data processing system
https://rheem-ecosystem.github.io

Replace or Override JdbcTableSource::CardinalityEstimator in concrete platform equivalent #50

Open luckyasser opened 7 years ago

luckyasser commented 7 years ago

JdbcTableSource's CardinalityEstimator issues a "count(*)" statement, which is both expensive and often unnecessary. Most DBMSs maintain metadata tables or functions that immediately return an estimate of a table's cardinality, and an estimate is enough here because the exact number of records is not needed. For example, in PostgreSQL you can run: "SELECT reltuples AS approximate_row_count FROM pg_class WHERE relname = 'table_name';"
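
To make the proposal concrete, here is a minimal sketch of how a PostgreSQL-specific estimator could read the catalog value over plain JDBC instead of running "count(*)". The class and method names (ApproximateRowCount, estimate, toEstimate) are hypothetical and are not part of Rheem's API. The sketch assumes the table name is unique in pg_class (no schema qualification) and treats a negative reltuples, which newer PostgreSQL versions report for tables that have never been analyzed, as "unknown".

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// Hypothetical helper, not part of Rheem: reads PostgreSQL's catalog estimate
// instead of scanning the table with count(*).
public class ApproximateRowCount {

    // The catalog query from the comment above, with the table name as a bind parameter.
    static final String POSTGRES_ESTIMATE_SQL =
            "SELECT reltuples AS approximate_row_count FROM pg_class WHERE relname = ?";

    // Converts the raw reltuples value to a row-count estimate.
    // Assumption: a negative value means no statistics exist; report -1 for "unknown".
    static long toEstimate(double reltuples) {
        return reltuples < 0 ? -1L : Math.round(reltuples);
    }

    // Returns the estimated row count for tableName, or -1 if the table is not
    // listed in pg_class or has no statistics yet.
    static long estimate(Connection connection, String tableName) throws SQLException {
        try (PreparedStatement ps = connection.prepareStatement(POSTGRES_ESTIMATE_SQL)) {
            ps.setString(1, tableName);
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next() ? toEstimate(rs.getDouble(1)) : -1L;
            }
        }
    }
}
```

A caller that receives -1 could fall back to the existing "count(*)" path, so platforms without a usable catalog estimate would keep their current behavior.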