trinodb / trino

Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
https://trino.io
Apache License 2.0

Add stats to tpch, tpcds for other scale factors #219

Open findepi opened 5 years ago

findepi commented 5 years ago

tpch (and sometimes tpcds) are used to try out the CBO. However, both provide stats for only some scale factors (currently sf0.01, sf1, and sf3000 in both cases).

mosabua commented 1 year ago

Hm ... to some degree it is misleading to have stats at all .. the connector is more of a data generator than an actual data source. Having stats enables usage of the tpch connector for CBO testing and benchmarking directly, which can be misleading. In either case .. I think the right approach would be to either remove all stats or provide stats for all scale factors.

Besides all that .. do you think we should document the current behavior? I think yes. If you agree .. is the list of scale factors with stats in the description correct?
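
One way to check which scale factors actually expose stats is to run `SHOW STATS` against each schema and look at the summary row. The following is a minimal sketch, not something taken from the Trino code base. It assumes a DB-API style cursor, a catalog named `tpch`, that the summary row of `SHOW STATS` has a NULL `column_name`, that `row_count` is the fifth column of the output, and that a missing estimate shows up as NULL. The helper name `schemas_with_stats` is hypothetical.

```python
from typing import Iterable, List

# Assumed position of the row_count column in SHOW STATS output
# (column_name, data_size, distinct_values_count, nulls_fraction, row_count, ...).
ROW_COUNT_INDEX = 4


def schemas_with_stats(
    cursor,
    schemas: Iterable[str],
    table: str = "orders",
    catalog: str = "tpch",
) -> List[str]:
    """Return the schemas whose SHOW STATS summary row reports a row count.

    `cursor` is any DB-API style cursor exposing execute() and fetchall().
    """
    found = []
    for schema in schemas:
        # Schema names such as sf0.01 contain a dot, so they are quoted.
        cursor.execute(f'SHOW STATS FOR {catalog}."{schema}".{table}')
        rows = cursor.fetchall()
        # Assumption: the table-level summary row has a NULL column_name.
        summary = [row for row in rows if row[0] is None]
        if summary and summary[0][ROW_COUNT_INDEX] is not None:
            found.append(schema)
    return found
```

With the `trino` Python client, a cursor obtained from `trino.dbapi.connect(...)` could be passed in along with a list such as `["sf0.01", "sf1", "sf10", "sf3000"]`. The result would then show which of those schemas report a row count under the assumptions above.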