Open saicitus opened 3 years ago
Just a note: EXPLAIN without ANALYZE is likely to be inaccurate with respect to the actual plan, since the size of intermediate results is only taken into account at runtime. Nonetheless, we do see cases where joins on function calls get poor plans even with accurate counts.
Citus 10. Steps to reproduce:
Distributed Subplan 6_1 predicts 2 rows but Function Scan on read_intermediate_result intermediate_result predicts 1000 rows.
This discrepancy can produce estimates that are wrong by orders of magnitude (more than 1000 times). We have observed this with a customer workload, where it led to a wrong plan (GroupAgg instead of HashAgg) and an order-of-magnitude performance deterioration (40 mins vs 4 mins).
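To make the size of the gap concrete, here is a minimal Python sketch of the arithmetic. The 2 and 1000 row counts are the ones quoted above. The helper name `misestimate_factor` is hypothetical, and the idea that the errors of two joined inputs simply multiply is an illustrative simplification, not a description of how the PostgreSQL planner actually estimates join cardinality.

```python
def misestimate_factor(estimated_rows: float, actual_rows: float) -> float:
    """Return how many times larger the estimate is than the actual row count."""
    if actual_rows <= 0:
        raise ValueError("actual_rows must be positive")
    return estimated_rows / actual_rows


# Numbers from the report: the subplan predicts 2 rows,
# while the function scan over the same result predicts 1000.
single_scan = misestimate_factor(1000, 2)

# Hypothetical worst case: two such intermediate results are joined and
# the per-input errors multiply (cross-product style cardinality).
two_scan_join = single_scan ** 2

print(f"single scan overestimate: {single_scan:.0f}x")
print(f"two-input join overestimate (assumed): {two_scan_join:.0f}x")
```

Under these assumptions a single scan is already off by a factor of 500, and the error passes the 1000x mark as soon as more than one misestimated intermediate result feeds the same join.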