illinoisdata / DeepOLA


Set cardinality fraction to 1.0 on last input file #150

Closed. nikhil96sher closed this 1 year ago

nikhil96sher commented 1 year ago

The lineitem table's cardinality of 6_000_000 * SF is only approximate. For some scale factors, the actual cardinality is slightly smaller than this value. As a result, the cardinality fraction in the last block is not 1.0, so the final result is still approximate (since scaling is still performed). For now, I fixed it by setting the cardinality fraction of the last dataframe read to 1.0.
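To illustrate the issue, here is a minimal Python sketch of the idea, not the actual DeepOLA code. The function name `scaled_running_counts` and its parameters are hypothetical, and it assumes the running aggregate is a simple row count scaled by the fraction of rows seen so far.

```python
from typing import Iterator, List, Tuple


def scaled_running_counts(
    block_row_counts: List[int], estimated_total_rows: int
) -> Iterator[Tuple[float, float]]:
    """Yield (cardinality_fraction, scaled_count) after each block.

    Hypothetical helper: estimated_total_rows plays the role of the
    approximate 6_000_000 * SF cardinality.
    """
    seen = 0
    last = len(block_row_counts) - 1
    for i, rows in enumerate(block_row_counts):
        seen += rows
        if i == last:
            # Last input block: everything has been read, so force the
            # fraction to 1.0 and skip any scaling error.
            fraction = 1.0
        else:
            fraction = min(seen / estimated_total_rows, 1.0)
        # Avoid dividing by zero if no rows have been seen yet.
        estimate = seen / fraction if fraction > 0 else 0.0
        yield fraction, estimate


# The estimate says 1000 rows, but the files actually hold 999.
for fraction, estimate in scaled_running_counts([250, 250, 250, 249], 1000):
    print(fraction, estimate)
```

Without the special case for the last block, the final fraction here would be 999 / 1000, and the final count would still be scaled up instead of being reported exactly.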

Let me know if you have other suggestions!

nikhil96sher commented 1 year ago

@mIXs222 Can you review this and the other wanderjoin query related PR(s)?

mIXs222 commented 1 year ago

Also, I'll merge those wanderjoin PRs together then. Thanks for taking over them!

Nvm, seems like you could initiate the merges from this PR.