DAMO-NLP-SG / LLM-R2


Inquiry on Guaranteeing Rewrite Equivalence and Request for Rewritten Results #3

Open kr11 opened 1 month ago

kr11 commented 1 month ago

Thank you for your valuable work!

About equivalence: Could you explain how LLM-R2 ensures that the rewritten SQL is equivalent to the original SQL? We have observed that Calcite sometimes produces non-equivalent outcomes for MySQL. I am unfamiliar with its performance on PostgreSQL.

About rewritten results: LLM-R2 shows impressive improvements on the TPCH, IMDB, and DSB benchmarks, yet the rewritten SQL queries seem to be absent from the repository. Could you please upload the rewritten SQL queries for the TPCH benchmark? Your assistance would be highly appreciated.

LZ12DH commented 1 week ago

Hi,

Thank you for your feedback!

It is interesting to know that Calcite's rewrite tool may produce non-equivalent outcomes for MySQL. We did not run any experiments on MySQL, and we would appreciate it if you could share some such examples. So far, for PostgreSQL, we assume that Calcite always produces equivalent rewrites, and we have not observed any non-equivalent cases. I believe the same holds for the Learned Rewrite paper [1].
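For anyone collecting such examples, one simple way to flag a non-equivalent rewrite is to run the original and the rewritten query against the same data and compare the results as multisets. The snippet below is only a minimal sketch and is not part of LLM-R2 or Calcite: it uses Python's built-in sqlite3 module as a stand-in for MySQL or PostgreSQL, and the table, the queries, and the helper name same_result_multiset are hypothetical, made up for illustration.

```python
import sqlite3
from collections import Counter


def same_result_multiset(conn, original_sql, rewritten_sql):
    """Return True if both queries return the same rows with the same
    multiplicities, ignoring row order. (Hypothetical helper.)"""
    original_rows = Counter(conn.execute(original_sql).fetchall())
    rewritten_rows = Counter(conn.execute(rewritten_sql).fetchall())
    return original_rows == rewritten_rows


# Toy data, assumed purely for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER, customer TEXT, amount INTEGER);
    INSERT INTO orders VALUES (1, 'a', 10), (2, 'a', 10), (3, 'b', NULL);
""")

original = (
    "SELECT customer, amount FROM orders "
    "WHERE amount IN (SELECT amount FROM orders WHERE id < 3)"
)
# Returns the same rows as the original on this data.
equivalent = "SELECT customer, amount FROM orders WHERE amount = 10"
# DISTINCT collapses duplicate rows, so the multiset of results differs.
not_equivalent = "SELECT DISTINCT customer, amount FROM orders WHERE amount = 10"

print(same_result_multiset(conn, original, equivalent))
print(same_result_multiset(conn, original, not_equivalent))
```

A match on one database state does not prove that two queries are equivalent, but a mismatch gives a concrete counterexample that can be shared in a report like this one.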

For rewritten results, you may check out the 'data/data_llmr2/pools' folder. Some of our rewrite examples on all three datasets can be found there.

[1] Zhou et al., A Learned Query Rewrite System using Monte Carlo Tree Search, VLDB 2022