apache / datafusion

Apache DataFusion SQL Query Engine
https://datafusion.apache.org/
Apache License 2.0
5.21k stars 957 forks source link

fix dml logical plan output schema #10394

Closed leoyvens closed 1 week ago

leoyvens commented 1 week ago

Previously, LogicalPlan::schema would return the input schema for DML plans, rather than the expected output schema. It is typical for the output to be the count of rows affected by the DML statement, so the code assumes that.

See fn dml_output_schema for a test.

Which issue does this PR close?

Closes #10393.

Rationale for this change

Current behaviour is wrong.

Are these changes tested?

Yes there is a test fn dml_output_schema.

Are there any user-facing changes?

The bug being fixed is visible to users of the DataFrame API.

comphead commented 1 week ago

I checked current behavior

> CREATE TABLE test (x int)
;
0 row(s) fetched. 
Elapsed 0.017 seconds.

> INSERT INTO test VALUES (1);
+-------+
| count |
+-------+
| 1     |
+-------+
1 row(s) fetched. 
Elapsed 0.015 seconds.

> INSERT INTO test VALUES (3);
+-------+
| count |
+-------+
| 1     |
+-------+
1 row(s) fetched. 
Elapsed 0.005 seconds.

I'm not even sure why we output anything after DDL/DML... I'd probably prefer no output like in duckdb

D create table x (id int);
D insert into x values(1);
leoyvens commented 1 week ago

@comphead Thank you for your review, I have addressed the outstanding comments. I have no opinion on desired behaviour, this PR is just trying to make things consistent.