apache / arrow

Apache Arrow is the universal columnar format and multi-language toolbox for fast data interchange and in-memory analytics
https://arrow.apache.org/
Apache License 2.0

[C++] Substrait End-To-End Tests for Relations #32721

Open asfimport opened 2 years ago

asfimport commented 2 years ago

At the moment, the test coverage for the Substrait integration covers functional tests for serialization and deserialization. It lacks end-to-end functional tests that verify whether a Substrait plan delivers the expected outcome. As part of this, each relation (Read, Filter, Project, Join, Aggregate) must have end-to-end tests covering the options associated with that relation.

Reporter: Vibhatha Lakmal Abeykoon / @vibhatha


Note: This issue was originally created as ARROW-17457. Please see the migration documentation for further details.

asfimport commented 2 years ago

Weston Pace / @westonpace: The coverage is a bit spread out, but we do have end-to-end tests for some of these features. There are a few different round-trip tests (and your examples) that test the Read relation. The tests in arrow/engine/substrait/function_test.cc do round-trip testing for Project and Aggregate. As far as I know, we do not have any round-trip tests for Filter or Join.

asfimport commented 2 years ago

Vibhatha Lakmal Abeykoon / @vibhatha: So it would be better to improve the function_test.cc interface for this.

I also have a write-up in mind from a developer's perspective, which could include some of the choices we have made. I also find that having more Python tests could cover a lot of ground across various functional tests. Python tests would be robust and easy to adopt for high-level end-to-end test cases.
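
As a rough illustration of the difference between a round-trip check and an end-to-end check, here is a minimal, stdlib-only Python sketch. Everything in it is hypothetical: the nested-dict plan shape, the `execute` toy consumer, and `run_end_to_end` are stand-ins invented for this example and are not Arrow or Substrait APIs. The point is only the shape of the test: serialize, deserialize, execute, then compare against a known expected result.

```python
import json


def execute(rel, tables):
    """Toy consumer (hypothetical): evaluates a nested relation tree
    against in-memory tables. Not a Substrait implementation."""
    (kind, body), = rel.items()  # each node holds exactly one relation
    if kind == "read":
        return [dict(row) for row in tables[body["table"]]]
    rows = execute(body["input"], tables)
    if kind == "filter":
        return [r for r in rows if r[body["column"]] > body["greater_than"]]
    if kind == "project":
        return [{c: r[c] for c in body["columns"]} for r in rows]
    raise ValueError(f"unsupported relation: {kind}")


def run_end_to_end(plan, tables):
    """Round-trip the plan, then run it, so the result reflects the
    deserialized plan rather than the original object."""
    round_tripped = json.loads(json.dumps(plan))
    # This is all a serialization-only test would verify.
    assert round_tripped == plan
    return execute(round_tripped, tables)


TABLES = {"t": [{"id": 1, "x": 5}, {"id": 2, "x": 15}, {"id": 3, "x": 25}]}

# Project(id) <- Filter(x > 10) <- Read(t)
PLAN = {
    "project": {
        "columns": ["id"],
        "input": {
            "filter": {
                "column": "x",
                "greater_than": 10,
                "input": {"read": {"table": "t"}},
            }
        },
    }
}

# The end-to-end assertion: the executed plan yields the expected rows.
assert run_end_to_end(PLAN, TABLES) == [{"id": 2}, {"id": 3}]
```

In an actual Arrow test the toy `execute` would presumably be replaced by the real consumer (for example `pyarrow.substrait.run_query` on the Python side), with one such case per relation and per option.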

asfimport commented 1 year ago

Apache Arrow JIRA Bot: This issue was last updated over 90 days ago, which may be an indication it is no longer being actively worked. To better reflect the current state, the issue is being unassigned per project policy. Please feel free to re-take assignment of the issue if it is being actively worked, or if you plan to start that work soon.