apache / incubator-gluten

Gluten is a middle layer responsible for offloading JVM-based SQL engines' execution to native engines.
https://gluten.apache.org/
Apache License 2.0
1.22k stars 437 forks

[CORE] Remove TPC-H test dataset from module backends-velox, use the one from gluten-core uniformly #8004

Open zhztheplayer opened 2 days ago

zhztheplayer commented 2 days ago

Use gluten-core/test/resources/tpch-data-sf100 uniformly in all tests that need the TPC-H dataset.

github-actions[bot] commented 2 days ago

Thanks for opening a pull request!

Could you open an issue for this pull request on GitHub Issues?

https://github.com/apache/incubator-gluten/issues

Then could you also rename commit message and pull request title in the following format?

[GLUTEN-${ISSUES_ID}][COMPONENT]feat/fix: ${detailed message}

See also:

github-actions[bot] commented 2 days ago

Run Gluten Clickhouse CI on x86

zhztheplayer commented 1 day ago

@zzcclp This is an attempt to use the Parquet data in gluten-core from backends-velox's test code. However, it seems that the resource files in the core jar cannot be accessed from other modules.

I am thinking of introducing an env var like GLUTEN_HOME and passing it to the test suites, so that other modules do not have to specify the relevant paths themselves. Otherwise the code will rely on relative paths between modules, which is fragile and tricky to maintain.
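
A minimal sketch of how a test suite might resolve the shared dataset from such a variable. This is only an illustration of the idea, not code from this PR: the class GlutenTestPaths and the method resolveTpchData are hypothetical names, GLUTEN_HOME is assumed to point at the repository root, and the dataset location is the one quoted in the PR description.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper; not part of the Gluten code base.
public class GlutenTestPaths {
    // Variable name as proposed in the comment above (an assumption, not an existing setting).
    static final String ENV_VAR = "GLUTEN_HOME";

    // Dataset location as quoted in the PR description, relative to the repository root.
    static final String TPCH_DATA = "gluten-core/test/resources/tpch-data-sf100";

    // Resolves the dataset directory against the given repository root.
    // Fails fast with a clear message instead of silently falling back to a relative path.
    static Path resolveTpchData(String glutenHome) {
        if (glutenHome == null || glutenHome.trim().isEmpty()) {
            throw new IllegalStateException(
                ENV_VAR + " must point to the repository root to locate test data");
        }
        return Paths.get(glutenHome.trim())
            .resolve(TPCH_DATA)
            .toAbsolutePath()
            .normalize();
    }

    public static void main(String[] args) {
        String home = System.getenv(ENV_VAR);
        if (home == null || home.trim().isEmpty()) {
            System.out.println(ENV_VAR + " is not set");
        } else {
            System.out.println(resolveTpchData(home));
        }
    }
}
```

With something like this, a suite in backends-velox would call resolveTpchData(System.getenv("GLUTEN_HOME")) rather than building a path such as ../gluten-core/... by hand, so the module layout would only be encoded in one place.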