apache / incubator-gluten

Gluten is a middle layer responsible for offloading JVM-based SQL engines' execution to native engines.
https://gluten.apache.org/
Apache License 2.0
1.22k stars 439 forks source link

[GLUTEN-7837][VL] Spark driver should not initialize cache if not in local mode #7853

Closed leoluan2009 closed 2 weeks ago

leoluan2009 commented 3 weeks ago

What changes were proposed in this pull request?

Fixes: https://github.com/apache/incubator-gluten/issues/7837

Spark driver should not initialize cache if not in local mode

How was this patch tested?

(Please explain how this patch was tested. E.g. unit tests, integration tests, manual tests)

(If this patch involves UI changes, please attach a screenshot; otherwise, remove this)

github-actions[bot] commented 3 weeks ago

Thanks for opening a pull request!

Could you open an issue for this pull request on Github Issues?

https://github.com/apache/incubator-gluten/issues

Then could you also rename commit message and pull request title in the following format?

[GLUTEN-${ISSUES_ID}][COMPONENT]feat/fix: ${detailed message}

See also:

leoluan2009 commented 3 weeks ago

@zhztheplayer can you help review?

github-actions[bot] commented 3 weeks ago

https://github.com/apache/incubator-gluten/issues/7837

zhztheplayer commented 3 weeks ago

@leoluan2009 Does this solve the cache issue mentioned in https://github.com/apache/incubator-gluten/issues/7837?

leoluan2009 commented 3 weeks ago

@leoluan2009 Does this solve the cache issue mentioned in #7837?

Yes, if VeloxBackend is running in driver, it will not init cache.

zhztheplayer commented 2 weeks ago

Hi @leoluan2009 , as spark.gluten.sql.columnar.backend.velox.cacheEnabled is now a static SQL conf, can we just modify the config map passed to native to set it to false from non-local driver side? It's a workaround and we can remove once a optimal design is out.

leoluan2009 commented 2 weeks ago

Hi @leoluan2009 , as spark.gluten.sql.columnar.backend.velox.cacheEnabled is now a static SQL conf, can we just modify the config map passed to native to set it to false from non-local driver side? It's a workaround and we can remove once a optimal design is out.

it's ok and I will update the pr

leoluan2009 commented 2 weeks ago

@zhztheplayer I have updated this PR, please help to review, thanks!

leoluan2009 commented 2 weeks ago

@zhztheplayer resolved all comments

FelixYBW commented 2 weeks ago

@zhztheplayer @leoluan2009 can you go through all the initializations there? Is there any other things we shouldn't run on driver? Is the local cache only exception? I'd expect the initialization should be much different.

If we have more than one, we should considered to seperate the initialization routine for driver and worker.

zhztheplayer commented 2 weeks ago

can you go through all the initializations there? Is there any other things we shouldn't run on driver? Is the local cache only exception? I'd expect the initialization should be much different.

Not much difference so far and it's also about our C++ API's encapsulation. We can make trade-offs once it's really needed.

zhztheplayer commented 2 weeks ago

If we have more than one, we should considered to seperate the initialization routine for driver and worker.

In future there will be different ways to implement differentiated initialization, but exposing Spark arch conceptions "driver" / "executor" / "task" to C++ code could be the last ones to consider. We can use configurations, or compose finer-grained JNI initialization APIs, etc.

FelixYBW commented 1 week ago

In future there will be different ways to implement differentiated initialization, but exposing Spark arch conceptions "driver" / "executor" / "task" to C++ code could be the last ones to consider. We can use configurations, or compose finer-grained JNI initialization APIs, etc.

Let's re-evaluate the mythology once we have another exception.