Closed leoluan2009 closed 2 weeks ago
Thanks for opening a pull request!
Could you open an issue for this pull request on Github Issues?
https://github.com/apache/incubator-gluten/issues
Then could you also rename commit message and pull request title in the following format?
[GLUTEN-${ISSUES_ID}][COMPONENT]feat/fix: ${detailed message}
See also:
@zhztheplayer can you help review?
@leoluan2009 Does this solve the cache issue mentioned in https://github.com/apache/incubator-gluten/issues/7837?
@leoluan2009 Does this solve the cache issue mentioned in #7837?
Yes, if VeloxBackend is running in driver, it will not init cache.
Hi @leoluan2009 , as spark.gluten.sql.columnar.backend.velox.cacheEnabled
is now a static SQL conf, can we just modify the config map passed to native to set it to false from non-local driver side? It's a workaround and we can remove once a optimal design is out.
Hi @leoluan2009 , as
spark.gluten.sql.columnar.backend.velox.cacheEnabled
is now a static SQL conf, can we just modify the config map passed to native to set it to false from non-local driver side? It's a workaround and we can remove once a optimal design is out.
it's ok and I will update the pr
@zhztheplayer I have updated this PR, please help to review, thanks!
@zhztheplayer resolved all comments
@zhztheplayer @leoluan2009 can you go through all the initializations there? Is there any other things we shouldn't run on driver? Is the local cache only exception? I'd expect the initialization should be much different.
If we have more than one, we should considered to seperate the initialization routine for driver and worker.
can you go through all the initializations there? Is there any other things we shouldn't run on driver? Is the local cache only exception? I'd expect the initialization should be much different.
Not much difference so far and it's also about our C++ API's encapsulation. We can make trade-offs once it's really needed.
If we have more than one, we should considered to seperate the initialization routine for driver and worker.
In future there will be different ways to implement differentiated initialization, but exposing Spark arch conceptions "driver" / "executor" / "task" to C++ code could be the last ones to consider. We can use configurations, or compose finer-grained JNI initialization APIs, etc.
In future there will be different ways to implement differentiated initialization, but exposing Spark arch conceptions "driver" / "executor" / "task" to C++ code could be the last ones to consider. We can use configurations, or compose finer-grained JNI initialization APIs, etc.
Let's re-evaluate the mythology once we have another exception.
What changes were proposed in this pull request?
Fixes: https://github.com/apache/incubator-gluten/issues/7837
Spark driver should not initialize cache if not in local mode
How was this patch tested?
(Please explain how this patch was tested. E.g. unit tests, integration tests, manual tests)
(If this patch involves UI changes, please attach a screenshot; otherwise, remove this)