zhli1142015 opened 1 year ago
We found the same issue. The MemoryArbitrator used in Gluten conflicts with split preload: the IO executor holds a Task ref that may be released after the ExecutionCtx dtor, and that Task ref contains a shared ref to the MemoryPool from the QueryCtx.
Actually, it is not related to q49; every query can potentially reproduce it.
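For illustration, a minimal self-contained sketch of the lifetime hazard, using simplified stand-in types (Task, MemoryPool) rather than the real Velox classes, and a plain std::thread standing in for the IO executor:

#include <chrono>
#include <memory>
#include <thread>

struct MemoryPool {};  // stand-in for the QueryCtx memory pool

struct Task {
  std::shared_ptr<MemoryPool> pool;  // shared ref to the QueryCtx pool
};

int main() {
  auto task = std::make_shared<Task>();
  task->pool = std::make_shared<MemoryPool>();

  // The preload job captures the Task by shared_ptr (the taskHolder pattern).
  std::thread io([taskHolder = task] {
    std::this_thread::sleep_for(std::chrono::milliseconds(100));
    // taskHolder still holds the Task (and the pool) here, even if the
    // ExecutionCtx that created them has already been destroyed.
  });

  task.reset();  // ExecutionCtx teardown drops its ref first ...
  io.join();     // ... but the IO job keeps Task/MemoryPool alive until now.
  return 0;
}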
Thanks @Yohahaha, could you point me to where the Task ref you mentioned is held? Maybe we should reset that shared_ptr after the data source switch.
Thanks, resetting taskHolder would solve the crash.
But the memory leak and the second error are still there, so it looks like preload may cause a real memory leak.
I mean delete the taskHolder variable:
// Capture only the split; without taskHolder the lambda no longer keeps the Task alive.
executor->add([split]() mutable {
  split->dataSource->prepare();
  split.reset();
});
Could you post the latest failure log?
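For contrast, the problematic form presumably looked like the following, with the extra taskHolder capture being the piece to delete (the exact original code may differ):

// Presumed problematic form: capturing taskHolder keeps the Task, and the
// QueryCtx MemoryPool it references, alive until the IO executor runs this.
executor->add([split, taskHolder]() mutable {
  split->dataSource->prepare();
  split.reset();
});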
@Yohahaha do you know the minimal query that reproduces it?
Yes, removing it solves all the issues, thanks. I think we should open an issue in Velox and fix it there. @Yohahaha would you like to follow up on this?
I can reproduce it with TPC-H q10 at SF10.
I'm not sure this is the root cause, and it cannot guarantee that future patches won't introduce such a ref again. The current usage of memory management and the split preload feature in Gluten breaks the Velox design: Velox assumes the MemoryManager is static, but Gluten needs it to be controllable.
I would like to open an issue in Velox, but I am not optimistic that Velox would fix it. GitHub does not show the git message for PR 4849, so I copied it from the git shell:
Summary:
The async prefetch split might outlive its associated task during the async prefetch.
If that happens, we might see a leak in the table scan memory pool, as we found in a
Prestissimo test, because of the memory allocation in the split. Considering the old
task cleanup mechanism in presto cpp, the chance is very small and should require some
extreme timing race.
Thanks for your help, makes sense to me.
@Yohahaha
I find that reverting some code from the patch below fixes this issue; the shared_ptr-based memory pool is able to work with the multi-threaded IO pool. https://github.com/oap-project/gluten/pull/3120/files
-yuan
@zhouyuan I don't think #3120 is related to this issue. As I mentioned above, this problem comes from the current thread model and object lifecycle management, and it can occur with split preload, which uses a per-Spark-Executor thread pool.
One possible solution is to move the thread pool into ExecutionCtx so it has the same lifecycle as ExecutionCtx and the Spark Task, but that would require changing a lot of Velox code; see the sketch after this comment.
If you confirm #3120 triggers this issue, I agree to revert it.
BTW, I suggest changing the issue title to something more accurate.
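A rough sketch of that proposal, assuming ExecutionCtx could own the IO pool directly; the names here are hypothetical and the real change would touch far more Velox code:

#include <memory>
#include <folly/executors/IOThreadPoolExecutor.h>

// Hypothetical: ExecutionCtx owns its own IO pool, so pending preload jobs
// are drained before any Task/MemoryPool the ctx references is destroyed.
class ExecutionCtx {
 public:
  ExecutionCtx()
      : ioPool_(std::make_unique<folly::IOThreadPoolExecutor>(4)) {}
  ~ExecutionCtx() {
    ioPool_->join();  // wait for in-flight preload jobs before teardown
  }
  folly::IOThreadPoolExecutor* ioExecutor() { return ioPool_.get(); }

 private:
  std::unique_ptr<folly::IOThreadPoolExecutor> ioPool_;
};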
Yes, the older commit works in my env. No need to revert; #3120 makes sense to me. I'll check whether we can do a quick fix.
thanks, -yuan
@Yohahaha I tried to reproduce this in a new, bigger setup, and the issue/fix is not reproducible there. So I think this may be due to some flakiness; sorry for the false alert.
thanks, -yuan
@zhli1142015 @Yohahaha I am proposing https://github.com/oap-project/gluten/pull/3419, which can probably fix the issue. I'll find some time to try reproducing and testing it, then get back to you.
Backend
VL (Velox)
Bug description
Crash happens in the second run; two errors were observed in the logs.
Spark version
Spark-3.3.x
Spark configurations
"spark.sql.broadcastTimeout": "-1", "spark.executor.instances": "8", "spark.driver.memory": "56g", "spark.executor.memory": "10g", "spark.executor.cores": "8", "spark.driver.cores": "8", "spark.plugins": "io.glutenproject.GlutenPlugin", "spark.gluten.sql.columnar.backend.lib": "velox", "spark.memory.offHeap.enabled": "true", "spark.memory.offHeap.size": "53g", "spark.executor.memoryOverhead": "1024", "spark.shuffle.manager": "org.apache.spark.shuffle.sort.ColumnarShuffleManager", "spark.gluten.sql.columnar.backend.velox.SplitPreloadPerDriver": "4"
System information
No response
Relevant logs
No response