Closed ulysses-you closed 9 months ago
cc @zhztheplayer @zhouyuan @PHILO-HE
@ulysses-you thanks, I ran into this in some unit tests that use the default memory manager. The TPC-H/DS runs are OK without this patch.
https://github.com/oap-project/gluten/blob/main/cpp/velox/memory/VeloxMemoryManager.h#L84
@zhztheplayer do you know the root cause of why the memory manager is not initialized?
thanks, -yuan
Somewhere in Velox, code still requests the default memory pool. In this case it is facebook::velox::memory::spillMemoryPool().
Another option is to see whether we could pass a context pool for spilling, but to me that is a separate topic.
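To make the "context pool" idea concrete, here is a minimal sketch of the difference between a spill routine that reaches for a global pool and one that receives its pool from the caller. Everything in it (ToyPool, globalSpillPool, spillUsingGlobal, spillUsingContext) is hypothetical and only illustrates the shape of the change; none of it is Velox or Gluten API.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

namespace sketch {

// Hypothetical stand-in for a memory pool; it only counts reserved bytes.
struct ToyPool {
  explicit ToyPool(std::string n) : name(std::move(n)) {}
  void reserve(std::size_t bytes) { bytesReserved += bytes; }

  std::string name;
  std::size_t bytesReserved = 0;
};

// Global style: the pool is a hidden dependency that the caller cannot
// replace, so it relies on whatever process-wide state happens to exist.
inline ToyPool& globalSpillPool() {
  static ToyPool pool("global-spill");
  return pool;
}

inline void spillUsingGlobal(std::size_t bytes) {
  globalSpillPool().reserve(bytes);
}

// Context style: the caller hands in the pool it already owns, so the
// spill accounting goes to that pool and no global lookup is needed.
inline void spillUsingContext(ToyPool& contextPool, std::size_t bytes) {
  contextPool.reserve(bytes);
}

} // namespace sketch
```

With the context style, a query could pass its own pool, for example `sketch::ToyPool queryPool("query-1"); sketch::spillUsingContext(queryPool, 128);`, and the global pool would stay untouched.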
@zhztheplayer the spill memory pool should also use "the same" memory manager instance. I was wondering why tpch/ds does not fail if the memory manager singleton instance is not registered.
It's because Velox still has some code calling a function named deprecatedDefaultMemoryManager, which sets the memory manager if it is missing. I'm not sure why that function is not called in this issue's case. @ulysses-you, could you help investigate? I see the memory manager being set via the following call stack:
facebook::velox::memory::instance Memory.cpp:34
facebook::velox::memory::MemoryManager::deprecatedGetInstance Memory.cpp:107
facebook::velox::memory::deprecatedDefaultMemoryManager Memory.cpp:290
facebook::velox::core::QueryCtx::initPool QueryCtx.h:139
facebook::velox::core::QueryCtx::QueryCtx QueryCtx.cpp:36
Java_io_glutenproject_vectorized_PlanEvaluatorJniWrapper_nativeValidateWithFailureReason VeloxJniWrapper.cc:109
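The effect described above can be shown with a toy model: a lenient accessor that creates the singleton when it is missing can hide the fact that nobody initialized it explicitly. This is only a sketch under assumptions. The names below (ToyMemoryManager, toy::getInstance, toy::deprecatedGetInstance, toy::validateOnDriver, toy::requestSpillPool) are made up for illustration and do not reproduce the real Velox functions or their exact behavior.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>

// Hypothetical stand-in for a process-wide memory manager (not the Velox class).
struct ToyMemoryManager {
  int poolsCreated = 0;
};

namespace toy {

// Singleton storage; empty until some code path fills it.
inline std::unique_ptr<ToyMemoryManager>& slot() {
  static std::unique_ptr<ToyMemoryManager> instance;
  return instance;
}

// Strict accessor: throws if the manager was never initialized.
inline ToyMemoryManager& getInstance() {
  if (!slot()) {
    throw std::runtime_error("memory manager is not initialized");
  }
  return *slot();
}

// Lenient accessor: silently creates the manager if it is missing.
inline ToyMemoryManager& deprecatedGetInstance() {
  if (!slot()) {
    slot() = std::make_unique<ToyMemoryManager>();
  }
  return *slot();
}

// Plays the role of a validation step that happens to touch the lenient accessor.
inline void validateOnDriver() {
  deprecatedGetInstance();
}

// Plays the role of a code path that needs the manager through the strict accessor.
inline int requestSpillPool() {
  return ++getInstance().poolsCreated;
}

// Simulates a fresh process; returns true if the spill request throws.
inline bool spillFailsInFreshProcess(bool ranValidationFirst) {
  slot().reset();
  if (ranValidationFirst) {
    validateOnDriver();
  }
  try {
    requestSpillPool();
    return false;
  } catch (const std::runtime_error&) {
    return true;
  }
}

} // namespace toy
```

In this model, `toy::spillFailsInFreshProcess(true)` succeeds because the validation step initialized the singleton as a side effect, while `toy::spillFailsInFreshProcess(false)` fails because no code in that process ever touched the lenient accessor.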
@zhztheplayer it seems that nativeValidateWithFailureReason is only called on the driver?
Aha, correct. I was testing in local mode. We should probably add some tests for local-cluster mode later.
Is it a new error introduced by the rebase?
@FelixYBW it was introduced with the 12.18 rebase.
Backend
VL (Velox)
Bug description
It seems the issue is caused by https://github.com/facebookincubator/velox/pull/7168
Spark version
None