Open Zhanxiao-Ma opened 2 weeks ago
In my production environment, I have observed that a long-running Spark RewriteFiles action can lead to an OutOfMemoryError. Analyzing the Java heap dump, I noticed a large number of ChildAllocator objects that are only referenced by the RootAllocator. Upon reviewing the code, I discovered that the ChildAllocator allocated at this point is indeed never released. Is this correct? https://github.com/apache/iceberg/blob/cbb853073e681b4075d7c8707610dceecbee3a82/arrow/src/main/java/org/apache/iceberg/arrow/vectorized/VectorizedReaderBuilder.java#L59
I think the close() method of VectorizedArrowReader should include logic to release the child allocator obtained from rootAlloc, so the RootAllocator stops referencing it.
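To illustrate the leak pattern and the proposed fix, here is a minimal, self-contained sketch. The `MockRootAllocator`, `MockChildAllocator`, and `Reader` classes are hypothetical stand-ins for Arrow's `RootAllocator`, its child allocators, and `VectorizedArrowReader`; this is not the actual Iceberg/Arrow code, only a model of why an unclosed child allocator stays reachable from the root.

```java
import java.util.HashSet;
import java.util.Set;

public class AllocatorLeakSketch {
  // Stand-in for Arrow's RootAllocator: it keeps references to live child allocators,
  // which is why unreleased children show up in the heap dump referenced only by the root.
  static class MockRootAllocator {
    final Set<MockChildAllocator> children = new HashSet<>();

    MockChildAllocator newChildAllocator(String name) {
      MockChildAllocator child = new MockChildAllocator(this, name);
      children.add(child);
      return child;
    }
  }

  // Stand-in for a child allocator; close() deregisters it from the root.
  static class MockChildAllocator implements AutoCloseable {
    final MockRootAllocator root;
    final String name;

    MockChildAllocator(MockRootAllocator root, String name) {
      this.root = root;
      this.name = name;
    }

    @Override
    public void close() {
      root.children.remove(this);
    }
  }

  // Stand-in for a reader that obtains a child allocator on construction.
  // The proposed fix pattern: close() releases the child allocator, so a
  // long-running job creating many readers does not accumulate children.
  static class Reader implements AutoCloseable {
    final MockChildAllocator alloc;

    Reader(MockRootAllocator root) {
      this.alloc = root.newChildAllocator("reader");
    }

    @Override
    public void close() {
      alloc.close(); // without this line, every reader leaks one child allocator
    }
  }

  public static void main(String[] args) {
    MockRootAllocator root = new MockRootAllocator();
    // Simulate a long-running job creating many readers, as in a rewrite action.
    for (int i = 0; i < 1000; i++) {
      try (Reader r = new Reader(root)) {
        // read some data ...
      }
    }
    // With close() releasing each child allocator, none accumulate on the root.
    System.out.println("live children: " + root.children.size());
  }
}
```

If `Reader.close()` omitted `alloc.close()`, the loop above would leave 1000 children registered on the root, which matches the heap-dump observation of ChildAllocator objects referenced only by the RootAllocator.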