apache / incubator-gluten

Gluten is a middle layer responsible for offloading JVM-based SQL engines' execution to native engines.
https://gluten.apache.org/
Apache License 2.0

[CH] `GraceHashJoin` is easy to cause OOM #8003

Open lgbo-ustc opened 2 days ago

lgbo-ustc commented 2 days ago

Backend

CH (ClickHouse)

Bug description

Grace hash join does not control its memory usage well. It decides when to spill by comparing against a fixed byte limit, so the spill of the hash table cannot be triggered adaptively.

Spark version

None

Spark configurations

No response

System information

No response

Relevant logs

No response

lgbo-ustc commented 2 days ago

We tracked the memory usage of the hash table in each join:

2024-11-20 11:33:18.923 <Error> GraceHashJoin: xxx 0x7f41ebf6c340 total_rows: 577278, total_bytes: 781479072
2024-11-20 11:33:18.924 <Error> GraceHashJoin: xxx 0x7f41e5ad8840 total_rows: 569546, total_bytes: 772995144
2024-11-20 11:33:18.934 <Error> GraceHashJoin: xxx 0x7f41ebf6c340 total_rows: 581310, total_bytes: 786001384
2024-11-20 11:33:18.939 <Error> GraceHashJoin: xxx 0x7f41e5ad8840 total_rows: 573578, total_bytes: 777517456
2024-11-20 11:33:18.946 <Error> GraceHashJoin: xxx 0x7f41ebf6c340 total_rows: 585358, total_bytes: 790524104
2024-11-20 11:33:18.951 <Error> GraceHashJoin: xxx 0x7f41e5ad8840 total_rows: 577610, total_bytes: 782039768
2024-11-20 11:33:18.957 <Error> GraceHashJoin: xxx 0x7f41ebf6c340 total_rows: 589390, total_bytes: 795046416
2024-11-20 11:33:18.962 <Error> GraceHashJoin: xxx 0x7f41e5ad8840 total_rows: 581642, total_bytes: 786562080
2024-11-20 11:33:18.968 <Error> GraceHashJoin: xxx 0x7f41ebf6c340 total_rows: 593422, total_bytes: 799568728
2024-11-20 11:33:18.972 <Error> GraceHashJoin: xxx 0x7f41e5ad8840 total_rows: 585674, total_bytes: 791084392
2024-11-20 11:33:18.973 <Debug> MemoryTracker: Current memory usage: 3.00 GiB.
2024-11-20 11:33:18.979 <Error> GraceHashJoin: xxx 0x7f41ebf6c340 total_rows: 597454, total_bytes: 804091040
2024-11-20 11:33:18.983 <Error> GraceHashJoin: xxx 0x7f41e5ad8840 total_rows: 589706, total_bytes: 795606704
2024-11-20 11:33:18.989 <Error> GraceHashJoin: xxx 0x7f41ebf6c340 total_rows: 601486, total_bytes: 808613352
2024-11-20 11:33:18.995 <Error> GraceHashJoin: xxx 0x7f41e5ad8840 total_rows: 593738, total_bytes: 800129016
2024-11-20 11:33:19.000 <Error> GraceHashJoin: xxx 0x7f41ebf6c340 total_rows: 605518, total_bytes: 813135664
2024-11-20 11:33:19.006 <Error> GraceHashJoin: xxx 0x7f41e5ad8840 total_rows: 597780, total_bytes: 804651592
2024-11-20 11:33:19.010 <Error> GraceHashJoin: xxx 0x7f41ebf6c340 total_rows: 609550, total_bytes: 817657976
2024-11-20 11:33:19.017 <Error> GraceHashJoin: xxx 0x7f41e5ad8840 total_rows: 601812, total_bytes: 809173904
2024-11-20 11:33:19.029 <Error> GraceHashJoin: xxx 0x7f41e5ad8840 total_rows: 605844, total_bytes: 813696216
2024-11-20 11:33:19.130 <Error> local_engine: Enter java exception handle.
Exception Exception in thread "Executor task launch worker for task 248.0 in stage 11.0 (TID 1288)" org.apache.gluten.exception.GlutenException: Memory limit exceeded: would use 1.50 GiB (attempt to allocate chunk of 4440569 bytes), current RSS 2.79 GiB, maximum: 1.50 GiB.
0. ../contrib/llvm-project/libcxx/include/exception:141: Poco::Exception::Exception(String const&, int) @ 0x000000001469dc59
1. ./build/../src/Common/Exception.cpp:109: DB::Exception::Exception(DB::Exception::MessageMasked&&, int, bool) @ 0x00000000069da63c
2. ../src/Common/Exception.h:111: DB::Exception::Exception(PreformattedMessage&&, int) @ 0x00000000068ca54c
3. ../src/Common/Exception.h:129: DB::Exception::Exception<char const*, char const*, String, long&, String, String, char const*, std::basic_string_view<char, std::char_traits<char>>>(int, FormatStringHelperImpl<std::type_identity<char const*>::type, std::type_identity<char const*>::type, std::type_identity<String>::type, std::type_identity<long&>::type, std::type_identity<String>::type, std::type_identity<String>::type, std::type_identity<char const*>::type, std::type_identity<std::basic_string_view<char, std::char_traits<char>>>::type>, char const*&&, char const*&&, String&&, long&, String&&, String&&, char const*&&, std::basic_string_view<char, std::char_traits<char>>&&) @ 0x00000000069ea0c9
4. ./build/../src/Common/MemoryTracker.cpp:326: MemoryTracker::allocImpl(long, bool, MemoryTracker*, double) @ 0x00000000069e8ee1
5. ./build/../src/Common/MemoryTracker.cpp:383: MemoryTracker::allocImpl(long, bool, MemoryTracker*, double) @ 0x00000000069e8a96
6. ./build/../src/Common/CurrentMemoryTracker.cpp:64: CurrentMemoryTracker::alloc(long) @ 0x00000000069ccb1f
7. ./build/../src/Common/Allocator.cpp:233: Allocator<false, false>::realloc(void*, unsigned long, unsigned long, unsigned long) @ 0x00000000069bab7e
8. ../src/Common/PODArray.h:152: void DB::PODArrayBase<1ul, 4096ul, Allocator<false, false>, 63ul, 64ul>::resize<>(unsigned long) @ 0x0000000006a44e40
9. ./build/../src/Columns/ColumnString.cpp:156: DB::ColumnString::insertRangeFrom(DB::IColumn const&, unsigned long, unsigned long) @ 0x00000000102f5c09
10. ./build/../src/Columns/ColumnTuple.cpp:370: DB::ColumnTuple::insertRangeFrom(DB::IColumn const&, unsigned long, unsigned long) @ 0x000000001031e8a0
11. ./build/../src/Columns/ColumnArray.cpp:605: DB::ColumnArray::insertRangeFrom(DB::IColumn const&, unsigned long, unsigned long) @ 0x0000000010195cb7
12. ./build/../utils/extern-local-engine/Storages/IO/NativeReader.cpp:150: local_engine::readNormalComplexData(DB::ReadBuffer&, COW<DB::IColumn>::immutable_ptr<DB::IColumn>&, unsigned long, local_engine::NativeReader::ColumnParseUtil&) @ 0x0000000006e5e0d5
13. ../contrib/llvm-project/libcxx/include/__functional/function.h:848: ? @ 0x0000000006e5d8f0
14. ./build/../utils/extern-local-engine/Storages/IO/NativeReader.cpp:71: local_engine::NativeReader::read() @ 0x0000000006e5be89
15. ./build/../utils/extern-local-engine/Shuffle/ShuffleReader.cpp:51: local_engine::ShuffleReader::read() @ 0x0000000006f47142
16. ./build/../utils/extern-local-engine/local_engine_jni.cpp:554: Java_org_apache_gluten_vectorized_CHStreamReader_nativeNext @ 0x00000000068b61d7

The total memory limit is 1.5 GiB, and there are multiple joins in the plan. 0x7f41e5ad8840 and 0x7f41ebf6c340 are the addresses of two GraceHashJoin instances. Each of them uses over 700 MB of memory.
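As a rough check of the numbers in the log, the snippet below adds up the last `total_bytes` value logged for each of the two joins and compares the sum with the 1.5 GiB limit. It assumes `total_bytes` approximates the memory held by each join's hash table, which may not match exactly what the memory tracker accounts for.

```python
GIB = 1024 ** 3

# Last total_bytes value logged for each GraceHashJoin before the exception.
last_logged_bytes = {
    "0x7f41ebf6c340": 817_657_976,
    "0x7f41e5ad8840": 813_696_216,
}

limit_bytes = int(1.5 * GIB)  # "maximum: 1.50 GiB" in the exception message
combined = sum(last_logged_bytes.values())

print(f"combined: {combined} bytes ({combined / GIB:.2f} GiB)")
print(f"limit:    {limit_bytes} bytes")
print("two joins alone exceed the limit:", combined > limit_bytes)
```

Under that assumption, the two hash tables on their own already add up to more than the total limit, before counting the shuffle reader allocation that finally raised the exception.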

lgbo-ustc commented 2 days ago

This could extend to more cases

A fixed byte limit cannot make the join spill appropriately in all of these cases.

By the way, aggregation already spills adaptively.

lgbo-ustc commented 2 days ago

Reimplementing a GraceHashJoin that can spill adaptively in Gluten is not a good choice. We need to find a bad case in CH so that we can convince CH to accept this modification.