apache / drill

Apache Drill is a distributed MPP query layer for self describing data
https://drill.apache.org/
Apache License 2.0

MergeJoin memory leak when depleting incoming batches throws an exception #2876

Closed shfshihuafeng closed 3 months ago

shfshihuafeng commented 5 months ago


Describe the bug
MergeJoin leaks memory when depleting the incoming batches throws an exception, because rightIterator is not closed when leftIterator throws an exception.

To Reproduce
Steps to reproduce the behavior:

  1. Prepare the TPC-H data (tpch1s).
  2. Run 20 concurrent instances of TPC-H query 8 (sql8).
  3. Set direct memory to 5 GB.
  4. When an OutOfMemoryException occurs, stop all queries.
  5. Check for the memory leak.

Expected behavior
Once all queries have stopped, direct memory usage should be 0 and the log should contain no leak entries like the following:

Allocator(op:2:0:11:MergeJoinPOP) 1000000/73728/4874240/10000000000 (res/actual/peak/limit)
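To make the figures in that log line easier to read, here is a minimal sketch that splits it into its four values. `AllocatorLine` is a hypothetical helper written for this thread, not part of Drill, and reading `actual` as the bytes still allocated is an assumption based on the `(res/actual/peak/limit)` label in the line itself.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not a Drill class): parses the four figures from an
// allocator log line of the form "Allocator(name) res/actual/peak/limit ...".
final class AllocatorLine {
    private static final Pattern FIGURES =
        Pattern.compile("Allocator\\(([^)]*)\\)\\s+(\\d+)/(\\d+)/(\\d+)/(\\d+)");

    final String name;
    final long reservation;
    final long actual;
    final long peak;
    final long limit;

    private AllocatorLine(String name, long reservation, long actual, long peak, long limit) {
        this.name = name;
        this.reservation = reservation;
        this.actual = actual;
        this.peak = peak;
        this.limit = limit;
    }

    // Throws IllegalArgumentException if the line does not match the expected shape.
    static AllocatorLine parse(String line) {
        Matcher m = FIGURES.matcher(line);
        if (!m.find()) {
            throw new IllegalArgumentException("not an allocator line: " + line);
        }
        return new AllocatorLine(
            m.group(1),
            Long.parseLong(m.group(2)),
            Long.parseLong(m.group(3)),
            Long.parseLong(m.group(4)),
            Long.parseLong(m.group(5)));
    }

    // Assumption: a non-zero "actual" after all queries stop indicates a leak.
    boolean hasOutstandingMemory() {
        return actual > 0;
    }
}
```

For the line above, `AllocatorLine.parse(...)` yields the name `op:2:0:11:MergeJoinPOP` with `actual` equal to 73728, which under that assumption is the memory the MergeJoin operator still holds after the queries have stopped.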

Error detail, log output or screenshots
Unable to allocate buffer of size XX (rounded from XX) due to memory limit (). Current allocation: xx

Drill version
(not specified)

Additional context

    select
      o_year,
      sum(case when nation = 'CHINA' then volume else 0 end) / sum(volume) as mkt_share
    from (
      select
        extract(year from o_orderdate) as o_year,
        l_extendedprice * 1.0 as volume,
        n2.n_name as nation
      from
        hive.tpch1s.part,
        hive.tpch1s.supplier,
        hive.tpch1s.lineitem,
        hive.tpch1s.orders,
        hive.tpch1s.customer,
        hive.tpch1s.nation n1,
        hive.tpch1s.nation n2,
        hive.tpch1s.region
      where
        p_partkey = l_partkey
        and s_suppkey = l_suppkey
        and l_orderkey = o_orderkey
        and o_custkey = c_custkey
        and c_nationkey = n1.n_nationkey
        and n1.n_regionkey = r_regionkey
        and r_name = 'ASIA'
        and s_nationkey = n2.n_nationkey
        and o_orderdate between date '1995-01-01' and date '1996-12-31'
        and p_type = 'LARGE BRUSHED BRASS'
    ) as all_nations
    group by o_year
    order by o_year

shfshihuafeng commented 5 months ago

I fixed it; see the attached 0001-mergejoin-leak.patch.
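The patch itself is not reproduced in this thread. As a minimal sketch of the general pattern the report points at (making sure the second resource is closed even when closing the first one throws), something like the following could be used. `CloseBothSketch` and `closeBoth` are hypothetical names for illustration only; this is not the code from the patch and not Drill's MergeJoin implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration, not taken from the patch or from Drill.
final class CloseBothSketch {

    // Attempts to close both resources. If closing the first one fails, the
    // second is still closed, and the first failure is rethrown afterwards.
    static void closeBoth(AutoCloseable left, AutoCloseable right) throws Exception {
        Exception failure = null;
        try {
            left.close();
        } catch (Exception e) {
            failure = e;
        }
        try {
            right.close();
        } catch (Exception e) {
            if (failure == null) {
                failure = e;
            } else {
                // Keep the original failure and attach the later one to it.
                failure.addSuppressed(e);
            }
        }
        if (failure != null) {
            throw failure;
        }
    }

    public static void main(String[] args) {
        List<String> closed = new ArrayList<>();
        // Stand-ins for the two iterators; the left one fails on close.
        AutoCloseable left = () -> {
            closed.add("left");
            throw new IllegalStateException("simulated failure");
        };
        AutoCloseable right = () -> closed.add("right");
        try {
            closeBoth(left, right);
        } catch (Exception e) {
            System.out.println("rethrown: " + e.getMessage());
        }
        System.out.println("close attempted on: " + closed);
    }
}
```

The design choice here is to remember the first exception rather than let it abort the method, so the caller still sees the failure while the buffers held by the second resource get released.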

shfshihuafeng commented 5 months ago

I think this has the same cause as https://github.com/apache/drill/issues/2871

shfshihuafeng commented 5 months ago

The scenario reproduces reliably with the test above. With my fix applied, I can no longer find the leak.