[Open] zyclove opened this issue 1 year ago
@zyclove Can you post the timeline of when this error occurred and the table/writer configs?
After increasing the driver memory from 16G to 32G and adding the parameters --conf spark.network.timeout=6000s --conf spark.executor.heartbeatInterval=6000s, the run completed.
Why does the driver consume so much memory?
@ad1happy2go
@zyclove What are your instant sizes? Hudi loads the timeline instants in the driver, so if the instant sizes are too large it may fail with an OOM exception. You can also check whether your table contains too many small files. Share the writer configuration with us and I can check as well. Thanks.
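(For reference, one quick way to check recent commits from spark-sql is Hudi's show_commits call procedure; this is a minimal sketch, assuming the Hudi Spark SQL extensions are enabled in the session and using the table name from this thread, with the limit value purely illustrative:)

-- Show the last 10 commits: the files added/updated and bytes written per commit
-- help reveal whether many small files are being produced.
CALL show_commits(
  table => 'bi_ods_real.smart_datapoint_report_rw_clear_rt',
  limit => 10
);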
@ad1happy2go
Thank you very much for your reply despite your busy schedule. After looking at the output, there are indeed a lot of small files. How should we resolve this now? The job can no longer run. The table creation statement and execution script are as follows.
spark-sql> show create table bi_ods_real.smart_datapoint_report_rw_clear_rt;
CREATE TABLE `bi_ods_real`.`smart_datapoint_report_rw_clear_rt` (
`_hoodie_commit_time` STRING,
`_hoodie_commit_seqno` STRING,
`_hoodie_record_key` STRING,
`_hoodie_partition_path` STRING,
`_hoodie_file_name` STRING,
`id` STRING COMMENT 'id',
`uuid` STRING COMMENT 'log uuid',
`data_id` STRING,
`dev_id` STRING COMMENT 'id',
`gw_id` STRING,
`product_id` STRING,
`uid` STRING COMMENT 'user ID',
`dp_code` STRING,
`dp_id` STRING COMMENT 'dp point',
`gmtModified` STRING,
`dp_name` STRING,
`dp_time` STRING,
`dp_type` STRING,
`dp_value` STRING,
`gmt_modified` BIGINT COMMENT 'ct time',
`dt` STRING COMMENT 'time partition field',
`dp_mode` STRING)
USING hudi
PARTITIONED BY (dt, dp_mode)
COMMENT ''
TBLPROPERTIES (
'hoodie.bucket.index.num.buckets' = '50',
'primaryKey' = 'id',
'last_commit_time_sync' = '20231021185003298',
'hoodie.common.spillable.diskmap.type' = 'ROCKS_DB',
'hoodie.combine.before.upsert' = 'false',
'hoodie.compact.inline' = 'false',
'type' = 'mor',
'preCombineField' = 'gmt_modified',
'hoodie.datasource.write.partitionpath.field' = 'dt,dp_mode')
insert into bi_ods_real.smart_datapoint_report_rw_clear_rt
select
md5(concat(coalesce(data_id,''),coalesce(dev_id,''),coalesce(gw_id,''),coalesce(product_id,''),coalesce(uid,''),coalesce(dp_code,''),coalesce(dp_id,''),coalesce(gmtModified,''),if(dp_mode in ('ro','rw','wr'),dp_mode,'un'),coalesce(dp_name,''),coalesce(dp_time,''),coalesce(dp_type,''),coalesce(dp_value,''),coalesce(ct,''))) as id,
_hoodie_record_key as uuid,
data_id,dev_id,gw_id,product_id,uid,
dp_code,dp_id,gmtModified,if(dp_mode in ('ro','rw','wr'),dp_mode,'un') as dp_mode ,dp_name,dp_time,dp_type,dp_value,
ct as gmt_modified,
case
when length(ct)=10 then date_format(from_unixtime(ct),'yyyyMMddHH')
when length(ct)=13 then date_format(from_unixtime(ct/1000),'yyyyMMddHH')
else '1970010100' end as dt
from
hudi_table_changes('bi_ods_real.ods_log_smart_datapoint_report_batch_rt', 'latest_state', '20231114033500000', '20231114040500000')
lateral view dataPointExplode(split(value,'\001')[0]) dps as ct, data_id, dev_id, gw_id, product_id, uid, dp_code, dp_id, gmtModified, dp_mode, dp_name, dp_time, dp_type, dp_value
where _hoodie_commit_time >20231114033500000 and _hoodie_commit_time<=20231114040500000
The driver memory usage is shown in the attached screenshot.
Please help me find out how to solve it, thank you very much.
@ad1happy2go @codope Hi, could you please help me look into this problem and suggest a solution? Thanks.
@zyclove You can run clustering to merge these small files.
Also, can you let us know the timeline instant sizes under .hoodie/?
@ad1happy2go The timeline instants are not big, but there are 910 of them; the other files are at the KB level. In theory, such small files should not require this many resources.
So how do I run clustering to merge these small files? Can you provide a sample command?
@zyclove Refer to https://hudi.apache.org/docs/next/clustering/
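For example, clustering can be triggered from spark-sql with the run_clustering call procedure. This is a minimal sketch, assuming the Hudi Spark SQL extensions are enabled and using the table from this thread; verify the supported arguments against the clustering docs above for your Hudi version:

-- Schedule and execute a clustering plan that rewrites small files into larger ones.
CALL run_clustering(table => 'bi_ods_real.smart_datapoint_report_rw_clear_rt');

-- List the clustering instants and their state afterwards.
CALL show_clustering(table => 'bi_ods_real.smart_datapoint_report_rw_clear_rt');

If the defaults need adjusting, the small-file threshold and target file size used when building the clustering plan can be tuned with hoodie.clustering.plan.strategy.small.file.limit and hoodie.clustering.plan.strategy.target.file.max.bytes.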
hey, are we good to close out the ticket?
Describe the problem you faced
Why is the task suddenly out of memory? How can it be solved?
To Reproduce
Steps to reproduce the behavior:
1. The task suddenly runs out of memory; increasing the Spark executor and driver memory does not help.
Expected behavior
The job should complete without running out of memory.
Environment Description
Hudi version: 0.14.0
Spark version: 3.2.1
Hive version: 3.1.3
Hadoop version: 3.2.2
Storage (HDFS/S3/GCS..): cos
Running on Docker? (yes/no): no
Stacktrace