prestodb / presto

The official home of the Presto distributed SQL query engine for big data
http://prestodb.io
Apache License 2.0

MoR _ro table vs _rt table #21033

Closed ChandrasekharPo-Kore closed 1 year ago

ChandrasekharPo-Kore commented 1 year ago

We have a MoR table and ran a compaction. In this state the table has no delta files. When I query table_ro and table_mr, table_mr always takes about 4x longer than table_ro. Inspecting the logs, we see that the Hudi Hadoop realtime Parquet reader makes multiple round trips for each Parquet file. These are all range requests for specific bytes. We saw a total of ~5-6 GET requests per Parquet file.

After compaction, the performance of _ro and _rt queries should not deviate much. Is this understanding correct?

Hudi: 0.12.3
Prestodb: 283
dfs: s3
metastore: hive

ChandrasekharPo-Kore commented 1 year ago

As shown below, there is a ~5 second interval between "Creating record reader" and "Enabling merged reading of realtime records for split". With debug logging enabled, we see ~5-6 S3 calls with byte-range requests.

2023-10-04T03:17:53.649Z INFO 20231004_031748_00024_mrwps.3.0.1.0-55-67 org.apache.hudi.hadoop.realtime.HoodieParquetRealtimeInputFormat Creating record reader with readCols :botid, Ids :18

2023-10-04T03:17:58.673Z INFO 20231004_031748_00024_mrwps.3.0.1.0-55-67 org.apache.hudi.hadoop.realtime.HoodieRealtimeRecordReader Enabling merged reading of realtime records for split
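The interval between the two log lines above can be measured directly from their timestamps. The following is a minimal Python sketch, not part of Presto or Hudi; the helper names `parse_timestamps` and `gaps_in_seconds` are hypothetical, and it assumes every log entry carries an ISO-8601 UTC timestamp with millisecond precision, as in the excerpt.

```python
import re
from datetime import datetime

# Matches timestamps such as 2023-10-04T03:17:53.649Z (format assumed from the excerpt).
TS_PATTERN = re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3}Z")


def parse_timestamps(log_text):
    """Return every timestamp found in the log text, in order of appearance."""
    return [
        datetime.strptime(ts, "%Y-%m-%dT%H:%M:%S.%fZ")
        for ts in TS_PATTERN.findall(log_text)
    ]


def gaps_in_seconds(log_text):
    """Return the elapsed seconds between each pair of consecutive timestamps."""
    stamps = parse_timestamps(log_text)
    return [round((b - a).total_seconds(), 3) for a, b in zip(stamps, stamps[1:])]


# The two entries quoted above, shortened to timestamp and message.
sample = (
    "2023-10-04T03:17:53.649Z INFO Creating record reader with readCols :botid, Ids :18\n"
    "2023-10-04T03:17:58.673Z INFO Enabling merged reading of realtime records for split\n"
)

print(gaps_in_seconds(sample))  # [5.024]
```

Running the same helper over a full task log lists every gap, which makes it easier to see which steps the time is spent in.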

Logs:

2023-10-04T03:17:53.649Z INFO 20231004_031748_00024_mrwps.3.0.1.0-55-67 org.apache.hudi.hadoop.realtime.HoodieParquetRealtimeInputFormat Before adding Hoodie columns, Projections :botid, Ids :18
2023-10-04T03:17:53.649Z INFO 20231004_031748_00024_mrwps.3.0.1.0-55-67 org.apache.hudi.hadoop.realtime.HoodieParquetRealtimeInputFormat Creating record reader with readCols :botid, Ids :18
2023-10-04T03:17:58.673Z INFO 20231004_031748_00024_mrwps.3.0.1.0-55-67 org.apache.hudi.hadoop.realtime.HoodieRealtimeRecordReader Enabling merged reading of realtime records for split HoodieRealtimeFileSplit{DataPath=s3://xxxxxxxxxxxxxxxx/hudi/messagestores_onlyparquet/2023-07-06/dc0e29a4-c846-47b9-bafe-712e3c452695-0_81-46-1202_20231003150054652.parquet, deltaLogPaths=[], maxCommitTime='20231003150054652', basePath='s3://xxxxxxxxxxxxxxxxxxx/hudi/messagestores_onlyparquet'}
2023-10-04T03:17:58.673Z INFO 20231004_031748_00024_mrwps.3.0.1.0-55-67 org.apache.hudi.hadoop.realtime.AbstractRealtimeRecordReader cfg ==> botid
2023-10-04T03:17:58.673Z INFO 20231004_031748_00024_mrwps.3.0.1.0-55-67 org.apache.hudi.hadoop.realtime.AbstractRealtimeRecordReader columnIds ==> 18
2023-10-04T03:17:58.673Z INFO 20231004_031748_00024_mrwps.3.0.1.0-55-67 org.apache.hudi.hadoop.realtime.AbstractRealtimeRecordReader partitioningColumns ==> date
2023-10-04T03:17:58.673Z INFO 20231004_031748_00024_mrwps.3.0.1.0-55-67 org.apache.hudi.common.table.HoodieTableMetaClient Loading HoodieTableMetaClient from s3://xxxxxxxxxxxxxxxxxxx/hudi/messagestores_onlyparquet
2023-10-04T03:17:59.649Z INFO 20231004_031748_00024_mrwps.3.0.1.0-55-67 org.apache.hudi.common.table.HoodieTableConfig Loading table properties from s3://xxxxxxxxxxxxxxxxxxx/hudi/messagestores_onlyparquet/.hoodie/hoodie.properties
2023-10-04T03:17:59.846Z INFO 20231004_031748_00024_mrwps.3.0.1.0-55-67 org.apache.hudi.common.table.HoodieTableMetaClient Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=PARQUET) from s3://xxxxxxxxxxxxxxxxxxxx/hudi/messagestores_onlyparquet
2023-10-04T03:17:59.846Z INFO 20231004_031748_00024_mrwps.3.0.1.0-55-67 org.apache.hudi.hadoop.realtime.AbstractRealtimeRecordReader usesCustomPayload ==> true
2023-10-04T03:17:59.846Z INFO 20231004_031748_00024_mrwps.3.0.1.0-55-67 org.apache.hudi.hadoop.realtime.AbstractRealtimeRecordReader Getting writer schema from table avro schema
2023-10-04T03:18:00.043Z INFO 20231004_031748_00024_mrwps.3.0.1.0-55-67 org.apache.hudi.common.table.timeline.HoodieActiveTimeline Loaded instants upto : Option{val=[20231003150054652deltacommitCOMPLETED]}
2023-10-04T03:18:01.181Z INFO 20231004_031748_00024_mrwps.3.0.1.0-55-67 org.apache.hudi.common.table.TableSchemaResolver Reading schema from s3://xxxxxxxxxxxxxxxxxxx/hudi/messagestores_onlyparquet/2023-07-05/12ff3541-be62-41c8-a570-20e555d373ac-0_14-40-1135_20231003150054652.parquet
2023-10-04T03:18:02.874Z INFO 20231004_031748_00024_mrwps.3.0.1.0-55-67 org.apache.hudi.hadoop.realtime.AbstractRealtimeRecordReader Hive Columns : _hoodie_commit_time,_hoodie_commit_seqno,_hoodie_record_key,_hoodie_partition_path,_hoodie_file_name,accountid,chnl,timestampvalue,type,orgid,sessionid,isbb,ms,isd,createdon,st,createdby,_id,botid,lang,tr1_t,tr0_o,tr0_i,tr0_t,tr1_o,eod,path,tn,tr_isss,nodetype,tr1_i,tr_pid,cluster_id,tr1_path,tr1_eod,tr1_nodetype,tr_errnid,tr_pid_path
2023-10-04T03:18:02.874Z INFO 20231004_031748_00024_mrwps.3.0.1.0-55-67 org.apache.hudi.hadoop.realtime.AbstractRealtimeRecordReader Hive Columns : _hoodie_commit_time,_hoodie_commit_seqno,_hoodie_record_key,_hoodie_partition_path,_hoodie_file_name,accountid,chnl,timestampvalue,type,orgid,sessionid,isbb,ms,isd,createdon,st,createdby,_id,botid,lang,tr1_t,tr0_o,tr0_i,tr0_t,tr1_o,eod,path,tn,tr_isss,nodetype,tr1_i,tr_pid,cluster_id,tr1_path,tr1_eod,tr1_nodetype,tr_errnid,tr_pid_path
2023-10-04T03:18:02.874Z INFO 20231004_031748_00024_mrwps.3.0.1.0-55-67 org.apache.hudi.hadoop.realtime.AbstractRealtimeRecordReader About to read compacted logs [] for base split s3://xxxxxxxxxxxxxxxxxxxxxx/hudi/messagestores_onlyparquet/2023-07-06/dc0e29a4-c846-47b9-bafe-712e3c452695-0_81-46-1202_20231003150054652.parquet, projecting cols [botid]
2023-10-04T03:18:02.874Z INFO 20231004_031748_00024_mrwps.3.0.1.0-55-67 org.apache.hudi.common.table.HoodieTableMetaClient Loading HoodieTableMetaClient from s3://xxxxxxxxxxxxxxxxxxx/hudi/messagestores_onlyparquet
2023-10-04T03:18:03.839Z INFO 20231004_031748_00024_mrwps.3.0.1.0-55-67 org.apache.hudi.common.table.HoodieTableConfig Loading table properties from s3://xxxxxxxxxxxxxxxxxxx/hudi/messagestores_onlyparquet/.hoodie/hoodie.properties
2023-10-04T03:18:04.033Z INFO 20231004_031748_00024_mrwps.3.0.1.0-55-67 org.apache.hudi.common.table.HoodieTableMetaClient Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=PARQUET) from s3://xxxxxxxxxxxxxxxxxxx/hudi/messagestores_onlyparquet
2023-10-04T03:18:04.232Z INFO 20231004_031748_00024_mrwps.3.0.1.0-55-67 org.apache.hudi.common.table.timeline.HoodieActiveTimeline Loaded instants upto : Option{val=[20231003150054652deltacommitCOMPLETED]}
2023-10-04T03:18:04.233Z INFO 20231004_031748_00024_mrwps.3.0.1.0-55-67 org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner Number of log files scanned => 0
2023-10-04T03:18:04.233Z INFO 20231004_031748_00024_mrwps.3.0.1.0-55-67 org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner MaxMemoryInBytes allowed for compaction => 805306368
2023-10-04T03:18:04.233Z INFO 20231004_031748_00024_mrwps.3.0.1.0-55-67 org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner Number of entries in MemoryBasedMap in ExternalSpillableMap => 0
2023-10-04T03:18:04.233Z INFO 20231004_031748_00024_mrwps.3.0.1.0-55-67 org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner Total size in bytes of MemoryBasedMap in ExternalSpillableMap => 0
2023-10-04T03:18:04.233Z INFO 20231004_031748_00024_mrwps.3.0.1.0-55-67 org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner Number of entries in BitCaskDiskMap in ExternalSpillableMap => 0
2023-10-04T03:18:04.233Z INFO 20231004_031748_00024_mrwps.3.0.1.0-55-67 org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner Size of file spilled to disk => 0

tdcmeehan commented 1 year ago

CC: @pratyakshsharma

pratyakshsharma commented 1 year ago

@ChandrasekharPo-Kore I assume table_mr refers to the real-time (_rt) table? And yes, performance should not deviate much after compaction for a MoR table.

Are you trying to compare the performance of _rt and _ro tables or MoR and CoW tables? Please confirm.

ChandrasekharPo-Kore commented 1 year ago

@pratyakshsharma
"I hope you are referring to real-time table with table_mr name?" - Yes.
"Are you trying to compare the performance of _rt and _ro tables or MoR and CoW tables? Please confirm" - Yes.

pratyakshsharma commented 1 year ago

"Are you trying to compare the performance of _rt and _ro tables or MoR and CoW tables? Please confirm - yes"

Sorry, the reply was not clear. I assume you are trying to do both of the comparisons mentioned above. This needs to be looked into. Let me circle back on this. cc @codope, if you already have some context on this, please let us know.

ChandrasekharPo-Kore commented 1 year ago

@pratyakshsharma Sorry for the confusion. To answer "Are you trying to compare the performance of _rt and _ro tables or MoR and CoW tables?": we are trying to compare the performance of the _rt and _ro tables.

ChandrasekharPo-Kore commented 1 year ago

This is resolved. The latency was due to cross-region calls: the S3 bucket was in another region (~200 ms latency). Hence closing this one.
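To get a feel for how per-request latency adds up, here is a rough back-of-the-envelope sketch in Python. It assumes the ~5-6 range requests per Parquet file are issued sequentially and that each costs a single ~200 ms round trip; both are simplifying assumptions rather than measured behaviour, and the function name `network_wait_seconds` is hypothetical.

```python
def network_wait_seconds(requests_per_file, round_trip_ms):
    """Estimate pure network wait for one file.

    Assumes requests run one after another and each costs one round trip.
    """
    return requests_per_file * round_trip_ms / 1000.0


# ~5-6 GET requests per Parquet file at ~200 ms cross-region latency.
for requests in (5, 6):
    print(requests, "requests ->", network_wait_seconds(requests, 200), "s")
```

Under these assumptions the range requests alone add roughly 1.0 to 1.2 seconds per file. This is only a lower bound: the realtime reader also loads table metadata and schema from S3 for each split, as the log excerpt shows, and those calls pay the same cross-region latency.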