apache / hudi

Upserts, Deletes And Incremental Processing on Big Data.
https://hudi.apache.org/
Apache License 2.0
5.47k stars 2.43k forks source link

[SUPPORT] Trino queries failing when hudi.metadata_enabled is set to true. #9758

Open BalaMahesh opened 1 year ago

BalaMahesh commented 1 year ago

Tips before filing an issue

Describe the problem you faced

We have been using hudi 0.12.0 version and observed that our GCS listing and usage metrics have went high. When we brought this up in slack community, they have suggested to update to 0.13.1, we have updated our apps to 0.13.1 version and able to query the tables using hudi catalog from Trino. But when I tried setting the below property to leverage metadata to improve the query performance, we are unable to run the queries and seeing the errors.
hudi.metadata_enabled=true;

To Reproduce

Steps to reproduce the behavior:

  1. Enable metadata table in hudi 0.13.1
  2. Set hudi.metadata_enabled=true in hudi catalog on Trino.
  3. Now run any simple query to read data from hudi table.
  4. We will see the error.

Expected behavior

A clear and concise description of what you expected to happen.

Trino leverage the metadata and run the queries faster.

Environment Description

Additional context

Add any other context about the problem here.

Stacktrace

io.trino.spi.TrinoException: Failed to generate splits for
    at io.trino.plugin.hudi.HudiSplitSource.lambda$new$1(HudiSplitSource.java:98)
    at io.trino.plugin.hudi.split.HudiBackgroundSplitLoader$1.onFailure(HudiBackgroundSplitLoader.java:111)
    at com.google.common.util.concurrent.Futures$CallbackListener.run(Futures.java:1124)
    at com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:31)
    at com.google.common.util.concurrent.AbstractFuture.executeListener(AbstractFuture.java:1270)
    at com.google.common.util.concurrent.AbstractFuture.complete(AbstractFuture.java:1038)
    at com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:808)
    at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.afterRanInterruptiblyFailure(TrustedListenableFutureTask.java:141)
    at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:88)
    at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:82)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
    at java.base/java.lang.Thread.run(Thread.java:833)
Caused by: java.lang.NoClassDefFoundError: org/apache/hadoop/hbase/util/Bytes
    at org.apache.hudi.common.bootstrap.index.HFileBootstrapIndex.<clinit>(HFileBootstrapIndex.java:94)
    at java.base/java.lang.Class.forName0(Native Method)
    at java.base/java.lang.Class.forName(Class.java:375)
    at org.apache.hudi.common.util.ReflectionUtils.getClass(ReflectionUtils.java:54)
    at org.apache.hudi.common.util.ReflectionUtils.loadClass(ReflectionUtils.java:89)
    at org.apache.hudi.common.bootstrap.index.BootstrapIndex.getBootstrapIndex(BootstrapIndex.java:163)
    at org.apache.hudi.common.table.view.AbstractTableFileSystemView.init(AbstractTableFileSystemView.java:109)
    at org.apache.hudi.common.table.view.HoodieTableFileSystemView.init(HoodieTableFileSystemView.java:108)
    at org.apache.hudi.common.table.view.HoodieTableFileSystemView.<init>(HoodieTableFileSystemView.java:102)
    at org.apache.hudi.common.table.view.HoodieTableFileSystemView.<init>(HoodieTableFileSystemView.java:93)
    at org.apache.hudi.metadata.HoodieTableMetadataUtil.getFileSystemView(HoodieTableMetadataUtil.java:1030)
    at org.apache.hudi.metadata.HoodieTableMetadataUtil.getPartitionFileSlices(HoodieTableMetadataUtil.java:1048)
    at org.apache.hudi.metadata.HoodieTableMetadataUtil.getPartitionLatestMergedFileSlices(HoodieTableMetadataUtil.java:996)
    at org.apache.hudi.metadata.HoodieBackedTableMetadata.getPartitionFileSliceToKeysMapping(HoodieBackedTableMetadata.java:378)
    at org.apache.hudi.metadata.HoodieBackedTableMetadata.getRecordsByKeys(HoodieBackedTableMetadata.java:207)
    at org.apache.hudi.metadata.HoodieBackedTableMetadata.getRecordByKey(HoodieBackedTableMetadata.java:142)
    at org.apache.hudi.metadata.BaseTableMetadata.fetchAllFilesInPartition(BaseTableMetadata.java:323)
    at org.apache.hudi.metadata.BaseTableMetadata.getAllFilesInPartition(BaseTableMetadata.java:141)
    at org.apache.hudi.metadata.HoodieMetadataFileSystemView.listPartition(HoodieMetadataFileSystemView.java:65)
    at org.apache.hudi.common.table.view.AbstractTableFileSystemView.lambda$ensurePartitionLoadedCorrectly$9(AbstractTableFileSystemView.java:306)
    at java.base/java.util.concurrent.ConcurrentHashMap.computeIfAbsent(ConcurrentHashMap.java:1708)
    at org.apache.hudi.common.table.view.AbstractTableFileSystemView.ensurePartitionLoadedCorrectly(AbstractTableFileSystemView.java:297)
    at org.apache.hudi.common.table.view.AbstractTableFileSystemView.getLatestBaseFiles(AbstractTableFileSystemView.java:497)
    at io.trino.plugin.hudi.query.HudiReadOptimizedDirectoryLister.listStatus(HudiReadOptimizedDirectoryLister.java:104)
    at io.trino.plugin.hudi.split.HudiBackgroundSplitLoader.loadSplits(HudiBackgroundSplitLoader.java:94)
    at io.trino.plugin.hudi.split.HudiBackgroundSplitLoader.lambda$start$0(HudiBackgroundSplitLoader.java:74)
    at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
    at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:131)
    at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:74)
    ... 4 more
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.util.Bytes
    at java.base/java.net.URLClassLoader.findClass(URLClassLoader.java:445)
    at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:587)
    at io.trino.server.PluginClassLoader.loadClass(PluginClassLoader.java:126)
    at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:520)
    ... 33 more
codope commented 1 year ago

Can you please try upgrading to the latest version of Trino? HBase was not included as compile-time dependency previously. If upgrading to Trino is not an immediate option for you, then you will have place hbase-common jars manually under <TRINO_HOME>plugin/hudi/.

BalaMahesh commented 1 year ago

hey @codope , I had to add bunch of jars along with hbase-common to get the query running after enabling hudi.metadata_enabled=true.

What I have observed is that the query is taking longer time with this property set than the one without using metadata_enabled.

Also I have observed that hudi is creating bunch of log files in metadata folder even after I have set the configurations

hoodie.metadata.enable=true hoodie.write.concurrency.mode=optimistic_concurrency_control hoodie.write.lock.provider=org.apache.hudi.client.transaction.lock.InProcessLockProvider hoodie.metadata.compact.max.delta.commits=1

and also it seems we don't have control over

hoodie.metadata.cleaner.commits.retained - to keep the number of log and base files on metadata folder.

With these many number of files, when we run the query, Trino is trying to run compaction on these log and base files in metadata folder and then reading the parquet files which is adding some delay and latency in query run times.

Is this behaviour expected ? We are assuming that queries will be faster with metadata enabled but what we are seeing is reverse.

There are 640 files lying on the metadata folder.

Screenshot 2023-09-26 at 11 26 47 AM

2023-09-26T11:02:12.915+0530    INFO    Query-20230925_142458_00023_we5zs-956   org.apache.hudi.common.table.HoodieTableMetaClient  Loading HoodieTableMetaClient from gs://bucket/hudi/db/table
2023-09-26T11:02:13.047+0530    INFO    Query-20230925_142458_00023_we5zs-956   org.apache.hudi.common.table.HoodieTableConfig  Loading table properties from gs://bucket/hudi/db/table/.hoodie/hoodie.properties
2023-09-26T11:02:13.482+0530    INFO    Query-20230925_142458_00023_we5zs-956   org.apache.hudi.common.table.HoodieTableMetaClient  Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=PARQUET) from gs://bucket/hudi/db/table
2023-09-26T11:02:13.788+0530    INFO    Query-20230925_142458_00023_we5zs-956   org.apache.hudi.common.table.timeline.HoodieActiveTimeline  Loaded instants upto : Option{val=[20230925214447606__deltacommit__COMPLETED]}
2023-09-26T11:02:13.788+0530    INFO    Query-20230925_142458_00023_we5zs-956   org.apache.hudi.common.table.view.FileSystemViewManager Creating InMemory based view for basePath gs://bucket/hudi/db/table
2023-09-26T11:02:13.789+0530    INFO    Query-20230925_142458_00023_we5zs-956   org.apache.hudi.common.table.view.AbstractTableFileSystemView   Took 1 ms to read  0 instants, 0 replaced file groups
2023-09-26T11:02:13.794+0530    INFO    Query-20230925_142458_00023_we5zs-956   org.apache.hudi.common.util.ClusteringUtils Found 0 files in pending clustering operations
2023-09-26T11:02:13.796+0530    INFO    hudi-split-manager-22   org.apache.hudi.common.table.view.AbstractTableFileSystemView   Building file system view for partition ()
2023-09-26T11:02:13.938+0530    INFO    hudi-split-manager-22   org.apache.hudi.common.table.view.AbstractTableFileSystemView   addFilesToView: NumFiles=12, NumFileGroups=1, FileGroupsCreationTime=2, StoreTimeTaken=0
2023-09-26T11:02:15.720+0530    INFO    dispatcher-query-45 io.trino.event.QueryMonitor TIMELINE: Query 20230925_142458_00023_we5zs :: FINISHED :: elapsed 6814ms :: planning 1191ms :: waiting 2037ms :: scheduling 3855ms :: running 1621ms :: finishing 147ms :: begin 2023-09-26T11:02:08.906+05:30 :: end 2023-09-26T11:02:15.720+05:30

query logs with hudi.metadata_enabled=true.

2023-09-26T11:04:12.748+0530    INFO    Query-20230925_142658_00025_we5zs-1005  org.apache.hudi.common.table.HoodieTableMetaClient  Loading HoodieTableMetaClient from gs://bucket/hudi/db/table
2023-09-26T11:04:12.927+0530    INFO    Query-20230925_142658_00025_we5zs-1005  org.apache.hudi.common.table.HoodieTableConfig  Loading table properties from gs://bucket/hudi/db/table/.hoodie/hoodie.properties
2023-09-26T11:04:13.379+0530    INFO    Query-20230925_142658_00025_we5zs-1005  org.apache.hudi.common.table.HoodieTableMetaClient  Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=PARQUET) from gs://bucket/hudi/db/table
2023-09-26T11:04:13.626+0530    INFO    Query-20230925_142658_00025_we5zs-1005  org.apache.hudi.common.table.timeline.HoodieActiveTimeline  Loaded instants upto : Option{val=[20230925214447606__deltacommit__COMPLETED]}
2023-09-26T11:04:13.627+0530    INFO    Query-20230925_142658_00025_we5zs-1005  org.apache.hudi.common.table.view.FileSystemViewManager Creating InMemory based view for basePath gs://bucket/hudi/db/table
2023-09-26T11:04:13.627+0530    INFO    Query-20230925_142658_00025_we5zs-1005  org.apache.hudi.common.table.view.AbstractTableFileSystemView   Took 0 ms to read  0 instants, 0 replaced file groups
2023-09-26T11:04:13.628+0530    INFO    Query-20230925_142658_00025_we5zs-1005  org.apache.hudi.common.util.ClusteringUtils Found 0 files in pending clustering operations
2023-09-26T11:04:13.629+0530    INFO    Query-20230925_142658_00025_we5zs-1005  org.apache.hudi.common.table.HoodieTableMetaClient  Loading HoodieTableMetaClient from gs://bucket/hudi/db/table
2023-09-26T11:04:13.814+0530    INFO    Query-20230925_142658_00025_we5zs-1005  org.apache.hudi.common.table.HoodieTableConfig  Loading table properties from gs://bucket/hudi/db/table/.hoodie/hoodie.properties
2023-09-26T11:04:14.273+0530    INFO    Query-20230925_142658_00025_we5zs-1005  org.apache.hudi.common.table.HoodieTableMetaClient  Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=PARQUET) from gs://bucket/hudi/db/table
2023-09-26T11:04:14.273+0530    INFO    Query-20230925_142658_00025_we5zs-1005  org.apache.hudi.common.table.HoodieTableMetaClient  Loading HoodieTableMetaClient from gs://bucket/hudi/db/table/.hoodie/metadata
2023-09-26T11:04:14.499+0530    INFO    Query-20230925_142658_00025_we5zs-1005  org.apache.hudi.common.table.HoodieTableConfig  Loading table properties from gs://bucket/hudi/db/table/.hoodie/metadata/.hoodie/hoodie.properties
2023-09-26T11:04:14.969+0530    INFO    Query-20230925_142658_00025_we5zs-1005  org.apache.hudi.common.table.HoodieTableMetaClient  Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=HFILE) from gs://bucket/hudi/db/table/.hoodie/metadata
2023-09-26T11:04:14.974+0530    INFO    hudi-split-manager-25   org.apache.hudi.common.table.view.AbstractTableFileSystemView   Building file system view for partition ()
2023-09-26T11:04:14.976+0530    INFO    hudi-split-manager-25   org.apache.hudi.metadata.HoodieTableMetadataUtil    Loading latest merged file slices for metadata table partition files
2023-09-26T11:04:15.451+0530    INFO    hudi-split-manager-25   org.apache.hudi.common.table.timeline.HoodieActiveTimeline  Loaded instants upto : Option{val=[20230925214447606__deltacommit__COMPLETED]}
2023-09-26T11:04:15.451+0530    INFO    hudi-split-manager-25   org.apache.hudi.common.table.view.AbstractTableFileSystemView   Took 0 ms to read  0 instants, 0 replaced file groups
2023-09-26T11:04:15.599+0530    INFO    hudi-split-manager-25   org.apache.hudi.common.util.ClusteringUtils Found 0 files in pending clustering operations
2023-09-26T11:04:15.600+0530    INFO    hudi-split-manager-25   org.apache.hudi.common.table.view.AbstractTableFileSystemView   Building file system view for partition (files)
2023-09-26T11:04:16.163+0530    INFO    hudi-split-manager-25   org.apache.hudi.common.table.view.AbstractTableFileSystemView   addFilesToView: NumFiles=640, NumFileGroups=1, FileGroupsCreationTime=94, StoreTimeTaken=1
2023-09-26T11:04:16.808+0530    INFO    hudi-split-manager-25   org.apache.hudi.metadata.HoodieBackedTableMetadata  Opened metadata base file from gs://bucket/hudi/db/table/.hoodie/metadata/files/files-0000_0-14014-15896_20230925214227853001.hfile at instant 20230925214227853001 in 638 ms
2023-09-26T11:04:17.016+0530    INFO    hudi-split-manager-25   org.apache.hudi.common.table.timeline.HoodieActiveTimeline  Loaded instants upto : Option{val=[20230925214447606__deltacommit__COMPLETED]}
2023-09-26T11:04:17.023+0530    INFO    hudi-split-manager-25   org.apache.hudi.common.table.HoodieTableMetaClient  Loading HoodieTableMetaClient from gs://bucket/hudi/db/table/.hoodie/metadata
2023-09-26T11:04:17.288+0530    INFO    hudi-split-manager-25   org.apache.hudi.common.table.HoodieTableConfig  Loading table properties from gs://bucket/hudi/db/table/.hoodie/metadata/.hoodie/hoodie.properties
2023-09-26T11:04:17.744+0530    INFO    hudi-split-manager-25   org.apache.hudi.common.table.HoodieTableMetaClient  Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=HFILE) from gs://bucket/hudi/db/table/.hoodie/metadata
2023-09-26T11:04:17.929+0530    INFO    hudi-split-manager-25   org.apache.hudi.common.table.timeline.HoodieActiveTimeline  Loaded instants upto : Option{val=[20230925214447606__deltacommit__COMPLETED]}
2023-09-26T11:04:18.399+0530    INFO    hudi-split-manager-25   org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader  Scanning log file HoodieLogFile{pathStr='gs://bucket/hudi/db/table/.hoodie/metadata/files/.files-0000_20230925214227853001.log.1_0-14021-15903', fileLen=-1}
2023-09-26T11:04:18.400+0530    INFO    hudi-split-manager-25   org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader  Reading a data block from file gs://bucket/hudi/db/table/.hoodie/metadata/files/.files-0000_20230925214227853001.log.1_0-14021-15903 at instant 20230925214447606
2023-09-26T11:04:18.400+0530    INFO    hudi-split-manager-25   org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader  Merging the final data blocks
2023-09-26T11:04:18.400+0530    INFO    hudi-split-manager-25   org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader  Number of remaining logblocks to merge 1
2023-09-26T11:04:18.421+0530    INFO    hudi-split-manager-25   org.apache.hudi.common.util.collection.ExternalSpillableMap Estimated Payload size => 504
2023-09-26T11:04:18.421+0530    INFO    hudi-split-manager-25   org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner   Number of log files scanned => 1
2023-09-26T11:04:18.421+0530    INFO    hudi-split-manager-25   org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner   MaxMemoryInBytes allowed for compaction => 1073741824
2023-09-26T11:04:18.421+0530    INFO    hudi-split-manager-25   org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner   Number of entries in MemoryBasedMap in ExternalSpillableMap => 2
2023-09-26T11:04:18.421+0530    INFO    hudi-split-manager-25   org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner   Total size in bytes of MemoryBasedMap in ExternalSpillableMap => 1008
2023-09-26T11:04:18.421+0530    INFO    hudi-split-manager-25   org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner   Number of entries in BitCaskDiskMap in ExternalSpillableMap => 0
2023-09-26T11:04:18.421+0530    INFO    hudi-split-manager-25   org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner   Size of file spilled to disk => 0
2023-09-26T11:04:18.421+0530    INFO    hudi-split-manager-25   org.apache.hudi.metadata.HoodieBackedTableMetadata  Opened 1 metadata log files (dataset instant=20230925214447606, metadata instant=20230925214447606) in 1609 ms
2023-09-26T11:04:18.428+0530    INFO    hudi-split-manager-25   org.apache.hudi.metadata.BaseTableMetadata  Listed file in partition from metadata: partition=, #files=10
2023-09-26T11:04:18.434+0530    INFO    hudi-split-manager-25   org.apache.hudi.common.table.view.AbstractTableFileSystemView   addFilesToView: NumFiles=10, NumFileGroups=1, FileGroupsCreationTime=6, StoreTimeTaken=0
2023-09-26T11:04:19.704+0530    INFO    dispatcher-query-51 io.trino.event.QueryMonitor TIMELINE: Query 20230925_142658_00025_we5zs :: FINISHED :: elapsed 11021ms :: planning 1219ms :: waiting 2086ms :: scheduling 8620ms :: running 1020ms :: finishing 162ms :: begin 2023-09-26T11:04:08.681+05:30 :: end 2023-09-26T11:04:19.702+05:30
BalaMahesh commented 1 year ago

@codope - any fix for this behaviour ? we started seeing OOM errors on the executor pods because of these large number of metadata files.


Id='HoodieFileGroupId{partitionPath='files', fileId='files-0000'}', metrics={TOTAL_LOG_FILES=671.0, TOTAL_IO_READ_MB=7.0, TOTAL_LOG_FILES_SIZE=7456183.0, TOTAL_IO_WRITE_MB=0.0, TOTAL_IO_MB=7.0}, bootstrapFilePath=Optional.empty}] files
23/09/28 03:55:29 INFO SparkContext: Starting job: collect at HoodieJavaRDD.java:163
23/09/28 03:55:29 INFO DAGScheduler: Got job 30 (collect at HoodieJavaRDD.java:163) with 1 output partitions
23/09/28 03:55:29 INFO DAGScheduler: Final stage: ResultStage 54 (collect at HoodieJavaRDD.java:163)
23/09/28 03:55:29 INFO DAGScheduler: Parents of final stage: List()
23/09/28 03:55:29 INFO DAGScheduler: Missing parents: List()
23/09/28 03:55:29 INFO DAGScheduler: Submitting ResultStage 54 (MapPartitionsRDD[108] at map at HoodieJavaRDD.java:111), which has no missing parents
23/09/28 03:55:29 INFO MemoryStore: Block broadcast_40 stored as values in memory (estimated size 384.6 KiB, free 1048.4 MiB)
23/09/28 03:55:29 INFO MemoryStore: Block broadcast_40_piece0 stored as bytes in memory (estimated size 130.1 KiB, free 1048.3 MiB)
23/09/28 03:55:29 INFO BlockManagerInfo: Added broadcast_40_piece0 in memory on glance-grap-studio-bubble-384cd48ad9e9f6d8-driver-svc.spark-jobs.svc:7079 (size: 130.1 KiB, free: 1048.7 MiB)
23/09/28 03:55:29 INFO SparkContext: Created broadcast 40 from broadcast at DAGScheduler.scala:1478
23/09/28 03:55:29 INFO DAGScheduler: Submitting 1 missing tasks from ResultStage 54 (MapPartitionsRDD[108] at map at HoodieJavaRDD.java:111) (first 15 tasks are for partitions Vector(0))
23/09/28 03:55:29 INFO TaskSchedulerImpl: Adding task set 54.0 with 1 tasks resource profile 0
23/09/28 03:55:29 INFO TaskSetManager: Starting task 0.0 in stage 54.0 (TID 226) (10.207.40.187, executor 1, partition 0, PROCESS_LOCAL, 42753 bytes) taskResourceAssignments Map()
23/09/28 03:55:29 INFO BlockManagerInfo: Added broadcast_40_piece0 in memory on 10.207.40.187:34929 (size: 130.1 KiB, free: 7.6 GiB)
23/09/28 03:55:34 INFO HoodieAsyncService: Waiting for next instant up to 10 seconds
23/09/28 03:55:44 INFO HoodieAsyncService: Waiting for next instant up to 10 seconds
23/09/28 03:55:54 INFO HoodieAsyncService: Waiting for next instant up to 10 seconds
23/09/28 03:56:04 INFO HoodieAsyncService: Waiting for next instant up to 10 seconds
23/09/28 03:56:14 INFO HoodieAsyncService: Waiting for next instant up to 10 seconds
23/09/28 03:56:24 INFO HoodieAsyncService: Waiting for next instant up to 10 seconds
23/09/28 03:56:34 INFO HoodieAsyncService: Waiting for next instant up to 10 seconds
23/09/28 03:56:44 INFO HoodieAsyncService: Waiting for next instant up to 10 seconds
23/09/28 03:56:54 INFO HoodieAsyncService: Waiting for next instant up to 10 seconds
23/09/28 03:57:04 INFO HoodieAsyncService: Waiting for next instant up to 10 seconds
23/09/28 03:57:04 WARN TaskSetManager: Lost task 0.0 in stage 54.0 (TID 226) (10.207.40.187 executor 1): java.lang.OutOfMemoryError: Java heap space
    at java.base/java.io.BufferedInputStream.<init>(Unknown Source)
    at org.apache.hadoop.fs.BufferedFSInputStream.<init>(BufferedFSInputStream.java:56)
    at org.apache.hudi.common.table.log.HoodieLogFileReader.getFSDataInputStreamForGCS(HoodieLogFileReader.java:541)
    at org.apache.hudi.common.table.log.HoodieLogFileReader.getFSDataInputStream(HoodieLogFileReader.java:502)
    at org.apache.hudi.common.table.log.HoodieLogFileReader.<init>(HoodieLogFileReader.java:118)
    at org.apache.hudi.common.table.log.HoodieLogFormatReader.hasNext(HoodieLogFormatReader.java:110)
    at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.scanInternalV1(AbstractHoodieLogRecordReader.java:247)
    at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.scanInternal(AbstractHoodieLogRecordReader.java:223)
    at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner.performScan(HoodieMergedLogRecordScanner.java:198)
    at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner.<init>(HoodieMergedLogRecordScanner.java:114)
    at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner.<init>(HoodieMergedLogRecordScanner.java:73)
    at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner$Builder.build(HoodieMergedLogRecordScanner.java:464)
    at org.apache.hudi.table.action.compact.HoodieCompactor.compact(HoodieCompactor.java:204)
    at org.apache.hudi.table.action.compact.HoodieCompactor.lambda$compact$9cd4b1be$1(HoodieCompactor.java:129)
    at org.apache.hudi.table.action.compact.HoodieCompactor$$Lambda$2356/0x0000000841269c40.apply(Unknown Source)
    at org.apache.hudi.data.HoodieJavaRDD$$Lambda$1876/0x0000000840ee1040.call(Unknown Source)
    at org.apache.spark.api.java.JavaPairRDD$.$anonfun$toScalaFunction$1(JavaPairRDD.scala:1070)
    at org.apache.spark.api.java.JavaPairRDD$$$Lambda$655/0x000000084060e840.apply(Unknown Source)
    at scala.collection.Iterator$$anon$10.next(Iterator.scala:461)
    at scala.collection.Iterator$$anon$11.nextCur(Iterator.scala:486)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:492)
    at org.apache.spark.storage.memory.MemoryStore.putIterator(MemoryStore.scala:223)
    at org.apache.spark.storage.memory.MemoryStore.putIteratorAsBytes(MemoryStore.scala:352)
    at org.apache.spark.storage.BlockManager.$anonfun$doPutIterator$1(BlockManager.scala:1498)
    at org.apache.spark.storage.BlockManager$$Lambda$633/0x00000008405ad840.apply(Unknown Source)
    at org.apache.spark.storage.BlockManager.org$apache$spark$storage$BlockManager$$doPut(BlockManager.scala:1408)
    at org.apache.spark.storage.BlockManager.doPutIterator(BlockManager.scala:1472)
    at org.apache.spark.storage.BlockManager.getOrElseUpdate(BlockManager.scala:1295)
    at org.apache.spark.rdd.RDD.getOrCompute(RDD.scala:384)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:335)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)

23/09/28 03:57:04 INFO TaskSetManager: Starting task 0.1 in stage 54.0 (TID 227) (10.207.40.187, executor 1, partition 0```
codope commented 1 year ago

Do you see any metadata compaction in the metadata timeline (under <table_base_path>/.hoodie/metadata/.hoodie)? It appears as if metadata compaction has not happened and there isn't enough memory to merge the metadata records. We can increase the memory fraction.

ad1happy2go commented 1 year ago

@BalaMahesh Any update on this? Did above info helped?

BalaMahesh commented 1 year ago

@ad1happy2go / @codope -

hoodie.metadata.compact.max.delta.commits=1

with this config I expect compaction to run after every metadata commit, but that is not happening and all the delta commits are being piled up on metadata path and after accumulating large number of files, compaction is being triggered and failing with OOM error.

I didn't get enough time to go through the code flow for this metadata table.

We can increase the memory fraction - if this the problem it should at least trigger the compaction after every delta commit and fail right with hoodie.metadata.compact.max.delta.commits set to 1.

BalaMahesh commented 1 year ago

hoodie.clean.async=false

after setting this to false, compaction is being triggered for the metadata table, earlier there are pendingInstants of delta commits always because async clean kicks in and creates a new delta commit on metadata table timeline which is latest than the one on data timeline.

List<HoodieInstant> pendingInstants = dataMetaClient.reloadActiveTimeline().filterInflightsAndRequested()
        .findInstantsBeforeOrEquals(latestDeltaCommitTimeInMetadataTable).getInstants();

    if (!pendingInstants.isEmpty()) {
      LOG.info(String.format(
          "Cannot compact metadata table as there are %d inflight instants in data table before latest deltacommit in metadata table: %s. Inflight instants in data table: %s",
          pendingInstants.size(), latestDeltaCommitTimeInMetadataTable, Arrays.toString(pendingInstants.toArray())));
      return;
    }

This piece of code in HoodieBackedTableMetadataWriter ,

if (lastCompletedCompactionInstant.isPresent()
        && metadataMetaClient.getActiveTimeline().filterCompletedInstants()
            .findInstantsAfter(lastCompletedCompactionInstant.get().getTimestamp()).countInstants() < 3) {

To clean the metadata files , we should always keep:

hoodie.metadata.compact.max.delta.commits=3

else files will be never cleaned and piled up.

After making these changes in config, I can see cleaner metadata files partition, earlier it was piled up with old files.

codope commented 1 year ago

@danny0405 I see that the the limit was hard-coded to 3 in https://github.com/apache/hudi/commit/f1286c2c764d6be9f23b41c76f4de1c8734c1f3b. Should we determine this based on compaction max delta commits?