numberlabs-developers / hudi

Upserts, Deletes And Incremental Processing on Big Data.
https://hudi.apache.org/
Apache License 2.0

2023-12-13 00:27:07,476 INFO [Executor task launch worker for task 577 #199

Open torvalds-dev-testbot[bot] opened 10 months ago

torvalds-dev-testbot[bot] commented 10 months ago

2023-12-13 00:27:07,477 INFO [Executor task launch worker for task 577.0 in stage 45181.0 (TID 16233208)] org.apache.hudi.common.util.collection.ExternalSpillableMap:Estimated Payload size => 2504
2023-12-13 00:27:07,478 INFO [Executor task launch worker for task 577.0 in stage 45181.0 (TID 16233208)] org.apache.hudi.common.util.collection.ExternalSpillableMap:New Estimated Payload size => 2845
2023-12-13 00:27:09,814 INFO [producer-thread-1] org.apache.hudi.common.util.queue.IteratorBasedQueueProducer:starting to buffer records
2023-12-13 00:27:09,821 INFO [consumer-thread-1] org.apache.hudi.common.util.queue.BoundedInMemoryExecutor:starting consumer thread
2023-12-13 00:27:09,855 INFO [producer-thread-1] org.apache.hudi.common.util.queue.IteratorBasedQueueProducer:starting to buffer records
2023-12-13 00:27:09,918 INFO [consumer-thread-1] org.apache.hudi.common.util.queue.BoundedInMemoryExecutor:starting consumer thread
2023-12-13 00:27:19,120 INFO [Executor task launch worker for task 577.0 in stage 45181.0 (TID 16233208)] org.apache.hudi.io.HoodieMergeHandle:Number of entries in MemoryBasedMap => 920285, Total size in bytes of MemoryBasedMap => 2618210917, Number of entries in BitCaskDiskMap => 0, Size of file spilled to disk => 0
2023-12-13 00:27:19,120 INFO [Executor task launch worker for task 577.0 in stage 45181.0 (TID 16233208)] org.apache.hudi.io.HoodieMergeHandle:partitionPath:tenant=aaaaaa/date=20231213, fileId to be merged:3d4538da-9810-445e-84ef-63b03719092b-0
2023-12-13 00:27:19,134 INFO [Executor task launch worker for task 577.0 in stage 45181.0 (TID 16233208)] org.apache.hudi.io.HoodieMergeHandle:Merging new data into oldPath <s3://some-s3-bucket/hudi/visibility=private/schema=scwx.process/tenant=aaaaaa/date=20231213/3d4538da-9810-445e-84ef-63b03719092b-0_616-45168-16228077_20231213002302278.parquet>, as newPath <s3://some-s3-bucket/hudi/visibility=private/schema=scwx.process/tenant=aaaaaa/date=20231213/3d4538da-9810-445e-84ef-63b03719092b-0_577-45181-16233208_20231213002634231.parquet>
2023-12-13 00:27:19,326 INFO [producer-thread-1] org.apache.hudi.common.util.queue.IteratorBasedQueueProducer:finished buffering records
2023-12-13 00:27:19,330 INFO [consumer-thread-1] org.apache.hudi.common.util.queue.BoundedInMemoryExecutor:Queue Consumption is done; notifying producer threads
2023-12-13 00:27:19,457 INFO [Executor task launch worker for task 577.0 in stage 45181.0 (TID 16233208)] org.apache.hudi.table.marker.DirectWriteMarkers:Creating Marker Path=<s3://some-s3-bucket/hudi/visibility=private/schema=scwx.process/.hoodie/.temp/20231213002634231/tenant=aaaaaa/date=20231213/3d4538da-9810-445e-84ef-63b03719092b-0_577-45181-16233208_20231213002634231.parquet.marker.MERGE>
2023-12-13 00:27:19,524 INFO [Executor task launch worker for task 577.0 in stage 45181.0 (TID 16233208)] org.apache.hudi.table.marker.DirectWriteMarkers:[direct] Created marker file <s3://some-s3-bucket/hudi/visibility=private/schema=scwx.process/.hoodie/.temp/20231213002634231/tenant=aaaaaa/date=20231213/3d4538da-9810-445e-84ef-63b03719092b-0_577-45181-16233208_20231213002634231.parquet.marker.MERGE> in 389 ms
2023-12-13 00:27:19,615 INFO [producer-thread-1] org.apache.hudi.common.util.queue.IteratorBasedQueueProducer:starting to buffer records
2023-12-13 00:27:19,616 INFO [consumer-thread-1] org.apache.hudi.common.util.queue.BoundedInMemoryExecutor:starting consumer thread
2023-12-13 00:27:25,880 INFO [producer-thread-1] org.apache.hudi.common.util.queue.IteratorBasedQueueProducer:finished buffering records
2023-12-13 00:27:25,880 INFO [consumer-thread-1] org.apache.hudi.common.util.queue.BoundedInMemoryExecutor:Queue Consumption is done; notifying producer threads
2023-12-13 00:27:32,378 INFO [producer-thread-1] org.apache.hudi.common.util.queue.IteratorBasedQueueProducer:finished buffering records
2023-12-13 00:27:32,381 INFO [consumer-thread-1] org.apache.hudi.common.util.queue.BoundedInMemoryExecutor:Queue Consumption is done; notifying producer threads
2023-12-13 00:29:13,455 INFO [Executor task launch worker for task 577.0 in stage 45181.0 (TID 16233208)] org.apache.hudi.io.HoodieMergeHandle:MergeHandle for partitionPath tenant=aaaaaa/date=20231213 fileID 3d4538da-9810-445e-84ef-63b03719092b-0, took 125979 ms.
torvalds-dev[bot] commented 10 months ago

torvalds.dev is analyzing the ticket

torvalds-dev[bot] commented 10 months ago

The issue appears to relate to Apache Hudi, a framework for managing the storage of large analytical datasets on distributed file systems (DFS). The logs provided are from a Hudi job performing a merge operation on a dataset.

Here's a brief summary of the logs:

Executor task 577.0 in stage 45181.0 (TID 16233208) is merging new data into an existing Parquet file (fileId 3d4538da-9810-445e-84ef-63b03719092b-0) in partition tenant=aaaaaa/date=20231213. The ExternalSpillableMap revises its estimated payload size from 2504 to 2845 bytes, the in-memory map ends up holding 920,285 entries (~2.6 GB) with nothing spilled to disk, a MERGE marker file is created on S3, and the merge completes in 125,979 ms (roughly two minutes).

The relevant files to this issue are:

  1. IOUtils.java: This file contains the IOUtils class, whose methods calculate the maximum memory allowed for merge and compaction tasks. It likely governs the memory allocation for the merge operation seen in the logs.

  2. TestSparkIOUtils.java: This file contains the TestSparkIOUtils class, which tests the methods in IOUtils. It is useful for understanding how the memory allocation methods are expected to behave.

Suggested actions:

  1. Check the memory allocation for the executor running the task. The logs show the in-memory merge map (MemoryBasedMap) grew to roughly 2.6 GB without spilling to disk. If the executor does not have that much headroom, it could cause memory pressure or failures.

  2. Verify the payload size estimates. The logs show the estimated payload size being revised from 2504 to 2845 bytes shortly after the task starts. If these estimates are too low, the spillable map may under-account its memory usage and spill later than it should.

  3. Monitor the producer and consumer threads. The logs show that these threads start and finish buffering and consuming records multiple times during the task. If these threads are not working correctly, it could affect the merge operation.

  4. Review the merge operation. The logs show that the task is merging new data into an existing file. If there are issues with the data or the merge operation itself, it could cause the task to fail or perform poorly.

  5. Check the creation of the marker file. The logs show that a marker file is created for the merge operation. If there are issues with the creation or use of this marker file, it could affect the merge operation.
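The memory-related checks above can be sketched as a quick back-of-the-envelope calculation. The configuration keys below are real Hudi write options, but the values and the `merge_map_will_spill` helper are illustrative assumptions taken from the figures in the logs, not tuning recommendations:

```python
# Sketch of Hudi memory-tuning options relevant to the merge logs above.
# The keys are real Hudi write configs; the values are illustrative only.
hudi_write_options = {
    # Upper bound (bytes) on memory a single partition merge may use before
    # HoodieMergeHandle's ExternalSpillableMap spills to disk.
    "hoodie.memory.merge.max.size": str(2 * 1024 ** 3),  # 2 GiB (assumed)
    # Fraction of available memory granted to the merge map.
    "hoodie.memory.merge.fraction": "0.6",
    # Local directory used when the spillable map overflows to disk.
    "hoodie.memory.spillable.map.path": "/tmp/hudi_spill",  # hypothetical path
}

def merge_map_will_spill(payload_size_bytes: int, num_entries: int,
                         max_merge_bytes: int) -> bool:
    """Rough check: would this many records at this estimated payload size
    exceed the in-memory budget and force entries into the BitCaskDiskMap?"""
    return payload_size_bytes * num_entries > max_merge_bytes

# Figures from the log lines above: ~920,285 entries at ~2,845 bytes each,
# i.e. about 2.6 GB held entirely in the MemoryBasedMap.
in_memory_bytes = 2845 * 920285
```

With a 2 GiB budget the map above would be expected to spill; the logs show zero entries in the BitCaskDiskMap, which suggests the configured merge memory on this job was larger than that.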