Open torvalds-dev[bot] opened 3 weeks ago
Relevant Sources:
https://api.github.com/repos/apache/hudi/issues/10228
https://api.github.com/repos/apache/hudi/issues/10203
https://apache-hudi.slack.com/archives/C4D716NPQ/p1713562820086629
https://apache-hudi.slack.com/archives/C4D716NPQ/p1713166725381159
https://api.github.com/repos/apache/hudi/issues/6297
Hello! Please wait while I search the knowledge base for an answer; this usually takes about 30 seconds.
The query performance degradation you are seeing could be related to incremental queries in Apache Hudi. The onehouse.ai blog post on incremental read patterns notes that incremental queries need special handling because Hudi assigns each instant time at the start of its transaction. As a result, an incremental query does not serve data from an in-flight instant or anything after it, which ensures that no records are lost or dropped. The post also notes that Hudi favors safety over liveness to preserve data integrity during incremental reads, and it describes how Hudi handles scenarios such as errors to keep data processing consistent and safe. This behavior may help explain the degradation you are experiencing.
Source: https://www.onehouse.ai/blog/on-iceberg-and-hudi-acid-guarantees
Confidence Score: 90.66%
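To make the in-flight rule above concrete, here is a minimal sketch in plain Python. It is a simplified model, not Hudi's actual code: the `Instant` class, the `readable_instants` helper, the sample timestamps, and the choice to treat the begin time as exclusive are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Instant:
    """Hypothetical stand-in for one entry on a table timeline."""
    time: str        # instant time, assigned when the transaction starts
    completed: bool  # False while the write is still in flight


def readable_instants(timeline: List[Instant], begin_time: str) -> List[str]:
    """Return the instant times an incremental reader may consume.

    Assumed rule: serve only instants after begin_time (exclusive)
    and strictly before the earliest in-flight instant, so a write
    that started earlier but finishes later is never skipped.
    """
    ordered = sorted(timeline, key=lambda i: i.time)
    inflight = [i.time for i in ordered if not i.completed]
    barrier = inflight[0] if inflight else None

    result = []
    for instant in ordered:
        if instant.time <= begin_time:
            continue  # already consumed by a previous read
        if barrier is not None and instant.time >= barrier:
            break     # stop at the in-flight instant and beyond
        result.append(instant.time)
    return result


# Illustrative timeline: the middle write has started but not finished.
timeline = [
    Instant("20240401100000", True),
    Instant("20240401101000", False),
    Instant("20240401102000", True),
]
```

In this model the third instant is already complete but stays hidden until the second one finishes, which is the safety-over-liveness trade-off described above: readers may see data later than strictly necessary, but they do not miss records.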
Describe the problem you faced Torvalds.Dev reported degraded table query performance and requested assistance.
Additional context
Environment Description Please provide the following information:
Stacktrace No stacktrace provided.