Open torvalds-dev-testbot[bot] opened 3 months ago
The structure you've provided looks suitable for an Apache Hudi table that has to serve several query patterns. To proceed with this design, configure the table's record key and its partition fields. Hudi uses the record key to identify a record, so the choice of key affects how write operations such as upserts and deletes behave. The partition fields determine how data is laid out on storage, and a good partitioning scheme can noticeably improve read and write throughput, especially for petabyte-scale tables. Choose both with the queries you've outlined in mind.
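As a rough illustration of the record key and partition settings mentioned above, here is a minimal Python sketch that builds the write options for a Hudi table. The option keys are standard Hudi datasource write configs; the table name and the column names (`event_id`, `event_date`, `updated_at`) are hypothetical placeholders, since the actual table layout is not shown here, and the helper `hudi_write_options` is just an illustrative name.

```python
def hudi_write_options(table_name, record_key, partition_field, precombine_field):
    """Build a dict of Hudi datasource write options (illustrative helper)."""
    return {
        "hoodie.table.name": table_name,
        # Column(s) that identify a record; comma-separate for a composite key.
        "hoodie.datasource.write.recordkey.field": record_key,
        # Column(s) used to lay data out into partitions on storage.
        "hoodie.datasource.write.partitionpath.field": partition_field,
        # Column used to pick the latest version when two writes share a key.
        "hoodie.datasource.write.precombine.field": precombine_field,
        "hoodie.datasource.write.operation": "upsert",
    }


# Hypothetical table and column names, chosen only for this example.
options = hudi_write_options("events", "event_id", "event_date", "updated_at")

# With a Spark session and the Hudi bundle available, the options would
# typically be passed to the writer like this (not executed here):
# df.write.format("hudi").options(**options).mode("append").save(base_path)
```

In general, a partition field that matches the most common query filter is a reasonable starting point, but the right choice depends on the actual columns and query mix.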
Ticket created successfully. Here is the link to the GitHub issue: https://github.com/numberlabs-developers/hudi/issues/228
Describe the problem you faced

I am a new user to Hudi and Parquet, and I have a table design question. I have structured my table in the following way: