numberlabs-developers / hudi

Upserts, Deletes And Incremental Processing on Big Data.
https://hudi.apache.org/
Apache License 2.0

[SUPPORT] Handling multiple DeltaStreamers writing to the same target table #248

Open torvalds-dev-testbot[bot] opened 3 months ago

torvalds-dev-testbot[bot] commented 3 months ago

Relevant Sources:

https://medium.com/@simpsons/multi-writer-support-with-apache-hudi-e1b75dca29e6

https://hudi.apache.org/docs/faq_writing_tables

https://hudi.apache.org/docs/concurrency_control

https://github.com/apache/hudi/issues/10246

torvalds-dev-testbot[bot] commented 3 months ago

Follow-up questions:

  1. Which version of Hudi are you running, and what does the rest of your tech stack look like? If you are hitting an error, please provide the full stack trace and describe the operations that led to it.
  2. Which indexing scheme and which Hudi table type are you using?
  3. Have you set any specific configurations for the Hudi DeltaStreamers writing to the same target table?
  4. How do you manage the writer processes for that Hudi table?
  5. Could you share your Hudi timeline so we can see the sequence of operations performed by the DeltaStreamers writing to the table?

torvalds-dev-testbot[bot] commented 3 months ago

Sources:

  1. https://medium.com/@simpsons/multi-writer-support-with-apache-hudi-e1b75dca29e6
  2. https://hudi.apache.org/docs/concurrency_control
torvalds-dev-testbot[bot] commented 3 months ago

Ticket created successfully. Here is the link to the GitHub issue: https://github.com/numberlabs-developers/hudi/issues/248

torvalds-dev-testbot[bot] commented 3 months ago

Based on the documentation snippets, if two DeltaStreamers run against the same target table and concurrently write to overlapping file groups, one of them will succeed and the other will fail because of the conflict resolution mechanisms in place. This does not corrupt the table: the conflicting operation is detected and rejected, which preserves the integrity of the data being written.

Confidence Score: 93.46%
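
To make the "one succeeds, the other fails" behaviour concrete, here is a toy Python model of optimistic conflict detection on file groups. It is only a sketch of the idea described in the answer, not Hudi's implementation: `ToyTable`, `WriteConflictError`, and the `fg-*` identifiers are hypothetical names invented for illustration, and the model is single-threaded (a real deployment also needs a lock provider, as the concurrency control documentation linked above describes).

```python
from dataclasses import dataclass, field


class WriteConflictError(Exception):
    """Raised when a writer's file groups overlap a concurrent commit (hypothetical name)."""


@dataclass
class ToyTable:
    # Ordered log of completed commits: (writer name, file groups touched).
    commits: list = field(default_factory=list)

    def begin(self) -> int:
        # A writer remembers how many commits existed when it started.
        return len(self.commits)

    def commit(self, writer: str, started_at: int, file_groups: set) -> None:
        # Only commits that completed after this writer started are concurrent.
        for other_writer, other_groups in self.commits[started_at:]:
            overlap = other_groups & file_groups
            if overlap:
                raise WriteConflictError(
                    f"{writer} conflicts with {other_writer} on {sorted(overlap)}"
                )
        self.commits.append((writer, set(file_groups)))


table = ToyTable()

# Two writers start at the same point in the timeline.
start_a = table.begin()
start_b = table.begin()

# Writer A commits first and wins.
table.commit("streamer-A", start_a, {"fg-1", "fg-2"})

# Writer B touched an overlapping file group, so its commit is rejected.
try:
    table.commit("streamer-B", start_b, {"fg-2", "fg-3"})
    b_failed = False
except WriteConflictError as err:
    b_failed = True
    conflict_message = str(err)

# The table only contains the winning commit; nothing was half-written.
committed_writers = [w for w, _ in table.commits]
```

In this model the rejected writer leaves no trace in the commit log, which mirrors the claim that a conflict fails one writer rather than corrupting the table. A writer that starts after `streamer-A` has committed, or one that touches only disjoint file groups, would commit without error.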