numberlabs-developers / hudi

Upserts, Deletes And Incremental Processing on Big Data.
https://hudi.apache.org/
Apache License 2.0

[SUPPORT] Issue with row level deletion not working #252

Open torvalds-dev-testbot[bot] opened 2 months ago

torvalds-dev-testbot[bot] commented 2 months ago


Describe the problem you faced

I am attempting row-level deletion on my COW (copy-on-write) table using the delete write operation. When I run my PySpark job, no rows are deleted, even though I confirmed the filter is correct by printing the selected rows before the deletion. No errors are reported, so I am looking for help debugging this.

To Reproduce

Steps to reproduce the behavior:

  1. Execute PySpark job for row-level deletion.
  2. Check if any rows are deleted.
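
Step 2 can be made concrete by counting the rows that match the delete filter before and after the job runs. The following is a minimal sketch, not the reporter's actual code: count_matching, rows_removed, base_path, and condition are hypothetical names introduced for illustration, and spark is assumed to be an existing SparkSession with the Hudi bundle available.

```python
def count_matching(spark, base_path, condition):
    """Count rows in the Hudi table at base_path that satisfy a SQL condition.

    Assumes `spark` is a SparkSession that can read the "hudi" format.
    """
    table = spark.read.format("hudi").load(base_path)
    return table.filter(condition).count()


def rows_removed(count_before, count_after):
    """Number of matching rows that disappeared between two snapshots."""
    return count_before - count_after
```

Calling count_matching once before and once after the delete job, with the same condition used to select the rows to delete, gives the two counts. A rows_removed result of 0 corresponds to the behavior reported here.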

Expected behavior

I expect the specified rows to be deleted based on the provided configuration.


torvalds-dev-testbot[bot] commented 2 months ago

Relevant Sources:

https://www.linkedin.com/pulse/small-talk-apache-hudi-ashok-kumar-kunkala-3ldge/

https://hudi.apache.org/docs/quick-start-guide

https://datacouch.medium.com/getting-started-with-apache-hudi-711b89c107aa

https://apache-hudi.slack.com/archives/C4D716NPQ/p1713562820086629


torvalds-dev-testbot[bot] commented 2 months ago

You are performing row-level deletion on a Hudi table with the delete write operation, and no rows are removed even though the job reports no errors. Here is some guidance to help you debug this:

Using these documentation snippets, you can review your deletion process and look for any discrepancy in your approach that prevents the rows from being deleted.
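
One thing that may be worth checking (a possible cause, not a confirmed diagnosis) is whether the delete write uses the same record key and partition path fields as the writes that created the table, since Hudi locates the records to delete by key. The sketch below shows the general shape of a delete write as described in the Hudi quick-start guide. The option keys are Hudi datasource write options; build_delete_options and delete_rows are hypothetical helper names, and rows_to_delete is assumed to be a Spark DataFrame holding the rows to remove.

```python
def build_delete_options(table_name, record_key, partition_path, precombine):
    """Return Hudi datasource write options for a row-level delete.

    The field names passed in should match those used when the table
    was originally written (an assumption about the reporter's setup).
    """
    return {
        "hoodie.table.name": table_name,
        "hoodie.datasource.write.operation": "delete",
        "hoodie.datasource.write.recordkey.field": record_key,
        "hoodie.datasource.write.partitionpath.field": partition_path,
        "hoodie.datasource.write.precombine.field": precombine,
    }


def delete_rows(rows_to_delete, base_path, options):
    """Issue the delete as an append-mode write of the rows to remove."""
    (rows_to_delete.write.format("hudi")
        .options(**options)
        .mode("append")  # append mode keeps the existing table in place
        .save(base_path))
```

For example, with purely illustrative names, delete_rows(df, "/tmp/my_table", build_delete_options("my_table", "id", "dt", "ts")) would request deletion of every row in df whose key matches a record in the table.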

torvalds-dev-testbot[bot] commented 2 months ago

Ticket created successfully. Here is the link to the GitHub issue: https://github.com/numberlabs-developers/hudi/issues/252