Open soumilshah1995 opened 4 months ago
Did you try disabling cleaning in one writer while enabling it only in the other? In a multi-writer setup, it's good to have all table services run with just one writer while all other writers only do ingestion. Can you try that?
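To make that suggestion concrete, here is a minimal sketch of how the two writers' options could differ. It assumes the jobs pass Hudi options as a Python dict, and the helper name `writer_options` is hypothetical. The shared settings are an assumption too, since the actual common configuration is not shown in this thread, and the lock provider settings are left out for the same reason.

```python
# Assumed shared multi-writer options; the thread's real common config is not shown.
# Optimistic concurrency control also needs a lock provider, which is omitted here
# because the original lock settings are unknown.
COMMON_OPTS = {
    "hoodie.write.concurrency.mode": "optimistic_concurrency_control",
    "hoodie.cleaner.policy.failed.writes": "LAZY",
}


def writer_options(runs_cleaner: bool) -> dict:
    """Return Hudi write options for one writer (hypothetical helper).

    Only the writer with runs_cleaner=True keeps automatic cleaning on;
    every other writer does ingestion only.
    """
    opts = dict(COMMON_OPTS)  # copy so the writers do not share one dict
    opts["hoodie.clean.automatic"] = "true" if runs_cleaner else "false"
    return opts


# u1 owns cleaning, u2 only ingests.
u1_opts = writer_options(runs_cleaner=True)
u2_opts = writer_options(runs_cleaner=False)
```

Each job would then pass its own dict to the writer, for example `df.write.format("hudi").options(**u1_opts).mode("append").save(path)`, together with the table name, record key, and lock settings it already uses.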
Hello,
We have been experimenting with a multi-writer setup and have confirmed that it works perfectly with two writers. The image below shows our sample setup:
To further enhance our setup, we wanted to test running the cleaner in parallel asynchronously. The first run of the cleaner was successful, but subsequent runs have been failing.
In our setup, we have two jobs: u1 and u2.
u1 touches partitions in NY. u2 touches partitions in CA. Both jobs have the following common configurations:
Note: we also tried setting "hoodie.clean.automatic" to both true and false.
We tested the setup both with and without the following flags:
Here is our cleaner async configuration:
The cleaner fails when running together with both u1 and u2 jobs.
Logs
u1.py
u2.py
Any insights or suggestions on resolving this issue would be greatly appreciated.