pykello opened this issue 6 years ago (status: Open)
Shard resource locks need to ensure that updates on a child table and updates on a parent table are serialised.
If a child placement becomes inactive, the parent should also become inactive and vice versa.
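To make the invariant concrete, placement health can be inspected through the Citus metadata tables `pg_dist_shard` and `pg_dist_shard_placement` (the table name `events` here is hypothetical; shardstate 1 means active, 3 means inactive). The parent's placements and each partition's placements should report consistent states:

```sql
-- Check placement states for a hypothetical partitioned distributed
-- table 'events' and its partitions. shardstate: 1 = active, 3 = inactive.
SELECT s.logicalrelid, p.shardid, p.shardstate, p.nodename, p.nodeport
FROM pg_dist_shard s
JOIN pg_dist_shard_placement p USING (shardid)
WHERE s.logicalrelid::text LIKE 'events%'
ORDER BY p.shardid;
```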
@onderkalaci / @marcocitus -- How long would it take to fix this issue?
I'm also noting #1459 as a related issue.
I just chatted with @marcocitus on this. We expect this fix to take about 3-4 weeks of time.
I wanted to share a quick update on this issue.
@marcocitus looked into this in more detail. He noted that if we disallowed direct writes to the partitions (and only allowed writes through the parent table), this fix would take only 1-2 weeks.
We need to check if this requirement works for our customers.
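A sketch of what the proposed restriction would look like in practice (the table and partition names below are made up for illustration):

```sql
-- Hypothetical partitioned table, distributed by 'key'.
CREATE TABLE events (ts timestamptz, key int, value int)
  PARTITION BY RANGE (ts);
CREATE TABLE events_2018_07 PARTITION OF events
  FOR VALUES FROM ('2018-07-01') TO ('2018-08-01');
SELECT create_distributed_table('events', 'key');

-- Allowed: routed through the parent, so parent and partition
-- placements are updated together.
INSERT INTO events VALUES (now(), 1, 10);

-- Disallowed under the proposal: writing to a partition directly.
INSERT INTO events_2018_07 VALUES (now(), 1, 10);  -- would error
```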
We utilize the `run_command_on_colocated_placements` function for colocated rollups. There are reasons that `INSERT INTO ... SELECT ...` does not work for our use case, including but not limited to `ON CONFLICT` being unsupported by PostgreSQL 10 for partitioned tables, and `ORDER BY` use in aggregate functions (similar to a window function) not being supported by Citus.
If direct writes to the placements are still allowed, then this will be fine. We have no desire to write directly to partitions (neither on the coordinator nor to the partitions of each placement on the workers).
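For reference, the rollup pattern described here looks roughly like the following. The table names are hypothetical, and I'm assuming the command's placeholders are substituted with the two colocated shard names via PostgreSQL's `format()` (so positional `%1$s`/`%2$s` refer to the shard of the first and second table respectively), as with `run_command_on_shards`:

```sql
-- Upsert per-key counts from each raw-events shard into its colocated
-- rollup shard. %1$s -> events_raw shard, %2$s -> events_rollup shard.
SELECT run_command_on_colocated_placements(
  'events_raw',
  'events_rollup',
  $cmd$
    INSERT INTO %2$s (key, total)
    SELECT key, count(*) FROM %1$s GROUP BY key
    ON CONFLICT (key) DO UPDATE SET total = EXCLUDED.total
  $cmd$);
```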
Ran into this with 2 customers who requested this. @ozgune : We can chat in more detail about the context in person.
We ran into a similar issue where we are not able to shard a partitioned table with a replication factor > 1.
xxxxxx=# select create_distributed_table('table_name', 'column_name');
ERROR: distributing partitioned tables with replication factor greater than 1 is not supported
But after changing the replication factor to 1, I am able to shard the table. It would be great if we could shard partitioned tables with a replication factor > 1, as we are looking for HA.
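For anyone hitting the same error, the workaround described above (at the cost of the HA this issue asks for) is to drop the replication factor to 1 before distributing; the relevant setting is the `citus.shard_replication_factor` GUC:

```sql
SET citus.shard_replication_factor TO 1;
SELECT create_distributed_table('table_name', 'column_name');  -- succeeds
```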
@hlakshmi : Would allowing writes to the parent table (and not to specific partitions) suffice for your use case? The reason I ask is that it's an easier fix, so we could get it to you sooner.
What is a good ETA on this feature, assuming the easy path is taken (no direct writes to partitions at the coordinator level, and assuming commands like `run_command_on_placements` still function)?
Edit: I'm noting that the original estimate was about 1-2 weeks, many weeks back. I assume this still has not been started, but it's very much desired :)
@samay-sharma Do you have an ETA for this feature? Has it been started? Would you need assistance in implementing it? I’d like to bump the urgency of this. Thanks!
We're currently working on this. So, the ETA is probably whenever 8.0 is out, which shouldn't be too far away. But, obviously, I cannot give an exact date for this since it involves many other things as well.
So, some notes and questions on the requirements:

- `ALTER TABLE parent DETACH PARTITION partition` followed by `DROP TABLE partition`
- We may not be able to quickly implement `replicate_table_shards()`. @jnicholls how important is that for you? Do you use that rebalancer functionality?
- `run_command_on_colocated_placements()` is still allowed

@onderkalaci : Could you elaborate a bit on the "shard repair functionality would only work on the parent table"? Would all failed inserts mark the parent table as invalid, and if so, do we just need to repair it? Are there cases where the child tables may be marked as invalid and the parent table is not, even though we allow writes only on the parent tables? If so, how do we address that case?
> Are there cases where the child tables may be marked as invalid and the parent table is not, even though we allow writes only on the parent tables.

No, the reason we're disallowing writes to partitions is largely to prevent that (and some locking/ordering issues).

What Önder means is that 8.0 is also not going to be smart enough to repair only a single partition after a failure. It will always repair all of them.
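A sketch of the repair step being discussed, using `master_copy_shard_placement` to copy a healthy placement over an inactive one (the shard ID and node names below are made up); per the above, in 8.0 repairing a parent shard repairs all of its partitions' shards together:

```sql
SELECT master_copy_shard_placement(
  102008,                  -- shardid of an inactive placement (hypothetical)
  'healthy-node', 5432,    -- source: node holding an active copy
  'failed-node', 5432);    -- target: node holding the inactive copy
```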
We may not be able to quickly implement the replicate_table_shards(). @jnicholls how important is that for you? Do you use that rebalancer functionality?
@onderkalaci Yes, I do use this functionality. But repair is more important than rebalancing in the short term. If you must release that `shard_rebalancer` functionality in 8.1 or a subsequent patch build, for me specifically that is fine :)
My obligatory check-in. My company is going through a 6-month stability and reliability process, and this is one of the primary issues we have with Citus today that must be addressed. Some folks are pushing to move away from Citus completely, which I am fighting because it is not a good idea, but it is blemishes like this that fuel their argument.
Just FYI, we've implemented this feature. But given that we have some limitations (as discussed above), I preferred to keep the issue open.
Once we package Citus 8.0 (which should be very close), you should be able to use it.
Great, thank you!
Adding this discussion as a reference: https://github.com/citusdata/citus/pull/5381#discussion_r733638858
This came up in a customer conversation. What is required to implement this?
cc: @onderkalaci