Open TvdW opened 7 years ago
Chatted a bit on IRC, my conclusion:
OK, knowing all this, I'd say the fix should be to split all ranges according to the tokens held by _other_ DCs: keep all the current logic, but for every determined range, do one more split. That'll solve at least the problem that made me file a ticket.
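To make the proposed extra split concrete, here is a minimal sketch. It is not code from range_repair.py: the name `split_range` and the `RING_SIZE` constant are hypothetical, and it assumes Murmur3Partitioner tokens (signed 64-bit integers). Given one already-determined range and the tokens owned by other DCs, it cuts the range at every foreign token that falls strictly inside it, including ranges that wrap around the ring.

```python
# Assumption: Murmur3Partitioner, whose tokens span -2**63 .. 2**63 - 1.
RING_SIZE = 2 ** 64


def split_range(start, end, foreign_tokens):
    """Split the ring range (start, end] at each foreign token inside it.

    Hypothetical helper, not part of range_repair.py. Returns a list of
    (start, end) pairs that together cover exactly the original range.
    """
    def offset(token):
        # Clockwise distance from `start` to `token` on the ring.
        return (token - start) % RING_SIZE

    # start == end is treated as the whole ring.
    span = offset(end) or RING_SIZE
    # Keep only tokens strictly between start and end, in ring order.
    inside = sorted(
        (t for t in set(foreign_tokens) if 0 < offset(t) < span),
        key=offset,
    )
    bounds = [start] + inside + [end]
    return list(zip(bounds, bounds[1:]))


if __name__ == "__main__":
    # Token 50 lies outside (10, 40], so only 20 and 30 cause a split.
    print(split_range(10, 40, [20, 30, 50]))
```

Each resulting pair could then be handed to the existing per-range logic unchanged, which is the "keep all the current logic" part of the suggestion.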
Multi-DC topology with RF3, DC1 (3 nodes) - DC2 (3 nodes). @TvdW I have a similar error. It is not clear what this means.
Thanks for figuring this out. I'll look it over in a bit.
On Tue, Feb 27, 2018 at 1:33 AM x0x01 wrote:
Multi-DC topology with RF3, DC1 (3 nodes) - DC2 (3 nodes). @TvdW I have a similar error. It is not clear what this means.
I see the same errors with a multi-DC cluster (4 + 4 nodes) when the DC name is passed as a parameter. Repairs complete fine without the DC name. (C* 2.2.12, using vnodes)
Hi there,
We are trying to implement your script in our internal scheduling tool to run repairs, and to use it for our future upgrade to Cassandra 3.11.2.
I'm getting the same "imprecise repair" errors with the --datacenter option, whereas I'm not getting any errors when it's not specified.
range_repair.py -v -s 10 -D b -k XX -c XX
INFO 2018-11-07 16:00:55,901 get_local_nodes line: 123 : Local nodes: X
INFO 2018-11-07 16:01:00,172 get_ring_tokens line: 166 : Found 1536 tokens
INFO 2018-11-07 16:01:00,181 repair line: 578 : [1/256] repairing range (+xxxx, -xxxxxx) in 10 steps for keyspace X
WARNING 2018-11-07 16:01:01,361 call line: 62 : Execution failed.
WARNING 2018-11-07 16:01:01,362 call line: 72 : Giving up execution. Failed too many times.
ERROR 2018-11-07 16:01:01,362 _repair_range line: 507 : FAILED: 1/256 step 0001 nodetool -h nodeX -p 7199 repair KS CF -pr -full -st +xxxx -et +xxxxx
ERROR 2018-11-07 16:01:01,362 _repair_range line: 508 : error: Repair job has failed with the error message: Repair command #204801 failed with error Requested range (xxxxxxx] intersects a local range but is not fully contained in one; this would lead to imprecise repair. keyspace: xxxxxx
Have you looked into it since this issue was opened?
If so, should we use the --datacenter option when repairing a node within its local datacenter, or not?
Thanks in advance for your answer, Florian.
The command I used:
python range_repair.py -H 127.0.0.1 -s 1 --datacenter DC2
system_auth is:
CREATE KEYSPACE system_auth WITH replication = {'class': 'NetworkTopologyStrategy', 'DC1': '8', 'DC2': '8'} AND durable_writes = true;
The two tokens (9099366847329376090, 9124888514323768492) are also the ones used by range-repair. Those tokens are in DC2, but there's another DC1 token that sits in the middle, 9108060243154565075. When I trigger two individual nodetool repair commands ((9099366847329376090 9108060243154565075] and (9108060243154565075 9124888514323768492]) for them, it works fine. Only when the two ranges are merged does it fail. Ironically, not passing --datacenter to the script allows repairs to complete.
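A small sketch of how one might check a ring for this situation. It assumes, as the behaviour above suggests but the source does not confirm, that with --datacenter each range is built from two consecutive tokens of the selected DC. The function name `straddled_ranges` and the `(token, dc)` input shape are hypothetical, and wrap-around at the end of the ring is ignored for brevity.

```python
def straddled_ranges(ring, datacenter):
    """Find ranges between consecutive `datacenter` tokens that contain
    tokens owned by another DC.

    ring: iterable of (token, dc_name) pairs for the whole cluster
          (hypothetical input shape, e.g. parsed from `nodetool ring`).
    Returns a list of (start, end, foreign_tokens) tuples.
    """
    ordered = sorted(ring)
    # Positions of the selected DC's tokens within the sorted ring.
    local = [i for i, (_, dc) in enumerate(ordered) if dc == datacenter]
    problems = []
    for a, b in zip(local, local[1:]):
        # Any entry strictly between two local tokens belongs to another DC.
        between = [token for token, _ in ordered[a + 1:b]]
        if between:
            problems.append((ordered[a][0], ordered[b][0], between))
    return problems


if __name__ == "__main__":
    # The three tokens quoted in this issue.
    ring = [
        (9099366847329376090, "DC2"),
        (9108060243154565075, "DC1"),
        (9124888514323768492, "DC2"),
    ]
    for start, end, foreign in straddled_ranges(ring, "DC2"):
        print(f"({start} {end}] straddles foreign tokens {foreign}")
```

For the tokens above this reports the single merged range (9099366847329376090 9124888514323768492] with the DC1 token 9108060243154565075 inside it, which matches the range that fails when repaired in one piece.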