Closed alkalinecoffee closed 3 years ago
Hi @alkalinecoffee !
I just tried to repro this issue on my end and after the upgrade of the first DC, the Consul Replicate process running in DC2 was still operational. Here is how I set up my repro:
020/07/21 15:24:37.393702 [DEBUG] (runner) skipping because "apps/monitor/build/healthcheck-path" is already replicated
errors that you mentioned earlier. I also checked the Consul UI on one of the DC2 nodes and saw that the folder I created for the KVs was still there and intact. I even created more KVs in DC1 and saw them being replicated in DC2.
I'll close this for now since it's been a while since the last response and I've not been able to repro this behavior. But do feel free to drop a comment if you are still seeing this behavior and I can reopen & look into it further.
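For anyone who wants to double-check replication beyond eyeballing the UI, here is a rough sketch that compares two KV dumps, one from the source DC and one from the destination DC. It assumes the dumps were produced with consul kv export and that the export is a JSON array of objects with "key", "flags", and a base64-encoded "value" (please verify against your Consul version). The function names load_export and diff_prefix are made up for illustration and are not part of any Consul tooling.

```python
import base64
import json


def load_export(text):
    """Parse a KV export (assumed JSON array with base64 values) into {key: bytes}."""
    entries = json.loads(text)
    return {e["key"]: base64.b64decode(e.get("value") or "") for e in entries}


def diff_prefix(source, dest, prefix):
    """Compare keys under `prefix`; return (missing, empty, mismatched) key lists.

    missing:    present in the source DC but absent in the destination DC
    empty:      present in both, but the destination value is empty
    mismatched: present in both with different non-empty values
    """
    missing, empty, mismatched = [], [], []
    for key in sorted(source):
        if not key.startswith(prefix):
            continue
        if key not in dest:
            missing.append(key)
        elif source[key] and not dest[key]:
            empty.append(key)
        elif source[key] != dest[key]:
            mismatched.append(key)
    return missing, empty, mismatched


if __name__ == "__main__":
    # Hypothetical sample data; real input would be the export files from each DC.
    value = base64.b64encode(b"/health").decode()
    dc1 = load_export(json.dumps(
        [{"key": "apps/monitor/build/healthcheck-path", "flags": 0, "value": value}]
    ))
    dc2 = load_export(json.dumps([]))
    missing, empty, mismatched = diff_prefix(dc1, dc2, "apps/")
    print("missing:", missing)
    print("empty:", empty)
    print("mismatched:", mismatched)
```

A non-empty "missing" or "empty" list while Consul Replicate keeps logging "already replicated" would point at the state described in this issue.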
Overview of the Issue
We have three datacenters running 1.7.2:
We were hoping to upgrade to 1.7.3 to avoid the bug described at https://github.com/hashicorp/consul/issues/7396.
We use consul-replicate v0.4.0 (886abcc) to replicate a subset of keys from us-west-2 into the other datacenters (i.e. consul-replicate -prefix "apps@us-west-2").
us-east-1 running 1.7.3 and joined them to the existing 3-node 1.7.2 cluster
consul-template began to fail with invalid configurations (null/missing KVs, etc.)
1.7.2
Upon investigation, we noticed that the /apps folder in us-east-1 no longer appeared in the UI, yet consul-replicate was logging out the following lines to syslog:
To clear this odd state out, we tried deleting the key path in us-east-1:
We restarted consul-replicate again, but the same "already replicated" messages appeared in the logs. We ended up re-importing the KVs from a backup file, which got us back to a healthy state again.
Key Takeaways
1.7.3 cluster connected to the 1.7.2 cluster
consul-replicate appeared to be missing in the UI -- all other DC-local KVs remained untouched
consul-replicate does not perform any deletes -- only inserts/updates
1.7.3 stack and fully reverting back to 1.7.2, we continued to have problems with missing KVs
consul-replicate showed that they existed (even through multiple restarts)
consul kv delete /apps and restarting consul-replicate failed to trigger a replication
consul-replicate worked as expected once the missing KVs were imported
consul-template to block when the desired key does not exist, which should have prevented this
consul-template saw that keys somehow existed with null/empty values and rendered them out anyway
Open Questions
-force option for consul-replicate to trigger a write no matter what?
Operating system and Environment details
Amazon Linux 1, EC2