superfly / litefs-example

An example of deploying LiteFS on Fly.io.
Apache License 2.0

Cannot switch from static to consul: lease already initialized with different ID #11

Open andig opened 4 months ago

andig commented 4 months ago

I'm new to LiteFS and following the tutorial. I got docker-compose up and running and am now trying to deploy on Fly.io. I initially chose a static lease instead of a Consul lease (https://github.com/superfly/litefs-example/issues/10). After experimenting with the static lease, I'm now trying Consul and have run

fly consul attach

successfully. After

fly app restart

I'm seeing this:

2024-02-24T10:43:41Z app[7842043c462768] ams [info] INFO Starting init (commit: 913ad9c)...
2024-02-24T10:43:41Z app[7842043c462768] ams [info] INFO Mounting /dev/vdb at /var/lib/litefs w/ uid: 0, gid: 0 and chmod 0755
2024-02-24T10:43:41Z app[7842043c462768] ams [info] INFO Resized /var/lib/litefs to 1056964608 bytes
2024-02-24T10:43:41Z app[7842043c462768] ams [info] INFO Preparing to run: `/bin/sh -c litefs mount` as root
2024-02-24T10:43:41Z app[7842043c462768] ams [info] INFO [fly api proxy] listening at /.fly/api
2024-02-24T10:43:41Z app[7842043c462768] ams [info]2024/02/24 10:43:41 listening on [fdaa:0:30b1:a7b:242:c65e:fbdc:2]:22 (DNS: [fdaa::3]:53)
2024-02-24T10:43:41Z app[7842043c462768] ams [info]config file read from /etc/litefs.yml
2024-02-24T10:43:41Z app[7842043c462768] ams [info]LiteFS v0.5.11, commit=63eab529dc3353e8d159e097ffc4caa7badb8cb3
2024-02-24T10:43:41Z app[7842043c462768] ams [info]level=INFO msg="host environment detected" type=fly.io
2024-02-24T10:43:41Z app[7842043c462768] ams [info]level=INFO msg="no backup client configured, skipping"
2024-02-24T10:43:41Z app[7842043c462768] ams [info]level=INFO msg="Using Consul to determine primary"
2024-02-24T10:43:41Z runner[7842043c462768] ams [info]Machine started in 433ms
2024-02-24T10:43:41Z app[7842043c462768] ams [info]level=INFO msg="initializing consul: key=litefs/litefs-example-2 url=https://:5904af93-72c2-71c6-72bb-ffd4a8880f51@consul-fra-5.fly-shared.net/litefs-example-2-4nml16wm6xyqvp2y/ hostname=7842043c462768 advertise-url=http://7842043c462768.vm.litefs-example-2.internal:20202"
2024-02-24T10:43:41Z app[7842043c462768] ams [info]level=INFO msg="wal-sync: no wal file exists on \"db\", skipping sync with ltx"
2024-02-24T10:43:41Z app[7842043c462768] ams [info]level=INFO msg="using existing cluster id: \"LFSC3F35D86D59BC8183\""
2024-02-24T10:43:41Z app[7842043c462768] ams [info]level=INFO msg="LiteFS mounted to: /litefs"
2024-02-24T10:43:41Z app[7842043c462768] ams [info]level=INFO msg="http server listening on: http://localhost:20202"
2024-02-24T10:43:41Z app[7842043c462768] ams [info]level=INFO msg="waiting to connect to cluster"
2024-02-24T10:43:41Z app[e784966ce65358] ams [info] INFO Sending signal SIGINT to main child process w/ PID 313
2024-02-24T10:43:41Z app[7842043c462768] ams [info]level=INFO msg="06AC78E6CF6E448E: primary lease acquired, advertising as http://7842043c462768.vm.litefs-example-2.internal:20202"
2024-02-24T10:43:41Z app[7842043c462768] ams [info]level=INFO msg="connected to cluster, ready"
2024-02-24T10:43:41Z app[7842043c462768] ams [info]level=INFO msg="node is a candidate, automatically promoting to primary"
2024-02-24T10:43:41Z app[7842043c462768] ams [info]level=INFO msg="node is already primary, skipping promotion"
2024-02-24T10:43:41Z app[7842043c462768] ams [info]level=INFO msg="proxy server listening on: http://localhost:8080"
2024-02-24T10:43:41Z app[7842043c462768] ams [info]level=INFO msg="starting background subprocess: litefs-example [-addr :8081 -dsn /litefs/db]"
2024-02-24T10:43:41Z app[7842043c462768] ams [info]waiting for signal or subprocess to exit
2024-02-24T10:43:41Z app[7842043c462768] ams [info]database opened at /litefs/db
2024-02-24T10:43:41Z app[7842043c462768] ams [info]http server listening on :8081

2024-02-24T10:43:53Z app[e784966ce65358] ams [info] INFO Starting init (commit: 913ad9c)...
2024-02-24T10:43:53Z app[e784966ce65358] ams [info] INFO Mounting /dev/vdb at /var/lib/litefs w/ uid: 0, gid: 0 and chmod 0755
2024-02-24T10:43:53Z app[e784966ce65358] ams [info] INFO Resized /var/lib/litefs to 1069547520 bytes
2024-02-24T10:43:53Z app[e784966ce65358] ams [info] INFO Preparing to run: `/bin/sh -c litefs mount` as root
2024-02-24T10:43:53Z app[e784966ce65358] ams [info] INFO [fly api proxy] listening at /.fly/api
2024-02-24T10:43:53Z app[e784966ce65358] ams [info]2024/02/24 10:43:53 listening on [fdaa:0:30b1:a7b:c6ef:20ca:30af:2]:22 (DNS: [fdaa::3]:53)
2024-02-24T10:43:53Z app[e784966ce65358] ams [info]config file read from /etc/litefs.yml
2024-02-24T10:43:53Z app[e784966ce65358] ams [info]LiteFS v0.5.11, commit=63eab529dc3353e8d159e097ffc4caa7badb8cb3
2024-02-24T10:43:53Z app[e784966ce65358] ams [info]level=INFO msg="host environment detected" type=fly.io
2024-02-24T10:43:53Z app[e784966ce65358] ams [info]level=INFO msg="no backup client configured, skipping"
2024-02-24T10:43:53Z app[e784966ce65358] ams [info]level=INFO msg="Using Consul to determine primary"
2024-02-24T10:43:53Z runner[e784966ce65358] ams [info]Machine started in 592ms
2024-02-24T10:43:53Z app[e784966ce65358] ams [info]level=INFO msg="initializing consul: key=litefs/litefs-example-2 url=https://:5904af93-72c2-71c6-72bb-ffd4a8880f51@consul-fra-5.fly-shared.net/litefs-example-2-4nml16wm6xyqvp2y/ hostname=e784966ce65358 advertise-url=http://e784966ce65358.vm.litefs-example-2.internal:20202"
2024-02-24T10:43:53Z app[e784966ce65358] ams [info]level=INFO msg="wal-sync: no wal file exists on \"db\", skipping sync with ltx"
2024-02-24T10:43:53Z app[e784966ce65358] ams [info]level=INFO msg="using existing cluster id: \"LFSC85E68DD3249EC869\""
2024-02-24T10:43:53Z app[e784966ce65358] ams [info]level=INFO msg="LiteFS mounted to: /litefs"
2024-02-24T10:43:53Z app[e784966ce65358] ams [info]level=INFO msg="http server listening on: http://localhost:20202"
2024-02-24T10:43:53Z app[e784966ce65358] ams [info]level=INFO msg="waiting to connect to cluster"
2024-02-24T10:43:53Z app[e784966ce65358] ams [info]level=INFO msg="cannot connect, \"consul\" lease already initialized with different ID: LFSC3F35D86D59BC8183"
2024-02-24T10:43:54Z app[e784966ce65358] ams [info]level=INFO msg="cannot connect, \"consul\" lease already initialized with different ID: LFSC3F35D86D59BC8183"
2024-02-24T10:43:55Z app[e784966ce65358] ams [info]level=INFO msg="cannot connect, \"consul\" lease already initialized with different ID: LFSC3F35D86D59BC8183"

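The machine that was created first, with the static lease, is not able to join the Consul cluster. Apparently it has stored its old cluster ID somewhere (LFSC85E68DD3249EC869 in the log above) and is still trying to use it, while the Consul lease was initialized with LFSC3F35D86D59BC8183?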
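To make the mismatch easier to spot in longer logs, here is a minimal Python sketch that compares the cluster ID each machine reports at startup with the ID named in the "lease already initialized" message. The function name `find_cluster_id_conflicts` and the regular expressions are my own assumptions based only on the log lines quoted in this issue; other LiteFS versions may format these messages differently.

```python
import re

# Assumption: these patterns mirror the log lines quoted in this issue
# (LiteFS v0.5.11 on Fly.io); they are not an official log format.
MACHINE_RE = re.compile(r"app\[([0-9a-f]+)\]")
LOCAL_ID_RE = re.compile(r'using existing cluster id: \\?"(LFSC[0-9A-F]+)')
LEASE_ID_RE = re.compile(
    r"lease already initialized with different ID: (LFSC[0-9A-F]+)"
)


def find_cluster_id_conflicts(log_text):
    """Return {machine_id: (local_cluster_id, lease_cluster_id)} for every
    machine whose local cluster ID differs from the ID reported by the lease."""
    local_ids = {}
    lease_ids = {}
    for line in log_text.splitlines():
        machine_match = MACHINE_RE.search(line)
        if not machine_match:
            continue  # not an app log line
        machine = machine_match.group(1)
        local = LOCAL_ID_RE.search(line)
        if local:
            local_ids[machine] = local.group(1)
        lease = LEASE_ID_RE.search(line)
        if lease:
            lease_ids[machine] = lease.group(1)
    return {
        machine: (local_ids.get(machine), lease_id)
        for machine, lease_id in lease_ids.items()
        if local_ids.get(machine) != lease_id
    }


# Three lines copied from the logs in this issue.
SAMPLE = r'''
2024-02-24T10:43:41Z app[7842043c462768] ams [info]level=INFO msg="using existing cluster id: \"LFSC3F35D86D59BC8183\""
2024-02-24T10:43:53Z app[e784966ce65358] ams [info]level=INFO msg="using existing cluster id: \"LFSC85E68DD3249EC869\""
2024-02-24T10:43:53Z app[e784966ce65358] ams [info]level=INFO msg="cannot connect, \"consul\" lease already initialized with different ID: LFSC3F35D86D59BC8183"
'''

if __name__ == "__main__":
    for machine, (local_id, lease_id) in find_cluster_id_conflicts(SAMPLE).items():
        print(f"{machine}: local={local_id} lease={lease_id}")
```

On the sample lines, only machine e784966ce65358 is flagged: it starts with LFSC85E68DD3249EC869, while the Consul lease was initialized by the other machine with LFSC3F35D86D59BC8183. That suggests each volume kept its own cluster ID from the static-lease setup, though the sketch only reads the logs and does not confirm where LiteFS stores the ID.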

harveysanders commented 1 month ago

@andig I just had a similar issue and found this page in the LiteFS disaster recovery docs helpful: https://fly.io/docs/litefs/disaster-recovery/#cleaner-option-remove-the-wrong-key-from-consul