shlinkio / shlink

The definitive self-hosted URL shortener
https://shlink.io
MIT License

Unclean shutdown of shlink and redis sentinel caused shlink not to start #1701

Closed: onedr0p closed this issue 1 year ago

onedr0p commented 1 year ago

Hi again, your favorite shlink+redis sentinel user 👋🏼

How Shlink is set up

Env

DEFAULT_DOMAIN: &host ln.devbu.io
DISABLE_TRACKING_FROM: 10.0.0.0/8,172.16.0.0/12,192.168.0.0/16
ENABLE_PERIODIC_VISIT_LOCATE: "true"
IS_HTTPS_ENABLED: "true"
PORT: 80
REDIS_PUB_SUB_ENABLED: "true"
REDIS_SENTINEL_SERVICE: redis-master
REDIS_SERVERS: redis-node-0.redis-headless.default.svc.cluster.local:26379,redis-node-1.redis-headless.default.svc.cluster.local:26379,redis-node-2.redis-headless.default.svc.cluster.local:26379
TIMEZONE: America/New_York
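
For readers less familiar with this kind of setup: every entry in REDIS_SERVERS above uses port 26379, which is Sentinel's default port, and REDIS_SENTINEL_SERVICE names the master group that the Sentinels are asked to resolve. The Python sketch below only illustrates that resolution step. It is not how Shlink itself (a PHP application using predis) does it. The helper names parse_servers and master_client are hypothetical, redis-py is an assumed dependency that is imported only when a connection is requested, and scheme prefixes such as tcp:// are not handled.

```python
from typing import List, Tuple


def parse_servers(value: str, default_port: int = 26379) -> List[Tuple[str, int]]:
    """Split a comma-separated "host:port" list into (host, port) tuples.

    Hypothetical helper for illustration; entries without a port fall back
    to default_port, and empty entries are skipped.
    """
    servers = []
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        host, sep, port = item.rpartition(":")
        if not sep:
            # No ":" in the entry, so the whole entry is the host name.
            host, port = item, str(default_port)
        servers.append((host, int(port)))
    return servers


def master_client(servers_value: str, service_name: str):
    """Ask the Sentinels for the current master and return a client for it.

    Writes (such as creating a lock) have to go to the master; sending them
    to a replica is what produces a READONLY error.
    """
    from redis.sentinel import Sentinel  # assumed dependency: redis-py

    sentinel = Sentinel(parse_servers(servers_value), socket_timeout=0.5)
    return sentinel.master_for(service_name, socket_timeout=0.5)
```

With the values from the Env block, master_client(REDIS_SERVERS, "redis-master") would return a client bound to whichever node the Sentinels currently report as master.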

Summary

Shlink fails to start with the following error. This happened after an unclean "shutdown" of both Redis and Shlink:

❯ k logs shlink-api-75694d9cfb-tgnmw
Defaulted container "shlink-api" out of: shlink-api, init-db (init)
Creating fresh database if needed...
[2023-02-14T21:15:09.469209-05:00] [NULL] Shlink.NOTICE - Failed to acquire the "db:create" lock.

In Lock.php line 123:

  Failed to acquire the "db:create" lock.

In RedisStore.php line 255:

  READONLY You can't write against a read only replica. script: c481d49b4a48e
  48a9eeacd5d51655b5fcbb19925, on @user_script:3.

In Client.php line 354:

  READONLY You can't write against a read only replica. script: c481d49b4a48e
  48a9eeacd5d51655b5fcbb19925, on @user_script:3.

db:create

It looks like the crash caused shlink not to clean up the lock?

onedr0p commented 1 year ago

It is also worth noting that other applications using redis sentinel are working fine.

acelaya commented 1 year ago

Based on the error, it's most probably the same problem that caused https://github.com/shlinkio/shlink/issues/1684, just in a different place.

The library used to handle locks probably doesn't support predis 2.x either when Redis is used.

I will try to patch it like I did to solve that one, and see if I can contribute support for predis 2 to that component as well.

I'm not sure why it hasn't failed until now, though. I guess the unclean shutdown might have forced a very specific path.

acelaya commented 1 year ago

Hmm, it looks like the change I made should actually cover this use case as well, which makes me wonder if there's an actual bug in this specific library, but I need to investigate a bit further.

For now, I have tried to find out which key is used to set up the lock, in case deleting it helps unblock this. Try looking for db:create or Shlinkio\Shlink\CLI\Command\Db\CreateDatabaseCommand (yes, this key should be prefixed with Shlink:; I need to fix that).
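
A rough sketch of what "looking for the key" could involve, written in Python with redis-py-style calls (scan_iter and delete). The exact key name is not confirmed in this thread, so the two patterns below are just the candidates mentioned above, and find_lock_keys and delete_lock_keys are hypothetical helper names. The client passed in must be connected to the current master, since a replica rejects writes with the same READONLY error.

```python
from typing import Iterable, List

# Candidate substrings taken from the comment above; the real key name is unconfirmed.
CANDIDATE_PATTERNS = ("*db:create*", "*CreateDatabaseCommand*")


def find_lock_keys(client, patterns: Iterable[str] = CANDIDATE_PATTERNS) -> List[str]:
    """Return the names of keys matching any candidate pattern, without duplicates."""
    found: List[str] = []
    for pattern in patterns:
        # scan_iter walks the keyspace incrementally instead of blocking like KEYS.
        for key in client.scan_iter(match=pattern):
            name = key.decode() if isinstance(key, bytes) else key
            if name not in found:
                found.append(name)
    return found


def delete_lock_keys(client, keys: List[str]) -> int:
    """Delete the given keys and return how many were removed."""
    return client.delete(*keys) if keys else 0
```

Reviewing the output of find_lock_keys before calling delete_lock_keys is the safer order, because the wildcard patterns could match unrelated keys.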

onedr0p commented 1 year ago

Looks like it eventually self-healed overnight. /shrug

acelaya commented 1 year ago

I thought that could happen.

My theory is that the lock was created, and because of the unclean shutdown, it was never released.

The next Shlink instance found it, and that's what triggered the path where this bug happens.

Once the lock expired (I think that happens after 10 minutes), Shlink was able to start again.
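
The timeline in this theory can be sketched with a small in-memory model in Python. It is an illustration only: ExpiringLockStore and simulate_unclean_shutdown are hypothetical names, the 600-second TTL is the "10 minutes" guessed above rather than a confirmed value, and the model covers only the expiry behaviour, not the READONLY error itself.

```python
from typing import Dict, Tuple


class ExpiringLockStore:
    """In-memory stand-in for a lock store whose entries expire after a TTL."""

    def __init__(self) -> None:
        # lock name -> (owner, expiry timestamp in seconds)
        self._locks: Dict[str, Tuple[str, float]] = {}

    def acquire(self, name: str, owner: str, ttl: float, now: float) -> bool:
        current = self._locks.get(name)
        if current is not None and current[0] != owner and current[1] > now:
            return False  # held by another owner and not yet expired
        self._locks[name] = (owner, now + ttl)
        return True

    def release(self, name: str, owner: str) -> None:
        current = self._locks.get(name)
        if current is not None and current[0] == owner:
            del self._locks[name]


def simulate_unclean_shutdown(ttl: float = 600.0) -> Tuple[bool, bool, bool]:
    """Replay the theory: first instance dies holding the lock, second one retries."""
    store = ExpiringLockStore()
    first = store.acquire("db:create", "instance-a", ttl, now=0.0)
    # instance-a is killed here and never calls release().
    early_retry = store.acquire("db:create", "instance-b", ttl, now=60.0)
    late_retry = store.acquire("db:create", "instance-b", ttl, now=ttl + 1.0)
    return first, early_retry, late_retry
```

In this model the second instance is refused while the stale lock is still live and succeeds once the TTL has passed, which matches the "self-healed overnight" observation.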

I will deprioritize this a bit, as it seems less critical.

rehanone commented 1 year ago

Hi, thank you for your work on this cool project. Can you please provide an update on this issue? I am trying to deploy this app in an EKS cluster with an ElastiCache Redis instance and a PostgreSQL DB. Even after waiting for a while, I do not see any change and the deployment keeps failing. Any update on this would be very helpful.

acelaya commented 1 year ago

No progress so far. It needs to be solved here https://github.com/symfony/symfony/issues/49238, if you want to give it a go.

I started something, but didn't have time to continue: https://github.com/symfony/symfony/compare/6.3...acelaya-forks:symfony:feature/predis-2-support

onedr0p commented 1 year ago

I am going to close this issue, as the problem hasn't come up again since I opened it. I'll be sure to re-open with updated information if it happens in the future. Thanks for all you do <3