nats-io / nats-server

High-Performance server for NATS.io, the cloud and edge native messaging system.
https://nats.io
Apache License 2.0

Stream Lost Quorum triggered for streams in deleted account as well #3306

Closed sourabhaggrawal closed 1 year ago

sourabhaggrawal commented 2 years ago

Defect

If an account is deleted without first deleting its streams, then after a peer node is restarted another node starts complaining about 'No Quorum' for the streams that belonged to the deleted account, and it never recovers.

Versions of nats-server and affected client libraries used:

2.8.4

OS/Container environment:

ubuntu-20.04

Steps or code to reproduce the issue:

In a 3-node cluster:

  1. Create one JetStream-enabled account.
  2. Create one stream under the new account and publish a couple of messages to it.
  3. Delete the account without deleting its streams.
  4. Restart any node.

Expected result:

The node should restart cleanly and should not log any errors related to the deleted account or its streams.

Actual result:

Another node starts logging "NO quorum, stalled" for the new stream we just created. The warnings do not stop until we restart the node that is complaining.

All accounts mentioned in the logs are deleted accounts.

[24314] 2022/07/29 10:01:37.935503 [WRN] Account fetch failed: open /var/lib/nats/superadmin/jwt/ACGREIUVT2PT4ERQAFNQFY6SJHMCWH77E3LQKCCCUZF2F7B47ZXBKTH6.jwt: no such file or directory
[24314] 2022/07/29 10:01:38.903683 [WRN] Account fetch failed: open /var/lib/nats/superadmin/jwt/ACPU7YPBAVITL5RAZNURPROSASX4N26PZYVOR6ODSQMNFR6MBUI2PPXJ.jwt: no such file or directory
[24314] 2022/07/29 10:01:38.907527 [WRN] Account fetch failed: open /var/lib/nats/superadmin/jwt/AC6P4P3CINXZ4MFTQSWXGCJI53BHGZLYMNQFU4SHY2ANOBY2JUXZFRIO.jwt: no such file or directory
[24314] 2022/07/29 10:01:40.818752 [WRN] JetStream cluster stream 'ACN5OEYPSG7VDT2HWUUHVEZIKPDPMKKO4VQQBKIIC2BPKUEYKY6TEXQ5 > 5e83a915-c06a-4dd4-9b2a-07e4da85b7ad_stream' has NO quorum, stalled
[24314] 2022/07/29 10:01:41.468081 [WRN] Account fetch failed: open /var/lib/nats/superadmin/jwt/AD4DZWATLP6LAHUIO4LTYBOSMH6MCUNXMIYZJDDZJ7T4RBGJ4CFVF7HZ.jwt: no such file or directory
[24314] 2022/07/29 10:01:41.564099 [WRN] JetStream cluster stream 'AC6P4P3CINXZ4MFTQSWXGCJI53BHGZLYMNQFU4SHY2ANOBY2JUXZFRIO > db68f9c1-3e11-48ee-b75c-e9f4fd94a1e0_stream' has NO quorum, stalled
[24314] 2022/07/29 10:01:41.834412 [WRN] JetStream cluster stream 'ACBPFKJWLJJYLV3DOJYRZXTNV43HJHTXQY2SQ5ZTI4XDDQ3KJY73CSXE > 66697cca-b07d-47ad-b918-5059121667cf_stream' has NO quorum, stalled
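
For anyone triaging a similar log, here is a rough sketch of how the stalled-stream warnings could be grouped by account ID, so it is easier to check which accounts have already been deleted. The regular expression is based only on the warning format seen in the excerpt above; the function name `stalled_streams` and the `SAMPLE_LOG` variable are made up for illustration and are not part of nats-server or any NATS tooling.

```python
import re

# Pattern derived from the warning lines in this issue, e.g.
#   JetStream cluster stream '<ACCOUNT_ID> > <STREAM_NAME>' has NO quorum, stalled
# Other server versions may word the message differently (assumption).
STALLED = re.compile(
    r"JetStream cluster stream '(?P<account>\S+) > (?P<stream>[^']+)' has NO quorum"
)


def stalled_streams(log_text):
    """Return {account_id: sorted stream names} for every stalled-stream warning."""
    found = {}
    for match in STALLED.finditer(log_text):
        found.setdefault(match.group("account"), set()).add(match.group("stream"))
    return {account: sorted(streams) for account, streams in found.items()}


# Three lines copied from the log excerpt in this issue.
SAMPLE_LOG = (
    "[24314] 2022/07/29 10:01:38.907527 [WRN] Account fetch failed: open "
    "/var/lib/nats/superadmin/jwt/AC6P4P3CINXZ4MFTQSWXGCJI53BHGZLYMNQFU4SHY2ANOBY2JUXZFRIO.jwt: "
    "no such file or directory\n"
    "[24314] 2022/07/29 10:01:40.818752 [WRN] JetStream cluster stream "
    "'ACN5OEYPSG7VDT2HWUUHVEZIKPDPMKKO4VQQBKIIC2BPKUEYKY6TEXQ5 > "
    "5e83a915-c06a-4dd4-9b2a-07e4da85b7ad_stream' has NO quorum, stalled\n"
    "[24314] 2022/07/29 10:01:41.564099 [WRN] JetStream cluster stream "
    "'AC6P4P3CINXZ4MFTQSWXGCJI53BHGZLYMNQFU4SHY2ANOBY2JUXZFRIO > "
    "db68f9c1-3e11-48ee-b75c-e9f4fd94a1e0_stream' has NO quorum, stalled\n"
)

if __name__ == "__main__":
    for account, streams in stalled_streams(SAMPLE_LOG).items():
        print(account, streams)
```

The "Account fetch failed" lines are ignored on purpose; only the quorum warnings carry a stream name. Because the pattern does not depend on line breaks, it also works on logs where entries have been joined onto a single line.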

caleblloyd commented 1 year ago

The fix is to send a request to the $JS.API.ACCOUNT.PURGE.<AID> endpoint, where <AID> is the account ID, so that the account's JetStream data is deleted when the account is deleted.
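
As a small illustration, the request subject for a given account ID could be built like this. Only the `$JS.API.ACCOUNT.PURGE.<AID>` pattern comes from the comment above; the helper name `purge_subject` and the input checks are assumptions added for the sketch. Actually sending the request is left out, since it depends on the client library and on credentials that are permitted to call this API.

```python
def purge_subject(account_id: str) -> str:
    """Build the account purge subject for one account ID (hypothetical helper)."""
    account_id = account_id.strip()
    # An account ID is a single subject token, so reject empty values and
    # characters that would change the subject: whitespace, '.', '*' and '>'.
    if not account_id or any(ch.isspace() or ch in ".*>" for ch in account_id):
        raise ValueError(f"not a usable account ID: {account_id!r}")
    return f"$JS.API.ACCOUNT.PURGE.{account_id}"


if __name__ == "__main__":
    # Account ID taken from the log excerpt in this issue.
    print(purge_subject("ACN5OEYPSG7VDT2HWUUHVEZIKPDPMKKO4VQQBKIIC2BPKUEYKY6TEXQ5"))
```

The resulting string is what a client would use as the request subject, once per deleted account.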

Reference #3319