hashicorp / vault

A tool for secrets management, encryption as a service, and privileged access management
https://www.vaultproject.io/

MySQL: HashiCorp Vault in HA leaving multiple sleep connections with MySQL #10318

Open sksingh20 opened 3 years ago

sksingh20 commented 3 years ago

Setup: Two Vault nodes configured in HA with a single MySQL storage backend, using auto-unseal with the transit method.

The Vault HA standby node makes multiple connections to the MySQL storage backend and leaves them in sleep mode. This consumes all available connections, and MySQL throws error 1040 once max_connections is reached.

MariaDB [mysql]> show processlist;
+------+-------------+---------------------+-------+---------+------+--------------------------+------------------+----------+
| Id | User | Host | db | Command | Time | State | Info | Progress |
+------+-------------+---------------------+-------+---------+------+--------------------------+------------------+----------+
| 1 | system user | | NULL | Daemon | NULL | InnoDB purge coordinator | NULL | 0.000 |
| 2 | system user | | NULL | Daemon | NULL | InnoDB purge worker | NULL | 0.000 |
| 3 | system user | | NULL | Daemon | NULL | InnoDB purge worker | NULL | 0.000 |
| 4 | system user | | NULL | Daemon | NULL | InnoDB purge worker | NULL | 0.000 |
| 5 | system user | | NULL | Daemon | NULL | InnoDB shutdown handler | NULL | 0.000 |
| 6324 | root | localhost | mysql | Query | 0 | Init | show processlist | 0.000 |
| 6833 | uvault | 192.168.0.213:45720 | NULL | Sleep | 0 | | NULL | 0.000 |
| 6834 | uvault | 192.168.0.213:45730 | NULL | Sleep | 385 | | NULL | 0.000 |
| 6835 | uvault | 192.168.0.213:45732 | NULL | Sleep | 385 | | NULL | 0.000 |
| 6838 | uvault | 192.168.0.213:45736 | NULL | Sleep | 5 | | NULL | 0.000 |
| 6852 | uvault | serlog:56304 | NULL | Sleep | 0 | | NULL | 0.000 |
| 6853 | uvault | serlog:56306 | NULL | Sleep | 0 | | NULL | 0.000 |
| 6854 | uvault | serlog:56316 | NULL | Sleep | 370 | | NULL | 0.000 |
| 6855 | uvault | serlog:56320 | NULL | Sleep | 360 | | NULL | 0.000 |
| 6856 | uvault | serlog:56322 | NULL | Sleep | 350 | | NULL | 0.000 |
| 6857 | uvault | serlog:56324 | NULL | Sleep | 340 | | NULL | 0.000 |
| 6858 | uvault | serlog:56326 | NULL | Sleep | 330 | | NULL | 0.000 |
| 6859 | uvault | serlog:56328 | NULL | Sleep | 320 | | NULL | 0.000 |
| 6860 | uvault | serlog:56330 | NULL | Sleep | 310 | | NULL | 0.000 |
| 6861 | uvault | serlog:56332 | NULL | Sleep | 300 | | NULL | 0.000 |
| 6862 | uvault | serlog:56334 | NULL | Sleep | 290 | | NULL | 0.000 |
| 6863 | uvault | serlog:56336 | NULL | Sleep | 280 | | NULL | 0.000 |
| 6864 | uvault | serlog:56338 | NULL | Sleep | 270 | | NULL | 0.000 |
| 6865 | uvault | serlog:56340 | NULL | Sleep | 260 | | NULL | 0.000 |
| 6866 | uvault | serlog:56342 | NULL | Sleep | 250 | | NULL | 0.000 |
| 6867 | uvault | serlog:56344 | NULL | Sleep | 240 | | NULL | 0.000 |
| 6868 | uvault | serlog:56346 | NULL | Sleep | 230 | | NULL | 0.000 |
| 6869 | uvault | serlog:56348 | NULL | Sleep | 220 | | NULL | 0.000 |
| 6870 | uvault | serlog:56354 | NULL | Sleep | 210 | | NULL | 0.000 |
| 6871 | uvault | serlog:56356 | NULL | Sleep | 200 | | NULL | 0.000 |
| 6872 | uvault | serlog:56358 | NULL | Sleep | 190 | | NULL | 0.000 |
| 6873 | uvault | serlog:56360 | NULL | Sleep | 180 | | NULL | 0.000 |
| 6874 | uvault | serlog:56362 | NULL | Sleep | 170 | | NULL | 0.000 |
| 6875 | uvault | serlog:56364 | NULL | Sleep | 160 | | NULL | 0.000 |
| 6876 | uvault | serlog:56366 | NULL | Sleep | 150 | | NULL | 0.000 |
| 6877 | uvault | serlog:56368 | NULL | Sleep | 140 | | NULL | 0.000 |
| 6878 | uvault | serlog:56370 | NULL | Sleep | 130 | | NULL | 0.000 |
| 6879 | uvault | serlog:56372 | NULL | Sleep | 120 | | NULL | 0.000 |
| 6880 | uvault | serlog:56374 | NULL | Sleep | 110 | | NULL | 0.000 |
| 6881 | uvault | serlog:56376 | NULL | Sleep | 100 | | NULL | 0.000 |
| 6882 | uvault | serlog:56378 | NULL | Sleep | 90 | | NULL | 0.000 |
| 6883 | uvault | serlog:56380 | NULL | Sleep | 80 | | NULL | 0.000 |
| 6884 | uvault | serlog:56382 | NULL | Sleep | 70 | | NULL | 0.000 |
| 6885 | uvault | serlog:56384 | NULL | Sleep | 60 | | NULL | 0.000 |
| 6886 | uvault | serlog:56386 | NULL | Sleep | 50 | | NULL | 0.000 |
| 6887 | uvault | serlog:56388 | NULL | Sleep | 40 | | NULL | 0.000 |
| 6888 | uvault | serlog:56390 | NULL | Sleep | 30 | | NULL | 0.000 |
| 6889 | uvault | serlog:56392 | NULL | Sleep | 20 | | NULL | 0.000 |
| 6890 | uvault | serlog:56394 | NULL | Sleep | 10 | | NULL | 0.000 |
| 6891 | uvault | serlog:56396 | NULL | Sleep | 0 | | NULL | 0.000 |
+------+-------------+---------------------+-------+---------+------+--------------------------+------------------+----------+
50 rows in set (0.001 sec)

MariaDB [mysql]>

The following configuration is not working: the max_idle_connections value has no effect.

storage "mysql" {
  database                = "vault"
  username                = "uvault"
  password                = "vau20@19"
  address                 = "192.168.0.212:3306"
  ha_enabled              = "true"
  lock_table              = "vault_lock"
  max_idle_connections    = "10"
  max_connection_lifetime = "360000"
}

Only max_connection_lifetime has any effect, but setting it to a small value causes the active node to switch every time the max_connection_lifetime duration expires.
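
For context, here is a minimal sketch of what these two settings typically control, assuming the backend sits on Go's database/sql with go-sql-driver/mysql (this is how such options are usually wired up, not code taken from Vault's source). The important detail is that a max-idle setting only bounds the idle pool; it does not cap how many connections can be open at once, which may explain why it has no visible effect here.

```go
// Illustrative sketch, not Vault's actual code.
package main

import (
	"database/sql"
	"log"
	"time"

	_ "github.com/go-sql-driver/mysql"
)

func main() {
	// DSN shown only as an example matching the config above.
	db, err := sql.Open("mysql", "uvault:password@tcp(192.168.0.212:3306)/vault")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	db.SetMaxIdleConns(10)                      // analogous to max_idle_connections = "10": caps only the *idle* pool
	db.SetConnMaxLifetime(360000 * time.Second) // analogous to max_connection_lifetime = "360000": recycles old connections
	db.SetMaxOpenConns(20)                      // a hard cap on open connections; check whether your Vault version's mysql backend exposes an equivalent option
}
```

If the mysql storage backend in your Vault version exposes an option that caps open or parallel connections, that is the knob that would actually bound the totals seen in the process list above.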

The comment on the HashiCorp documentation page says to increase the timeout, which appears to make this situation even worse.

High Availability – the MySQL storage backend supports high availability. Note that due to the way mysql locking functions work they are lost if a connection dies. If you would like to not have frequent changes in your elected leader you can increase interactive_timeout and wait_timeout MySQL config to much higher than default which is set at 8 hours.
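
For reference, a sketch of how the two timeouts mentioned in that quote can be raised on the MySQL/MariaDB side (the values below are examples only, not recommendations). This protects the leader's long-lived lock connection from being killed, but, as noted above, it does nothing to stop the standby's sleeping connections from accumulating.

```sql
-- Example only: raise the idle-connection timeouts the documentation refers to.
-- Runtime change (lost on server restart):
SET GLOBAL wait_timeout = 86400;         -- default is 28800 seconds (8 hours)
SET GLOBAL interactive_timeout = 86400;
-- Or persist in my.cnf under [mysqld]:
--   wait_timeout        = 86400
--   interactive_timeout = 86400
```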

How can these connections, which are never properly closed and are left behind as sleep connections, be restricted?

calvn commented 3 years ago

How many nodes are in your cluster, and what happens if you decrease the number of max_idle_connections?

sksingh20 commented 3 years ago

Two nodes in HA. As max_idle_connections is reduced, only that number of MySQL connections gets any activity from Vault, so their idle time stays in the 0-20 second range; all the remaining connections stay in sleep mode.

ShriprasadM commented 3 years ago

I am also facing the same issue. I have observed another issue, described below; I am not sure if it relates to this problem.

  1. The first time I connected the 2 Vault nodes with the MySQL HA cluster, it was able to elect a leader and I could see an entry in the lock table.
  2. Vault status was like this:

    Key             Value
    ---             -----
    Seal Type       shamir
    Initialized     true
    Sealed          false
    Total Shares    5
    Threshold       3
    Version         1.5.4
    Cluster Name    vault-cluster-c55642a8
    Cluster ID      19a1676c-7402-4d83-875f-ebd41ca04895
    HA Enabled      true
    HA Cluster      https://vault-0.vault-internal:8201
    HA Mode         active

  3. But after 10-15 minutes, running the vault status command gave the following:

    Key             Value
    ---             -----
    Seal Type       shamir
    Initialized     true
    Sealed          false
    Total Shares    5
    Threshold       3
    Version         1.5.4
    Cluster Name    vault-cluster-c55642a8
    Cluster ID      19a1676c-7402-4d83-875f-ebd41ca04895
    HA Enabled      true
    HA Cluster
    HA Mode         standby

sksingh20 commented 3 years ago

Please help. This issue was with Vault 1.5.4. After upgrading Vault to 1.6.3, even max_connection_lifetime has no effect and the active node remains active for a long time; as a result, the standby node died with the following message:

Apr 3 09:26:44 serlog vault[31016]: 2021-04-03T09:26:44.456+0530 [ERROR] expiration: error restoring leases: error="failed to scan for leases: list failed at path "": failed to execute statement: Error 1040: Too many connections"
Apr 3 09:26:44 serlog vault[31016]: 2021-04-03T09:26:44.456+0530 [ERROR] core: shutting down
Apr 3 09:26:44 serlog vault[31016]: 2021-04-03T09:26:44.456+0530 [INFO] core: marked as sealed
Apr 3 09:26:44 serlog vault[31016]: 2021-04-03T09:26:44.459+0530 [INFO] core: pre-seal teardown starting
Apr 3 09:26:44 serlog vault[31016]: 2021-04-03T09:26:44.459+0530 [INFO] rollback: stopping rollback manager
Apr 3 09:26:44 serlog vault[31016]: 2021-04-03T09:26:44.472+0530 [INFO] core: pre-seal teardown complete
Apr 3 09:26:44 serlog vault[31016]: 2021-04-03T09:26:44.472+0530 [ERROR] core: post-unseal setup failed: error="failed to read packed storage entry: Error 1040: Too many connections"
Apr 3 09:26:44 serlog vault[31016]: 2021-04-03T09:26:44.473+0530 [INFO] core: stopping cluster listeners
Apr 3 09:26:44 serlog vault[31016]: 2021-04-03T09:26:44.473+0530 [INFO] core.cluster-listener: forwarding rpc listeners stopped
Apr 3 09:26:44 serlog vault[31016]: 2021-04-03T09:26:44.482+0530 [INFO] core.cluster-listener: rpc listeners successfully shut down
Apr 3 09:26:44 serlog vault[31016]: 2021-04-03T09:26:44.482+0530 [INFO] core: cluster listeners successfully shut down
Apr 3 09:26:44 serlog vault[31016]: 2021-04-03T09:26:44.483+0530 [INFO] core: vault is sealed

sksingh20 commented 3 years ago

When the Vault service on the standby node is restarted, all of its sleep connections to MySQL are dropped. Then the sleep connections start growing again until they reach the per-user max connection limit set at the MySQL level.
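
One way to watch this growth from the MySQL/MariaDB side, independent of Vault, is to count the sleeping connections held by the Vault user (uvault in this setup):

```sql
-- Count sleeping connections held by the Vault database user.
SELECT USER, COUNT(*) AS sleeping
FROM information_schema.PROCESSLIST
WHERE COMMAND = 'Sleep' AND USER = 'uvault'
GROUP BY USER;
```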

sksingh20 commented 3 years ago

Any hint as to what may be going wrong with the lock table handling by the Vault nodes?

ShriprasadM commented 3 years ago

any update here?

sksingh20 commented 3 years ago

Standby node connections remain in the ESTABLISHED state even after the response comes back that the lock is already acquired. Any update on this?

pi@serlog:~ $ sudo netstat | grep mysql | grep -i serlog | grep -i established
tcp 0 0 serlog:mysql 192.168.0.213:58760 ESTABLISHED
tcp 0 0 serlog:mysql 192.168.0.213:58762 ESTABLISHED
tcp 0 0 serlog:mysql 192.168.0.213:58834 ESTABLISHED
tcp 0 0 serlog:mysql 192.168.0.213:58818 ESTABLISHED
tcp 0 0 serlog:mysql 192.168.0.213:58822 ESTABLISHED
tcp 0 0 serlog:mysql 192.168.0.213:58694 ESTABLISHED
tcp 0 0 serlog:mysql 192.168.0.213:58728 ESTABLISHED
tcp 0 0 serlog:mysql 192.168.0.213:58790 ESTABLISHED
tcp 0 0 serlog:mysql 192.168.0.213:58736 ESTABLISHED
tcp 0 0 serlog:mysql 192.168.0.213:58668 ESTABLISHED
tcp 0 0 serlog:mysql 192.168.0.213:58688 ESTABLISHED
tcp 0 0 serlog:mysql 192.168.0.213:58750 ESTABLISHED
tcp 0 0 serlog:57480 serlog:mysql ESTABLISHED
tcp 0 0 serlog:mysql 192.168.0.213:58714 ESTABLISHED
tcp 0 0 serlog:57482 serlog:mysql ESTABLISHED
tcp 0 0 serlog:mysql serlog:46002 ESTABLISHED
tcp 0 0 serlog:mysql 192.168.0.213:58800 ESTABLISHED
tcp 0 0 serlog:mysql 192.168.0.213:58734 ESTABLISHED
tcp 0 0 serlog:mysql 192.168.0.213:58784 ESTABLISHED
tcp 0 0 serlog:mysql 192.168.0.213:58708 ESTABLISHED
tcp 0 0 serlog:mysql 192.168.0.213:58842 ESTABLISHED
tcp 0 0 serlog:mysql serlog:57482 ESTABLISHED
tcp 0 0 serlog:mysql 192.168.0.213:58840 ESTABLISHED
tcp 0 0 serlog:mysql 192.168.0.213:58686 ESTABLISHED
tcp 0 0 serlog:mysql 192.168.0.213:58792 ESTABLISHED
tcp 0 0 serlog:mysql 192.168.0.213:58812 ESTABLISHED
tcp 0 0 serlog:mysql 192.168.0.213:58810 ESTABLISHED
tcp 0 0 serlog:mysql 192.168.0.213:58700 ESTABLISHED
tcp 0 0 serlog:mysql 192.168.0.213:58774 ESTABLISHED
tcp 0 0 serlog:mysql 192.168.0.213:58742 ESTABLISHED
tcp 0 0 serlog:mysql 192.168.0.213:58756 ESTABLISHED
tcp 0 0 serlog:mysql 192.168.0.213:58768 ESTABLISHED
tcp 0 0 serlog:mysql 192.168.0.213:58744 ESTABLISHED
tcp 0 0 serlog:mysql 192.168.0.213:58780 ESTABLISHED
tcp 0 0 serlog:mysql 192.168.0.213:58680 ESTABLISHED
tcp 0 0 serlog:mysql 192.168.0.213:58766 ESTABLISHED
tcp 0 0 serlog:mysql 192.168.0.213:58832 ESTABLISHED
tcp 0 0 serlog:mysql 192.168.0.213:58816 ESTABLISHED
tcp 0 0 serlog:mysql 192.168.0.213:58786 ESTABLISHED
tcp 0 0 serlog:mysql 192.168.0.213:58806 ESTABLISHED
tcp 0 0 serlog:mysql 192.168.0.213:58838 ESTABLISHED
tcp 0 0 serlog:mysql 192.168.0.213:58682 ESTABLISHED
tcp 0 0 serlog:mysql 192.168.0.213:58716 ESTABLISHED
tcp 0 0 serlog:mysql 192.168.0.213:58828 ESTABLISHED
tcp 0 0 serlog:mysql 192.168.0.213:58676 ESTABLISHED
tcp 0 0 serlog:mysql 192.168.0.213:58722 ESTABLISHED
tcp 0 0 serlog:46002 serlog:mysql ESTABLISHED
tcp 0 0 serlog:mysql 192.168.0.213:58710 ESTABLISHED
tcp 0 0 serlog:mysql serlog:57480 ESTABLISHED
pi@serlog:~ $

sksingh20 commented 3 years ago

On the standby Vault node I get the following output when I trace one of these connections:

pi@vaultf2:~ $ sudo lsof -i tcp:59860
COMMAND  PID USER  FD  TYPE DEVICE SIZE/OFF NODE NAME
vault   9180 root  78u IPv4  63158      0t0  TCP 192.168.0.213:59860->192.168.0.212:mysql (ESTABLISHED)
pi@vaultf2:~ $

sksingh20 commented 3 years ago

Does no one have a solution?

sksingh20 commented 3 years ago

Absolutely no progress over the last six months is a real concern for using Vault in a production setup. It feels like we are operating on the edge.

sksingh20 commented 3 years ago

Is this the correct platform for this issue, or do I need to raise it in some other forum?

mladlow commented 3 years ago

Hi @sksingh20,

We’re sorry to hear that you’re encountering this problem. The MySQL storage backend is community supported (https://www.vaultproject.io/docs/configuration/storage/mysql).

One thing that would be helpful for this issue in particular would be some reproduction steps with a minimal environment. A reproduction makes it much easier for someone to find the source of a bug and create a PR.

You could also open a PR to attempt to fix the issue yourself. To help us review that PR, it should include a test that will fail without the code changes. At the moment, we’re not confident enough of where the problem might be happening to propose a place to start looking in the code.

sksingh20 commented 3 years ago

@mladlow Thank you for your response.

The following are steps to reproduce this issue:

  1. Create two nodes in HA with MySQL as the storage backend.
  2. Observe that the standby node keeps checking for the lock in the MySQL DB.
  3. When the standby node finds that the lock is already held, it reports this, BUT it leaves its connection to MySQL in the ESTABLISHED state. This keeps increasing the number of established connections made by the standby node.

Message by vault core: failed to acquire lock: error="mysql: lock already held"

Expected behavior: The standby node should drop the connection once it receives the response that the lock is already held. But the standby Vault node is not dropping its connection to MySQL, and the connection remains in the ESTABLISHED state.

Even when this Vault node becomes the active node, it still does not drop the old orphaned established connections.

pi@vaultf2:~ $ sudo netstat -n | grep -i "192.168.0.212:3306"
tcp 0 0 192.168.0.213:52522 192.168.0.212:3306 ESTABLISHED
tcp 0 0 192.168.0.213:52476 192.168.0.212:3306 ESTABLISHED
tcp 0 0 192.168.0.213:52464 192.168.0.212:3306 ESTABLISHED
tcp 0 0 192.168.0.213:52506 192.168.0.212:3306 ESTABLISHED
tcp 0 0 192.168.0.213:52472 192.168.0.212:3306 ESTABLISHED
tcp 0 0 192.168.0.213:52424 192.168.0.212:3306 ESTABLISHED
tcp 0 0 192.168.0.213:52488 192.168.0.212:3306 ESTABLISHED
tcp 0 0 192.168.0.213:52510 192.168.0.212:3306 ESTABLISHED
tcp 0 0 192.168.0.213:52532 192.168.0.212:3306 ESTABLISHED
tcp 0 0 192.168.0.213:52448 192.168.0.212:3306 ESTABLISHED
tcp 0 0 192.168.0.213:52502 192.168.0.212:3306 ESTABLISHED
tcp 0 0 192.168.0.213:52530 192.168.0.212:3306 ESTABLISHED
tcp 0 0 192.168.0.213:52454 192.168.0.212:3306 ESTABLISHED
tcp 0 0 192.168.0.213:52460 192.168.0.212:3306 ESTABLISHED
tcp 0 0 192.168.0.213:52484 192.168.0.212:3306 ESTABLISHED
tcp 0 0 192.168.0.213:52638 192.168.0.212:3306 ESTABLISHED
tcp 0 0 192.168.0.213:52458 192.168.0.212:3306 ESTABLISHED
tcp 0 0 192.168.0.213:52428 192.168.0.212:3306 ESTABLISHED
tcp 0 0 192.168.0.213:52412 192.168.0.212:3306 ESTABLISHED
tcp 0 0 192.168.0.213:52516 192.168.0.212:3306 ESTABLISHED
tcp 0 0 192.168.0.213:52418 192.168.0.212:3306 ESTABLISHED
tcp 0 0 192.168.0.213:52438 192.168.0.212:3306 ESTABLISHED
tcp 0 0 192.168.0.213:52434 192.168.0.212:3306 ESTABLISHED
tcp 0 0 192.168.0.213:52492 192.168.0.212:3306 ESTABLISHED
tcp 0 0 192.168.0.213:52498 192.168.0.212:3306 ESTABLISHED
tcp 0 0 192.168.0.213:52618 192.168.0.212:3306 TIME_WAIT
tcp 0 0 192.168.0.213:52444 192.168.0.212:3306 ESTABLISHED
tcp 0 0 192.168.0.213:52480 192.168.0.212:3306 ESTABLISHED
pi@vaultf2:~ $
pi@vaultf2:~ $ export VAULT_ADDR="http://127.0.0.1:8200"
pi@vaultf2:~ $ vault status
Key                      Value
---                      -----
Recovery Seal Type       shamir
Initialized              true
Sealed                   false
Total Recovery Shares    5
Threshold                3
Version                  1.7.1
Storage Type             mysql
Cluster Name             vault-cluster-b827622c
Cluster ID               3bc07ca8-e1d3-5698-c380-cb7b99e9639e
HA Enabled               true
HA Cluster               https://192.168.0.213:8251
HA Mode                  active
Active Since             2021-05-28T06:05:01.625348456Z
pi@vaultf2:~ $

sksingh20 commented 3 years ago

As a result, I had to configure a timeout value at the MySQL level to terminate idle connections. This caused another issue.

The new issue is due to Vault's behavior where the active node also keeps one connection idle for a long time. That long-established connection is now terminated when it reaches the timeout threshold, so the lock is released and acquired by the other node.

This workaround of terminating idle connections with a timeout at least keeps the Vault service alive, but the overall system has become very unstable, because the Vault nodes now keep switching leadership every few minutes as MySQL terminates their idle connections.

May 28 11:34:29 vaultf2 vault[9925]: 2021-05-28T11:34:29.837+0530 [ERROR] core: failed to acquire lock: error="mysql: lock already held"
May 28 11:34:39 vaultf2 vault[9925]: 2021-05-28T11:34:39.890+0530 [ERROR] core: failed to acquire lock: error="mysql: lock already held"
May 28 11:34:49 vaultf2 vault[9925]: 2021-05-28T11:34:49.925+0530 [ERROR] core: failed to acquire lock: error="mysql: lock already held"
May 28 11:35:00 vaultf2 vault[9925]: 2021-05-28T11:35:00.005+0530 [INFO] core: acquired lock, enabling active operation
May 28 11:35:01 vaultf2 vault[9925]: 2021-05-28T11:35:01.607+0530 [INFO] core: post-unseal setup starting
May 28 11:35:01 vaultf2 vault[9925]: 2021-05-28T11:35:01.641+0530 [INFO] core: loaded wrapping token key
May 28 11:35:01 vaultf2 vault[9925]: 2021-05-28T11:35:01.646+0530 [INFO] core: successfully setup plugin catalog: plugin-directory=
May 28 11:35:01 vaultf2 vault[9925]: 2021-05-28T11:35:01.675+0530 [INFO] core: successfully mounted backend: type=system path=sys/
May 28 11:35:01 vaultf2 vault[9925]: 2021-05-28T11:35:01.683+0530 [INFO] core: successfully mounted backend: type=identity path=identity/
May 28 11:35:01 vaultf2 vault[9925]: 2021-05-28T11:35:01.694+0530 [INFO] core: successfully mounted backend: type=kv path=secret/
May 28 11:35:01 vaultf2 vault[9925]: 2021-05-28T11:35:01.699+0530 [INFO] core: successfully mounted backend: type=cubbyhole path=cubbyhole/
May 28 11:35:01 vaultf2 vault[9925]: 2021-05-28T11:35:01.769+0530 [INFO] core: successfully enabled credential backend: type=token path=token/
May 28 11:35:01 vaultf2 vault[9925]: 2021-05-28T11:35:01.774+0530 [INFO] core: successfully enabled credential backend: type=userpass path=userpass/
May 28 11:35:01 vaultf2 vault[9925]: 2021-05-28T11:35:01.784+0530 [INFO] core: successfully enabled credential backend: type=approle path=approle/
May 28 11:35:01 vaultf2 vault[9925]: 2021-05-28T11:35:01.809+0530 [INFO] core: restoring leases
May 28 11:35:01 vaultf2 vault[9925]: 2021-05-28T11:35:01.813+0530 [INFO] rollback: starting rollback manager
May 28 11:35:01 vaultf2 vault[9925]: 2021-05-28T11:35:01.833+0530 [WARN] core.cluster-listener: no TLS config found for ALPN: ALPN=[req_fw_sb-act_v1]
May 28 11:35:01 vaultf2 vault[9925]: 2021-05-28T11:35:01.893+0530 [INFO] expiration: lease restore complete
May 28 11:35:01 vaultf2 vault[9925]: 2021-05-28T11:35:01.921+0530 [INFO] identity: entities restored
May 28 11:35:01 vaultf2 vault[9925]: 2021-05-28T11:35:01.929+0530 [INFO] identity: groups restored
May 28 11:35:02 vaultf2 vault[9925]: 2021-05-28T11:35:02.079+0530 [INFO] core: post-unseal setup complete
May 28 11:35:17 vaultf2 rngd[255]: stats: bits received from HRNG source: 6460064
May 28 11:35:17 vaultf2 rngd[255]: stats: bits sent to kernel pool: 6418944
May 28 11:35:17 vaultf2 rngd[255]: stats: entropy added to kernel pool: 6418944
May 28 11:35:17 vaultf2 rngd[255]: stats: FIPS 140-2 successes: 323
May 28 11:35:17 vaultf2 rngd[255]: stats: FIPS 140-2 failures: 0
May 28 11:35:17 vaultf2 rngd[255]: stats: FIPS 140-2(2001-10-10) Monobit: 0
May 28 11:35:17 vaultf2 rngd[255]: stats: FIPS 140-2(2001-10-10) Poker: 0
May 28 11:35:17 vaultf2 rngd[255]: stats: FIPS 140-2(2001-10-10) Runs: 0
May 28 11:35:17 vaultf2 rngd[255]: stats: FIPS 140-2(2001-10-10) Long run: 0
May 28 11:35:17 vaultf2 rngd[255]: stats: FIPS 140-2(2001-10-10) Continuous run: 0
May 28 11:35:17 vaultf2 rngd[255]: stats: HRNG source speed: (min=119.952; avg=692.161; max=979.255)Kibits/s
May 28 11:35:17 vaultf2 rngd[255]: stats: FIPS tests speed: (min=664.035; avg=6234.626; max=9109.725)Kibits/s
May 28 11:35:17 vaultf2 rngd[255]: stats: Lowest ready-buffers level: 2
May 28 11:35:17 vaultf2 rngd[255]: stats: Entropy starvations: 0
May 28 11:35:17 vaultf2 rngd[255]: stats: Time spent starving for entropy: (min=0; avg=0.000; max=0)us
May 28 11:38:02 vaultf2 systemd[1]: Created slice User Slice of UID 1000.
May 28 11:38:02 vaultf2 systemd[1]: Starting User Runtime Directory /run/user/1000...
May 28 11:38:03 vaultf2 systemd[1]: Started User Runtime Directory /run/user/1000.
May 28 11:38:03 vaultf2 systemd[1]: Starting User Manager for UID 1000...
May 28 11:38:04 vaultf2 systemd[5344]: Listening on GnuPG cryptographic agent and passphrase cache.
May 28 11:38:04 vaultf2 systemd[5344]: Listening on GnuPG cryptographic agent and passphrase cache (access for web browsers).
May 28 11:38:04 vaultf2 systemd[5344]: Reached target Timers.
May 28 11:38:04 vaultf2 systemd[5344]: Listening on GnuPG cryptographic agent (ssh-agent emulation).
May 28 11:38:04 vaultf2 systemd[5344]: Starting D-Bus User Message Bus Socket.
May 28 11:38:04 vaultf2 systemd[5344]: Listening on GnuPG cryptographic agent and passphrase cache (restricted).
May 28 11:38:04 vaultf2 systemd[5344]: Listening on GnuPG network certificate management daemon.
May 28 11:38:04 vaultf2 systemd[5344]: Reached target Paths.
May 28 11:38:04 vaultf2 systemd[5344]: Listening on D-Bus User Message Bus Socket.
May 28 11:38:04 vaultf2 systemd[5344]: Reached target Sockets.
May 28 11:38:04 vaultf2 systemd[5344]: Reached target Basic System.
May 28 11:38:04 vaultf2 systemd[1]: Started User Manager for UID 1000.
May 28 11:38:04 vaultf2 systemd[5344]: Reached target Default.
May 28 11:38:04 vaultf2 systemd[5344]: Startup finished in 909ms.
May 28 11:38:04 vaultf2 systemd[1]: Started Session 245 of user pi.
May 28 11:45:06 vaultf2 vault[9925]: 2021-05-28T11:45:06.166+0530 [WARN] core: leadership lost, stopping active operation
May 28 11:45:06 vaultf2 vault[9925]: 2021-05-28T11:45:06.171+0530 [INFO] core: pre-seal teardown starting
May 28 11:45:06 vaultf2 vault[9925]: 2021-05-28T11:45:06.672+0530 [INFO] rollback: stopping rollback manager
May 28 11:45:06 vaultf2 vault[9925]: 2021-05-28T11:45:06.681+0530 [INFO] core: pre-seal teardown complete
May 28 11:45:06 vaultf2 vault[9925]: 2021-05-28T11:45:06.715+0530 [ERROR] core: unlocking HA lock failed: error="mysql: unable to release lock, already released or not held by this session"
May 28 11:45:07 vaultf2 vault[9925]: 2021-05-28T11:45:07.843+0530 [ERROR] core: failed to acquire lock: error="mysql: lock already held"
May 28 11:45:17 vaultf2 vault[9925]: 2021-05-28T11:45:17.900+0530 [ERROR] core: failed to acquire lock: error="mysql: lock already held"
May 28 11:45:27 vaultf2 vault[9925]: 2021-05-28T11:45:27.935+0530 [ERROR] core: failed to acquire lock: error="mysql: lock already held"
May 28 11:45:37 vaultf2 vault[9925]: 2021-05-28T11:45:37.987+0530 [ERROR] core: failed to acquire lock: error="mysql: lock already held"
May 28 11:45:48 vaultf2 vault[9925]: 2021-05-28T11:45:48.049+0530 [ERROR] core: failed to acquire lock: error="mysql: lock already held"
May 28 11:45:58 vaultf2 vault[9925]: 2021-05-28T11:45:58.105+0530 [ERROR] core: failed to acquire lock: error="mysql: lock already held"
May 28 11:46:08 vaultf2 vault[9925]: 2021-05-28T11:46:08.158+0530 [ERROR] core: failed to acquire lock: error="mysql: lock already held"
May 28 11:46:18 vaultf2 vault[9925]: 2021-05-28T11:46:18.202+0530 [ERROR] core: failed to acquire lock: error="mysql: lock already held"
May 28 11:46:28 vaultf2 vault[9925]: 2021-05-28T11:46:28.237+0530 [ERROR] core: failed to acquire lock: error="mysql: lock already held"
May 28 11:46:38 vaultf2 vault[9925]: 2021-05-28T11:46:38.286+0530 [ERROR] core: failed to acquire lock: error="mysql: lock already held"
May 28 11:46:48 vaultf2 vault[9925]: 2021-05-28T11:46:48.327+0530 [ERROR] core: failed to acquire lock: error="mysql: lock already held"
May 28 11:46:58 vaultf2 vault[9925]: 2021-05-28T11:46:58.372+0530 [ERROR] core: failed to acquire lock: error="mysql: lock already held"

sksingh20 commented 3 years ago

Any help will be appreciated. This looks like a simple change in the Vault code: close the MySQL connection on the standby node when the "lock already held" response is returned.
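
To illustrate the pattern being asked for, here is a minimal sketch, not Vault's actual implementation. It assumes a GET_LOCK()-style named lock; whether Vault's mysql backend uses GET_LOCK() or the configured lock_table is an assumption here, and the lock name vault_ha_lock and the DSN are placeholders. The point it demonstrates is the failure path: when the lock is already held, the dedicated connection is handed back to the pool rather than left ESTABLISHED.

```go
// Sketch only: acquire an HA lock on a dedicated connection and make sure the
// connection is released when acquisition fails.
package main

import (
	"context"
	"database/sql"
	"errors"
	"log"

	_ "github.com/go-sql-driver/mysql"
)

var ErrLockHeld = errors.New("mysql: lock already held")

// tryAcquireLock dedicates one connection to a named lock. On any failure it
// returns the connection to the pool instead of holding it open forever.
func tryAcquireLock(ctx context.Context, db *sql.DB, name string) (*sql.Conn, error) {
	conn, err := db.Conn(ctx) // named locks are per-session, so pin a connection
	if err != nil {
		return nil, err
	}

	var got sql.NullInt64
	if err := conn.QueryRowContext(ctx, "SELECT GET_LOCK(?, 0)", name).Scan(&got); err != nil {
		conn.Close() // failure path: hand the connection back to the pool
		return nil, err
	}
	if !got.Valid || got.Int64 != 1 {
		conn.Close() // lock held elsewhere: do not leave this connection pinned
		return nil, ErrLockHeld
	}
	return conn, nil // caller must later RELEASE_LOCK and Close() this connection
}

func main() {
	// Connection details and the lock name are placeholders for this sketch.
	db, err := sql.Open("mysql", "uvault:password@tcp(192.168.0.212:3306)/vault")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()
	db.SetMaxIdleConns(2) // small idle pool so returned connections are actually closed

	conn, err := tryAcquireLock(context.Background(), db, "vault_ha_lock")
	if errors.Is(err, ErrLockHeld) {
		log.Println("standby: lock already held, retry later")
		return
	}
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()
	log.Println("acquired HA lock, acting as active node")
}
```

In database/sql, Close() on a *sql.Conn returns it to the pool, where the idle limit can then reap it, so a standby retrying every few seconds would not accumulate connections between attempts.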

sksingh20 commented 3 years ago

No response yet. If this is a bug in the product, then please confirm.

sksingh20 commented 3 years ago

any help?

ShriprasadM commented 2 years ago

any updates here?

heatherezell commented 2 years ago

As @mladlow mentioned, the MySQL plugin is community-supported, meaning that Vault engineering teams do not create or update it. As such, we rely on the open source community to maintain and update these plugins. Please feel free to file a PR if you can identify a fix for this issue. Thanks!

sksingh20 commented 1 year ago

Please suggest a solution. MySQL as a product is stable; Vault is making connections to MySQL and is expected to close them, but it is not doing so. Please look into this.

heatherezell commented 6 months ago

Hi folks! Is this still an issue in newer versions of Vault? Please let me know so I can bubble it up accordingly. Thanks!

ebuildy commented 5 months ago

Looks like multiple nodes try to get the same lock, hence the timeout and then the crash:

[Screenshot, 2024-04-21 at 17:13:18]
mrnonz commented 1 month ago

Looks like multiple nodes try to get the same lock, hence the timeout and then the crash:

[Screenshot, 2024-04-21 at 17:13:18]

Hi @ebuildy, which version of Vault shows this same lock contention?