Closed: spectateur closed this issue 1 year ago
There is not much we can do on our side, because our client is built on top of an abstraction over interchangeable HTTP clients (OkHttp, Java URL Connection, Apache HTTP Client). I would suggest reaching out to the Apache HTTP Client project with your observation and asking for guidance there.
I created an issue on Apache Jira https://issues.apache.org/jira/browse/HTTPCLIENT-2235
Hello,
We are using Spring Cloud Vault with a Spring Boot application hosted at a cloud provider; our HashiCorp Vault v1.11.2 is on premises.
We randomly experience timeouts during lease renewal (a 48-hour lease in our case) or during authentication with Vault.
Error detail:
After digging into the communications between our Vault server and the clients, we saw that the Spring Cloud Vault framework reuses the TLS and TCP sessions that were created for the previous renewal.
From Vault's perspective, these sessions were closed according to the Vault timeout, because the Vault server no longer received any traffic from the client. All network equipment and worker nodes also dropped the original session from their kernel tables after the grace period (TCP end timeout) that started when they received the FIN from the Vault server.
In the following capture you can see a timeout event at 2022-09-18 17:47:08 that triggered a close_notify from the client (see frame N°2741), 15 seconds after the renew request (read-timeout = 15000 ms).
The port used during the timeout is the same as the one used for the previous renewal 48 hours earlier (see frame N°2725).
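To make the timing explicit, here is a small sketch that derives when the renew request must have left the client, given the timeout timestamp and the read-timeout quoted above. The class and method names are hypothetical, and the timestamp is treated as a local time because the capture's time zone is not stated.

```java
import java.time.Duration;
import java.time.LocalDateTime;

// Hypothetical helper, for illustration only; not part of Spring Cloud Vault.
class TimeoutTimeline {

    // The read timeout fires readTimeout after the request was sent,
    // so the send time is the timeout time minus the read timeout.
    static LocalDateTime requestSentAt(LocalDateTime timeoutAt, Duration readTimeout) {
        return timeoutAt.minus(readTimeout);
    }

    public static void main(String[] args) {
        LocalDateTime timeoutAt = LocalDateTime.of(2022, 9, 18, 17, 47, 8);
        Duration readTimeout = Duration.ofMillis(15000); // read-timeout = 15000 ms
        System.out.println(requestSentAt(timeoutAt, readTimeout)); // 2022-09-18T17:46:53
    }
}
```

Under these assumptions, the renew request was sent at 17:46:53 and received no answer before the client gave up.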
Note that the client does not ACK Vault's TLS close_notify or TCP FIN (frames N°1985 to 1988), but we are not sure this explains why the client reuses the same TLS session days later. We tested multiple values for the lease renewal and timeout settings in the Spring Cloud Vault framework without being able to explain why the Vault client does not close the session right after the renewal has completed.
Because the Vault client is reusing a session that no longer exists, the worker node opens a new port (sequence number 1 in frame N°2726), and because the first packet of this new TCP session is not a SYN when it reaches the next-hop network equipment, it is dropped with a "first packet isn't SYN" error message.
Even if it were a SYN, we would end up reaching Vault on the original port that the Vault server itself closed with a FIN 48 hours ago.
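To illustrate the behavior we would expect instead, here is a minimal JDK-only sketch of an idle check that a connection pool could run before reusing a connection. It is not the API of Spring Cloud Vault or Apache HTTP Client; the class name, the method name, and the 5-minute idle limit are assumptions chosen for the example.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical model of a pool-side staleness check; illustration only.
class StaleConnectionSketch {

    // A connection counts as stale once it has been idle for at least maxIdle.
    // A stale connection should be closed and replaced by a fresh one
    // (new TCP handshake, new TLS session) rather than reused.
    static boolean isStale(Instant lastUsed, Instant now, Duration maxIdle) {
        return Duration.between(lastUsed, now).compareTo(maxIdle) >= 0;
    }

    public static void main(String[] args) {
        Instant now = Instant.now();
        Instant previousRenewal = now.minus(Duration.ofHours(48)); // last use, 48 hours ago
        Duration maxIdle = Duration.ofMinutes(5);                  // assumed idle limit

        if (isStale(previousRenewal, now, maxIdle)) {
            System.out.println("Connection idle too long: open a new one");
        } else {
            System.out.println("Connection can be reused");
        }
    }
}
```

With a 48-hour gap between renewals, any idle limit shorter than the server or firewall timeout would force a new connection for each renewal and avoid sending traffic on a session that the other side has already dropped.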
So we are opening this GitHub issue to explore ways to fix this in the Spring Cloud Vault framework.
Thanks for your help,