Open problame opened 3 weeks ago
Is this distinct from https://github.com/neondatabase/neon/issues/7797 ?
What we observed here was connection refused, i.e., not even able to establish a TCP connection.
A (very) narrow-minded solution to #7797 may not address the connection refused issue.
But yeah, in spirit this is a dupe of #7797
POST is idempotent as long as it includes a timeline ID -- @Bodobolero, until we make the controller more seamlessly available during restarts (in Q3), can you make your client retry past this class of error?
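A minimal sketch of what "retry past this class of error" could look like on the client side, assuming the POST carries a timeline ID and is therefore safe to resend. The helper names (`is_connection_refused`, `retry_on_connection_refused`) and the backoff parameters are hypothetical and not taken from cplane or the benchmark client. With these defaults the total backoff is about 55s, which would cover the 44s window reported in this issue.

```python
import time
import urllib.error


def is_connection_refused(exc):
    """Return True if exc is, or wraps, a refused TCP connection."""
    if isinstance(exc, ConnectionRefusedError):
        return True
    # urllib wraps socket-level failures in URLError and keeps the
    # original OSError in .reason.
    return isinstance(exc, urllib.error.URLError) and isinstance(
        exc.reason, ConnectionRefusedError
    )


def retry_on_connection_refused(send, max_attempts=10, base_delay=0.5,
                                max_delay=10.0, sleep=time.sleep):
    """Call send() until it succeeds or max_attempts is reached.

    Only connection-refused errors are retried; any other exception is
    re-raised immediately. Assumption: send() issues an idempotent
    request (e.g. a POST that includes the timeline ID).
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return send()
        except Exception as exc:
            if not is_connection_refused(exc) or attempt == max_attempts:
                raise
            # Exponential backoff, capped at max_delay seconds.
            sleep(min(max_delay, base_delay * 2 ** (attempt - 1)))
```

Here `send` would be a zero-argument callable wrapping the actual request, for example a `lambda` around `urllib.request.urlopen` with the POST body.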
Context: https://neondb.slack.com/archives/C06K38EB05D/p1718209960490099?thread_ts=1718184799.253779&cid=C06K38EB05D
Problem
In prodlike cloudbench, we have observed that a storcon deployment can, 44s (!) after the storcon logs that it's up again, cause cplane to get connection refused errors when it tries to talk to storcon.
Analysis
@ololobus :
Impact
When a cplane client does a POST request, it doesn't retry it when it gets connection refused, because it doesn't assume idempotency.
Example cplane log message
Related