moov-io / paygate

A RESTful API enabling electronic payments to be submitted and received without a deep understanding of payment file specifications
http://moov.io
Apache License 2.0

health: stop retrying for N minutes after X failures in a row #616

Open adamdecaf opened 3 years ago

adamdecaf commented 3 years ago

PayGate Version: v0.9.0-dev

Deployment services often call PayGate's health checks on an interval. For example, Kubernetes liveness probes monitor an endpoint on the application (e.g. GET /live). This endpoint may be called frequently (every 60s, for example) so that bad connections or other problems are discovered quickly.

This is a problem because SFTP servers often have lockout rules that trigger when invalid account credentials are tried repeatedly within an interval. An example rule is 5 failures in 30 minutes, which we could easily hit: with a health check every 60s, five failed logins in a row happen within about five minutes.

Currently PayGate attempts to reconnect (i.e. log in again) whenever it does not have a valid connection. That is the case when the previous attempt failed to authenticate, so each health check after a failed login triggers another login attempt.

What did you expect to see? PayGate should actively try to prevent lockouts of remote accounts. Locked accounts can be difficult to unlock and cause interruptions of service.