krakend / krakend-ce

KrakenD Community Edition: High-performance, stateless, declarative, API Gateway written in Go.
https://www.krakend.io
Apache License 2.0

circuit-breaker fails to get triggered by http error codes #895

Closed: friedrichtroescher closed this issue 2 weeks ago

friedrichtroescher commented 2 weeks ago

Environment info:

Describe the bug: If the Python backend causes a timeout, the circuit breaker opens as it should. If the backend returns a status code that is not 20x (the error condition described in the official documentation), the circuit breaker is not triggered.

Your configuration file:

To replicate, you can clone the repo https://github.com/friedrichtroescher/testkrakend

krakend.json:

{
  "version": 3,
  "endpoints": [
    {
      "endpoint": "/api",
      "output_encoding": "no-op",
      "input_query_strings": [
        "*"
      ],
      "input_headers": [
        "*"
      ],
      "backend": [
        {
          "encoding": "no-op",
          "host": [
            "localhost:8081"
          ],
          "url_pattern": "/api",
          "extra_config": {
            "qos/circuit-breaker": {
              "interval": 60,
              "timeout": 60,
              "max_errors": 2,
              "name": "version-endpoint",
              "log_status_change": true
            }
          }
        }
      ],
      "extra_config": {
        "qos/ratelimit/router": {
          "max_rate": 1000,
          "capacity": 100,
          "client_max_rate": 200,
          "client_capacity": 100,
          "every": "10m",
          "strategy": "ip"
        }
      }
    }
  ]
}
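As a side note on the rate-limit block above, the debug log in this report prints `MaxRate: 1.666667` and `MaxRate: 0.333333`. A minimal sketch of the arithmetic, assuming the gateway converts `max_rate` and `client_max_rate` into a per-second rate over the `every` window (the helper name `per_second` is hypothetical and only used for illustration):

```python
def per_second(max_rate, every_seconds):
    """Convert 'max_rate requests per window' into requests per second."""
    return max_rate / every_seconds


# "every": "10m" from the configuration above, expressed in seconds.
EVERY_SECONDS = 10 * 60

router_rate = per_second(1000, EVERY_SECONDS)  # "max_rate": 1000
client_rate = per_second(200, EVERY_SECONDS)   # "client_max_rate": 200

print(f"{router_rate:.6f} {client_rate:.6f}")  # prints "1.666667 0.333333"
```

These values match the ones in the log, so the rate limiter appears to have been configured as intended and is unlikely to interfere with the four test requests.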

app.py:

from flask import Flask, request, abort

app = Flask(__name__)


@app.route('/api')
def hello_world():
    # Read the query parameter "crash"; "false" means respond normally.
    crash = request.args.get('crash')
    if crash == "false":
        return 'Hello World!'

    # Otherwise abort with the status code given in the "status" query
    # parameter (defaults to 500 when the parameter is missing or invalid).
    status = request.args.get('status', default=500, type=int)
    abort(status)


if __name__ == '__main__':
    app.run(host='0.0.0.0', port=8081)

requirements.txt:

Flask==2.0.1

Dockerfile:

# Use the official Python image from the Docker Hub
FROM python:3.9-slim

# Set the working directory in the container
WORKDIR /app

# Copy the current directory contents into the container at /app
COPY . /app

# Install Flask
RUN pip install flask

# Make port 8081 available to the world outside this container
EXPOSE 8081

# Define environment variable
ENV FLASK_APP=app.py

# Run app.py when the container launches
CMD ["flask", "run", "--host=0.0.0.0", "--port=8081"]

Commands used (how did you start the software?):

docker build -t flask-app . && docker run -p 8081:8081 flask-app
krakend run -c krakend.json
curl -v "localhost:8080/api?crash=true&status=500"
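For repeating the request several times in a row, the `curl` call above can be scripted. This is a minimal sketch using only the Python standard library; the function name `probe` and the default of four attempts are illustrative choices, not part of the original report:

```python
import urllib.error
import urllib.request


def probe(url, attempts=4):
    """Return the HTTP status code of each attempt (None if no response)."""
    codes = []
    for _ in range(attempts):
        try:
            with urllib.request.urlopen(url, timeout=5) as resp:
                codes.append(resp.status)
        except urllib.error.HTTPError as err:
            # Non-2xx responses raise HTTPError; the code is still available.
            codes.append(err.code)
        except OSError:
            # Connection refused, timeout, or similar transport failure.
            codes.append(None)
    return codes
```

Calling `probe("http://localhost:8080/api?crash=true&status=500")` against the running gateway returns the list of status codes it answered with, which makes it easy to see whether the responses change once the circuit breaker is expected to open.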

Expected behavior: After the third curl within 60 seconds, the circuit breaker should open.
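The expectation can be illustrated with a toy model. This is a simplified sketch, not KrakenD's implementation: it assumes the breaker opens once the number of consecutive errors exceeds `max_errors`, it ignores the `interval` window and any half-open state, and the names `SimpleCircuitBreaker` and `simulate` are hypothetical.

```python
import time


class SimpleCircuitBreaker:
    """Toy model: opens after more than `max_errors` consecutive errors."""

    def __init__(self, max_errors, timeout, clock=time.monotonic):
        self.max_errors = max_errors
        self.timeout = timeout  # seconds the circuit stays open
        self.clock = clock
        self.consecutive_errors = 0
        self.opened_at = None

    def is_open(self):
        if self.opened_at is None:
            return False
        if self.clock() - self.opened_at >= self.timeout:
            # Timeout elapsed: close the circuit again (half-open omitted).
            self.opened_at = None
            self.consecutive_errors = 0
            return False
        return True

    def record(self, is_error):
        """Register the outcome of one backend call."""
        if is_error:
            self.consecutive_errors += 1
            if self.consecutive_errors > self.max_errors:
                self.opened_at = self.clock()
        else:
            self.consecutive_errors = 0


def simulate(outcomes, max_errors=2, timeout=60):
    """Return the breaker state (True = open) after each call outcome."""
    cb = SimpleCircuitBreaker(max_errors, timeout)
    states = []
    for is_error in outcomes:
        cb.record(is_error)
        states.append(cb.is_open())
    return states


print(simulate([True, True, True]))     # [False, False, True]
print(simulate([False, False, False]))  # [False, False, False]
```

Under these assumptions, three calls counted as errors open the circuit on the third call, while calls that are never counted as errors leave it closed no matter how many are made. The second case matches the behaviour observed in this report when the backend answers with HTTP 500.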

Logs

krakend run -c krakend.json
Parsing configuration file: krakend.json
2024/07/02 15:52:28 KRAKEND INFO: Starting KrakenD v2.6.3
2024/07/02 15:52:28 KRAKEND INFO: Working directory is /Users/friedricht/MaibornWolff/BMW SRPLM Customer Touchpoints/testkrakend
2024/07/02 15:52:28 KRAKEND INFO: Starting the KrakenD instance
2024/07/02 15:52:28 KRAKEND DEBUG: [ENDPOINT: /api] Building the proxy pipe
2024/07/02 15:52:28 KRAKEND DEBUG: [BACKEND: /api] Building the backend pipe
2024/07/02 15:52:28 KRAKEND DEBUG: [BACKEND: /api][CB] Creating the circuit breaker named 'version-endpoint'
2024/07/02 15:52:28 KRAKEND DEBUG: [ENDPOINT: /api] Building the http handler
2024/07/02 15:52:28 KRAKEND DEBUG: [ENDPOINT: /api][JWTSigner] Signer disabled
2024/07/02 15:52:28 KRAKEND DEBUG: [ENDPOINT: /api][Ratelimit] Rate limit enabled. MaxRate: 1.666667, Capacity: 100
2024/07/02 15:52:28 KRAKEND DEBUG: [ENDPOINT: /api][Ratelimit] IP-based rate limit enabled. MaxRate: 0.333333, Capacity: 100
2024/07/02 15:52:28 KRAKEND INFO: [ENDPOINT: /api][JWTValidator] Validator disabled for this endpoint
2024/07/02 15:52:28 KRAKEND INFO: [SERVICE: Gin] Listening on port: 8080
[GIN] 2024/07/02 - 15:52:30 | 500 | 6.345166ms | ::1 | GET "/api?crash=true&status=500"
[GIN] 2024/07/02 - 15:52:31 | 500 | 4.436083ms | ::1 | GET "/api?crash=true&status=500"
[GIN] 2024/07/02 - 15:52:32 | 500 | 3.456792ms | ::1 | GET "/api?crash=true&status=500"
[GIN] 2024/07/02 - 15:52:33 | 500 | 4.512625ms | ::1 | GET "/api?crash=true&status=500"
2024/07/02 15:52:33 KRAKEND DEBUG: [SERVICE: Telemetry] Registering usage stats for Cluster ID bYnAQYy7eLHKdzKDBJOeMCsbB631Syxkmju3Vm4qxpY=

Additional context: The circuit breaker itself can be shown to work by killing the Python backend. Oddly, the resulting connection error does trigger the circuit breaker.

kpacha commented 2 weeks ago

Please check this issue: https://github.com/krakend/krakend-circuitbreaker/issues/12

github-actions[bot] commented 2 weeks ago

An issue like this already exists, please follow it in the other thread


This is an automated comment. Responding to the bot or mentioning it won't have any effect

alombarte commented 2 weeks ago

The documentation has been adjusted for the next release to make this clearer.