Canner / WrenAI

🚀 Open-source SQL AI Agent for Text-to-SQL. Make Text2SQL Easy! 🙌
https://getwren.ai/oss
GNU Affero General Public License v3.0

feat(wren-ai-service): kubernetes session affinity for ai service #613

Closed · paopa closed this 1 month ago

paopa commented 1 month ago

This PR adds session affinity settings to the Kubernetes deployment of the AI service. Because the AI service is stateful, running more than one replica can increase "query not found" errors: a result may be requested from a pod other than the one that handled the original query. To address this, we use Istio to manage session affinity so that requests from the same session are routed to the same pod.
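As a rough illustration of header-based session affinity in Istio, the sketch below prints a `DestinationRule` that uses consistent hashing on a request header. It is not taken from this PR's manifests: the function name `render_destination_rule`, the resource name `wren-ai-service-affinity`, and the `networking.istio.io/v1beta1` API version are assumptions. The host, namespace, and `user-session` header are borrowed from the validation steps in this description.

```shell
# Print an Istio DestinationRule that pins requests to a pod by hashing
# the value of a request header (default: user-session).
# ASSUMPTION: resource name and API version are illustrative, not from the PR.
render_destination_rule() {
  header="${1:-user-session}"
  cat <<EOF
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: wren-ai-service-affinity
  namespace: wren
spec:
  host: wren-ai-service-svc
  trafficPolicy:
    loadBalancer:
      consistentHash:
        httpHeaderName: ${header}
EOF
}
```

To try it, pipe the output to the cluster with `render_destination_rule | kubectl apply -f -`. Consistent hashing of this kind is applied by the caller's Istio proxy, so it should only affect clients whose traffic passes through a sidecar or gateway.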

How to validate?

  1. Scale the service up to two replicas on the Kubernetes cluster:

    kubectl scale deploy wren-ai-service-deployment --replicas=2
  2. Create a test pod:

    echo 'apiVersion: v1
    kind: Pod
    metadata:
      name: test-pod
      namespace: wren
    spec:
      containers:
      - name: curl-container
        image: curlimages/curl:7.85.0
        command: ["/bin/sh", "-c", "while true; do sleep 3600; done"]  
      restartPolicy: Never
    ' | kubectl apply -f -
  3. Access the test pod:

    kubectl exec -it test-pod -n wren -- /bin/sh
  4. Call the health check endpoint with a `user-session` header, then check the pod logs to see which pod served the request:

    curl -H 'user-session: sid-1' wren-ai-service-svc:5555/health
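A single request does not show much about affinity, so it may help to repeat the call several times per session ID. The helper below is a minimal sketch meant to be pasted into the test pod's shell; the function name `send_requests` is hypothetical, and it assumes the service records each `/health` request in its pod log.

```shell
# Send COUNT requests with the same user-session header and print the
# HTTP status code of each one. Written for POSIX sh (the curl image's shell).
# ASSUMPTION: send_requests is an illustrative helper, not part of the PR.
send_requests() {
  sid="$1"
  count="$2"
  i=0
  while [ "$i" -lt "$count" ]; do
    curl -s -o /dev/null -w '%{http_code}\n' \
      -H "user-session: $sid" wren-ai-service-svc:5555/health
    i=$((i + 1))
  done
}
```

For example, run `send_requests sid-1 5` and then `send_requests sid-2 5`. If affinity is working, all requests for a given session ID should appear in the log of one pod, although two different session IDs can still hash to the same pod.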
