sweepai / sweep

Sweep: open-source AI-powered Software Developer for small features and bug fixes.
https://sweep.dev

Sweep: Connection closed by server (Redis). #3837

Closed. kevinlu1248 closed this issue 5 months ago.

kevinlu1248 commented 5 months ago

sweepai/core/vector_db.py in openai_with_expo_backoff at line 207

        return openai_call_embedding(batch)
    # check cache first
    embeddings = [None] * len(batch)
    cache_keys = [hash_sha256(text) + CACHE_VERSION for text in batch]
    try:
        for i, cache_value in enumerate(redis_client.mget(cache_keys)):
            if cache_value:
                embeddings[i] = np.array(json.loads(cache_value))
    except Exception as e:
        logger.exception(e)
    # not stored in cache call openai

Make this more reliable by adding longer timeouts and backoff for Redis queries.
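One way to read this request is a small retry wrapper around the cache read. The following is only a minimal sketch under stated assumptions, not the code from the repository: `mget_with_backoff`, its parameters, and the default delays are hypothetical names chosen for illustration, and `client` is any object exposing `mget(keys)`.

```python
import time
from typing import Callable, Sequence, Tuple, Type


def mget_with_backoff(
    client,
    keys: Sequence[str],
    retry_on: Tuple[Type[BaseException], ...] = (ConnectionError, TimeoutError),
    max_tries: int = 5,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> list:
    """Call client.mget(keys), retrying with exponential backoff.

    Hypothetical helper: if every attempt fails, it returns a list of None
    (all cache misses) so the caller can fall back to computing embeddings.
    """
    for attempt in range(max_tries):
        try:
            return list(client.mget(keys))
        except retry_on:
            if attempt == max_tries - 1:
                break  # out of attempts, treat everything as a cache miss
            # Delays grow as base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** attempt)
    return [None] * len(keys)
```

The defaults use Python's built-in `ConnectionError` and `TimeoutError`. With redis-py, the exceptions of the same names in `redis.exceptions` are separate classes, so they would need to be passed explicitly through `retry_on`. The "Connection closed by server" error in the title is a connection error, so retrying on timeouts alone would probably not cover it.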

sweep-nightly[bot] commented 5 months ago

🚀 Here's the PR! #3841


Relevant files (mentioned files will always appear here):
https://github.com/sweepai/sweep/blob/ca2e86cca5392b99d8c8637e383a2d08cd40b082/sweepai/core/vector_db.py#L1-L258
https://github.com/sweepai/sweep/blob/ca2e86cca5392b99d8c8637e383a2d08cd40b082/sweepai/config/server.py#L1-L209

Step 2: ⌨️ Coding

sweepai/core/vector_db.py

Add the necessary imports for the backoff and Redis timeout functionality.
--- 
+++ 
@@ -9,6 +9,7 @@
 import requests
 from loguru import logger
 from redis import Redis
+from redis.exceptions import TimeoutError
 from tqdm import tqdm
 import voyageai
 import boto3

sweepai/core/vector_db.py

Modify the `openai_with_expo_backoff` function to add a timeout to the Redis query and wrap it with the backoff decorator.
--- 
+++ 
@@ -1,11 +1,21 @@
     cache_keys = [hash_sha256(text) + CACHE_VERSION for text in batch]
-    try:
-        for i, cache_value in enumerate(redis_client.mget(cache_keys)):
-            if cache_value:
-                embeddings[i] = np.array(json.loads(cache_value))
-    except Exception as e:
-        logger.exception(e)
-    # not stored in cache call openai
+
+    @backoff.on_exception(backoff.expo, TimeoutError, max_tries=5)
+    def get_cached_embeddings():
+        try:
+            cache_values = redis_client.mget(cache_keys, timeout=5)
+            for i, cache_value in enumerate(cache_values):
+                if cache_value:
+                    embeddings[i] = np.array(json.loads(cache_value))
+        except TimeoutError:
+            logger.warning("Redis query timed out, retrying...")
+            raise
+        except Exception as e:
+            logger.exception(e)
+
+    get_cached_embeddings()
+
+    # not stored in cache, call openai
     batch = [
         text for i, text in enumerate(batch) if embeddings[i] is None
     ]  # remove all the cached values from the batch
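
A caveat on the diff above: as far as I know, redis-py's `mget` is declared as `mget(keys, *args)` and takes no `timeout` keyword, so `redis_client.mget(cache_keys, timeout=5)` would raise a `TypeError`. The broad `except Exception` would then log it and every lookup would be treated as a cache miss. The sketch below reproduces that behavior with a stand-in client. `FakeRedis` and `lookup` are hypothetical names, and plain lists replace the NumPy arrays to keep the example self-contained.

```python
import json


class FakeRedis:
    """Stand-in client; the mget signature is assumed to mirror redis-py's."""

    def __init__(self, store):
        self.store = store

    def mget(self, keys, *args):
        return [self.store.get(key) for key in keys]


def lookup(client, cache_keys, **mget_kwargs):
    """Same shape as the cache lookup in the diff, including the broad except."""
    embeddings = [None] * len(cache_keys)
    errors = []
    try:
        for i, cache_value in enumerate(client.mget(cache_keys, **mget_kwargs)):
            if cache_value:
                embeddings[i] = json.loads(cache_value)
    except Exception as e:  # also swallows the TypeError from the bad keyword
        errors.append(type(e).__name__)
    return embeddings, errors


client = FakeRedis({"k1": "[0.1, 0.2]"})
print(lookup(client, ["k1", "k2"]))             # ([[0.1, 0.2], None], [])
print(lookup(client, ["k1", "k2"], timeout=5))  # ([None, None], ['TypeError'])
```

In redis-py, timeouts are normally configured on the client instead, for example through the `socket_timeout` and `socket_connect_timeout` arguments when the client is constructed. Whether that fits this codebase depends on how `redis_client` is created, which is not shown here.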

Step 3: 🔄️ Validating

Your changes have been successfully made to the branch sweep/sweep_connection_closed_by_server_redis. I have validated these changes using a syntax checker and a linter.


Tip: To recreate the pull request, edit the issue title or description.

This is an automated message generated by Sweep AI.