Open elahea2020 opened 1 month ago
Hey @elahea2020! I'm here to help you with any bugs, questions, or contributions. If you need assistance, feel free to ask!
The warning you're encountering is related to the retry mechanism in the llama_index.embeddings.openai.base.get_embeddings method. This method uses retry with backoff to handle transient connection errors: it retries up to 6 times, waiting a random exponential interval between 4 and 10 seconds between attempts, and gives up once 60 seconds have elapsed [1][2].
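As a rough, stdlib-only sketch of that retry policy (illustrative only — `retry_with_backoff` is not the actual LlamaIndex internal; the `sleep` parameter is added here so the waits can be stubbed out):

```python
import random
import time


def retry_with_backoff(func, max_retries=6, min_wait=4.0, max_wait=10.0,
                       max_elapsed=60.0, sleep=time.sleep):
    """Call func(), retrying on ConnectionError with random exponential backoff.

    Approximates the policy described above: up to max_retries attempts,
    each wait drawn from an exponentially growing window clamped to
    [min_wait, max_wait], giving up once max_elapsed seconds have passed.
    """
    start = time.monotonic()
    for attempt in range(1, max_retries + 1):
        try:
            return func()
        except ConnectionError:
            if attempt == max_retries or time.monotonic() - start >= max_elapsed:
                raise
            # Exponential window, clamped to [min_wait, max_wait], with jitter.
            ceiling = min(max_wait, min_wait * (2 ** (attempt - 1)))
            sleep(random.uniform(min_wait, ceiling))
```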
To mitigate this issue, you can implement a more robust retry mechanism with exponential backoff in your script. Here's an example of how you can modify your script to include retries with exponential backoff:
```python
import logging
import threading
import time
from functools import wraps

import requests
from requests.exceptions import RequestException

logger = logging.getLogger(__name__)


def rate_limit(*, allowed_per_second: int):
    """Decorator that spaces calls at least 1/allowed_per_second seconds apart."""
    max_period = 1.0 / allowed_per_second
    last_call = [time.perf_counter()]
    lock = threading.Lock()

    def decorate(func):
        @wraps(func)
        def limit(*args, **kwargs):
            with lock:
                elapsed = time.perf_counter() - last_call[0]
                hold = max_period - elapsed
                if hold > 0:
                    time.sleep(hold)
                result = func(*args, **kwargs)
                last_call[0] = time.perf_counter()
                return result

        return limit

    return decorate


@rate_limit(allowed_per_second=15)
def rate_limited_get(url, headers, retries=5, backoff_factor=0.3):
    """
    Make a GET request with rate limiting and retry with exponential backoff.
    """
    for attempt in range(retries):
        try:
            resp = requests.get(url, headers=headers, timeout=30)
            if resp.status_code == 429:
                # Back off hard when rate limited; continuing the loop keeps
                # the remaining attempt budget instead of recursing with
                # the default arguments.
                logger.warning("Rate limited, sleeping for 5 minutes")
                time.sleep(300)
                continue
            resp.raise_for_status()
            return resp
        except RequestException as e:
            logger.warning(f"Request failed: {e}, attempt {attempt + 1} of {retries}")
            if attempt < retries - 1:
                sleep_time = backoff_factor * (2 ** attempt)
                logger.info(f"Retrying in {sleep_time} seconds...")
                time.sleep(sleep_time)
            else:
                logger.error("Max retries reached. Raising exception.")
                raise
```
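To see the rate limiter working on its own, here is a self-contained check (the decorator is repeated so the snippet runs standalone; `ping` is just an illustration, and `allowed_per_second=5` keeps the demo fast):

```python
import threading
import time
from functools import wraps


def rate_limit(*, allowed_per_second: int):
    """Decorator that spaces calls at least 1/allowed_per_second seconds apart."""
    max_period = 1.0 / allowed_per_second
    last_call = [time.perf_counter()]
    lock = threading.Lock()

    def decorate(func):
        @wraps(func)
        def limit(*args, **kwargs):
            with lock:
                elapsed = time.perf_counter() - last_call[0]
                hold = max_period - elapsed
                if hold > 0:
                    time.sleep(hold)
                result = func(*args, **kwargs)
                last_call[0] = time.perf_counter()
                return result

        return limit

    return decorate


@rate_limit(allowed_per_second=5)
def ping():
    return time.perf_counter()


# Back-to-back calls are forced at least ~0.2 s apart by the decorator.
t1, t2, t3 = ping(), ping(), ping()
```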
Additionally, you can enable detailed logging to gain more insights into the cause of the intermittent connection errors. Here's how you can set up logging in your application:
```python
import logging
import sys

# basicConfig already attaches a stdout handler; adding another would
# duplicate every log line.
logging.basicConfig(stream=sys.stdout, level=logging.DEBUG)
```
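If global DEBUG output is too noisy, you can instead target just the logger that emits the retry warning (the logger name is taken from the warning message in the report; the format string is only a suggestion):

```python
import logging
import sys

# The retry warning in the report is emitted by this logger.
retry_logger = logging.getLogger("llama_index.embeddings.openai.utils")
retry_logger.setLevel(logging.DEBUG)

handler = logging.StreamHandler(stream=sys.stdout)
handler.setFormatter(logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s"))
retry_logger.addHandler(handler)
```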
For more advanced monitoring, you can use callback handlers provided by LlamaIndex, such as LlamaDebugHandler or LangfuseCallbackHandler, to track events and traces [3][4][5].
These steps should help you handle the intermittent APIConnectionError warnings more effectively.
To continue talking to Dosu, mention @dosu.
Bug Description
I am running the script below as part of a bigger project, and every once in a while I get this warning:
WARNING:llama_index.embeddings.openai.utils:Retrying llama_index.embeddings.openai.base.get_embeddings in 0.7360705661705746 seconds as it raised APIConnectionError: Connection error..
It happens after around 2k calls, but not consistently. I would appreciate your help with this.