GabrielEValenzuela / chatML

A web API exposing a neural network to detect duplicate entities in knowledge graphs. It uses API key authentication and rate-limits requests by client tier (FREEMIUM, PREMIUM).
MIT License

Implement request rate limiting based on account type #9

Open GabrielEValenzuela opened 4 weeks ago

GabrielEValenzuela commented 4 weeks ago


Develop a rate-limiting mechanism to control the number of requests each user can make based on their account type. FREEMIUM accounts should be limited to 5 requests per minute (RPM), while PREMIUM accounts are allowed up to 50 RPM. If the limit is exceeded, the system should reject additional requests and respond with an HTTP 429 Too Many Requests status until the rate limit resets.

User Stories


Example Usage and Responses

Implementation Steps

  1. Configure Rate Limiting Rules:

    • Set FREEMIUM limit to 5 RPM and PREMIUM limit to 50 RPM.
    • Store the limits in a configuration file or environment variable for easy adjustment.
  2. Implement Rate Limiter Using Redis:

    • Use Redis to store user request counts and timestamps, expiring entries after one minute to reset the count.
    • For each request, check the user’s current count in Redis.
      • If the count is below the limit, increment it and allow the request.
      • If the count meets the limit, return a 429 Too Many Requests response.
  3. Apply Rate Limiting Middleware:

    • Implement a FastAPI middleware or dependency that enforces rate limiting on all authenticated endpoints, in particular /predict-similarity.
    • Use dependency injection to check user account type (FREEMIUM or PREMIUM) and enforce the corresponding rate limit.
  4. Handle Responses and Errors:

    • If a request is blocked due to rate limits, respond with:
      • HTTP 429 Too Many Requests
      • JSON message explaining the limit has been reached and advising the user to wait.
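
The steps above can be sketched without Redis or FastAPI to make the intended behavior concrete. This is only an illustration, not the project's implementation: the `RATE_LIMIT_<TIER>` environment variable names, the `FixedWindowLimiter` class, and the `retry_after_seconds` field are all hypothetical, and an in-memory dictionary stands in for the Redis counter.

```python
import math
import os
import time

DEFAULT_LIMITS = {"FREEMIUM": 5, "PREMIUM": 50}  # RPM values from this issue


def load_rate_limits(env=os.environ):
    """Read per-tier limits from hypothetical RATE_LIMIT_<TIER> variables."""
    return {
        tier: int(env.get(f"RATE_LIMIT_{tier}", default))
        for tier, default in DEFAULT_LIMITS.items()
    }


class FixedWindowLimiter:
    """In-memory stand-in for the Redis counter: one count per user per window."""

    def __init__(self, limits, window_seconds=60, clock=time.monotonic):
        self.limits = limits
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the window can be tested without sleeping
        self._windows = {}  # user_id -> (window_start, count)

    def check(self, user_id, account_type):
        """Return (status_code, body): 200 if allowed, 429 once the limit is reached."""
        now = self.clock()
        start, count = self._windows.get(user_id, (now, 0))
        if now - start >= self.window_seconds:
            start, count = now, 0  # window expired: reset the count
        if count >= self.limits[account_type]:
            retry_after = math.ceil(self.window_seconds - (now - start))
            return 429, {
                "detail": "Rate limit exceeded. Please wait before making additional requests.",
                "retry_after_seconds": retry_after,  # hypothetical field
            }
        self._windows[user_id] = (start, count + 1)
        return 200, {"detail": "OK"}
```

With the default limits, a FREEMIUM user's sixth call to `check` within one minute returns status 429, and calls are accepted again once the 60-second window has elapsed.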

Code Mockup

Here’s a simplified example using Redis to track requests.

from datetime import timedelta

from fastapi import APIRouter, HTTPException
from redis import Redis

# Redis configuration
redis_client = Redis(host='localhost', port=6379, db=0)

router = APIRouter()

# Rate limits in requests per minute (RPM)
RATE_LIMITS = {
    "FREEMIUM": 5,
    "PREMIUM": 50
}

# Rate limit function
def rate_limit(user_id: str, account_type: str):
    key = f"rate_limit:{user_id}"
    requests = redis_client.get(key)

    if requests is None:
        # Set initial count if not already present; the key expires after one minute
        redis_client.setex(key, timedelta(minutes=1), 1)
    elif int(requests) < RATE_LIMITS[account_type]:
        # Increment request count
        redis_client.incr(key)
    else:
        # Limit exceeded
        raise HTTPException(
            status_code=429, detail="Rate limit exceeded. Please wait before making additional requests."
        )

# Example usage in endpoint
# The path comes from the steps above; the POST method is an assumption
@router.post("/predict-similarity")
async def predict_similarity_endpoint(user_id: str, account_type: str):
    rate_limit(user_id, account_type)
    # Logic for similarity prediction here
    return {"message": "Prediction result"}
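
One thing to watch in the mockup: reading the count and then writing it takes two separate Redis calls, so two concurrent requests could both see a count below the limit. A possible variant, sketched here as an assumption rather than the planned design, relies on Redis's atomic INCR and sets the expiry only when the key is first created. The function name `check_rate_limit` and its parameters are hypothetical; `client` is expected to behave like a redis-py `Redis` instance (only `incr` and `expire` are used).

```python
def check_rate_limit(client, user_id, limit, window_seconds=60):
    """Return True if the request is allowed, False if the limit is exceeded.

    `client` is assumed to expose redis-py style `incr` and `expire` methods.
    """
    key = f"rate_limit:{user_id}"
    # INCR is atomic and creates the key with value 1 if it does not exist
    count = client.incr(key)
    if count == 1:
        # First request in this window: start the expiry timer
        client.expire(key, window_seconds)
    return count <= limit
```

In an endpoint this could be called as `check_rate_limit(redis_client, user_id, RATE_LIMITS[account_type])`, raising the 429 error when it returns False. The sketch still has a gap: if the process stops between `incr` and `expire`, the key never expires, so a production version would likely wrap both commands in a Lua script or a transaction.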

Edge Cases