unclecode / crawl4ai

🔥🕷️ Crawl4AI: Open-source LLM Friendly Web Crawler & Scraper
Apache License 2.0

Using FastAPI for Crawl4AI in a production environment, handling up to 50 concurrent requests. #188

Open · YassKhazzan opened this issue 2 days ago

YassKhazzan commented 2 days ago

Hello, and thank you for building this amazing library.

I'm using crawl4ai in a production environment, handling up to 50 concurrent requests in a FastAPI application. The problem I have is memory usage. I'm building with Docker, and this is my Dockerfile:

FROM python:3.12-slim

WORKDIR /workspace
ENV HOME=/workspace

ADD . /workspace

RUN pip install -r requirements.txt

RUN playwright install chromium
RUN playwright install-deps

EXPOSE 8585

CMD ["gunicorn", "main:app", \
     "--workers", "8", \
     "--worker-class", "uvicorn.workers.UvicornWorker", \
     "--bind", "0.0.0.0:8585", \
     "--timeout", "120", \
     "--keep-alive", "5", \
     "--max-requests", "500", \
     "--max-requests-jitter", "50", \
     "--log-level", "info", \
     "--access-logfile", "-"]

I tried two methods for handling crawl4ai. The first uses the FastAPI lifespan, where I create a global crawler:

from contextlib import asynccontextmanager
import asyncio

from fastapi import FastAPI
from crawl4ai import AsyncWebCrawler

# Global AsyncWebCrawler instance shared by all requests in this worker
crawler = None

@asynccontextmanager
async def lifespan(app_start: FastAPI):
    # Startup: create and initialize the AsyncWebCrawler
    global crawler
    crawler = AsyncWebCrawler(verbose=False, always_by_pass_cache=True)
    await crawler.__aenter__()
    yield
    # Shutdown: close the crawler and release the browser
    if crawler:
        await crawler.__aexit__(None, None, None)

app = FastAPI(lifespan=lifespan)

# Limit how many pages are crawled at the same time
scraping_semaphore = asyncio.Semaphore(10)

With this approach, memory usage keeps increasing indefinitely, even with the semaphore capped at 10, and the server needs a reboot every three days to keep running smoothly.
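
To clarify how the semaphore fits in, here is a simplified sketch of the crawl call under this approach (the real handler also skips PDFs and catches errors):

# Simplified sketch: each crawl acquires the semaphore before using the
# shared crawler, so at most 10 pages are fetched at the same time.
async def crawl_with_limit(url: str) -> str:
    async with scraping_semaphore:
        result = await crawler.arun(url=url, verbose=False, bypass_cache=True)
        return result.markdown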


Alternatively, I’ve tried using the crawler without a global instance. With this approach, I experience memory spikes, but they eventually return to normal. Additionally, with 10 concurrent requests running on a server with 4 vCPUs and 16 GB of RAM, the response time averages around 20 seconds.

@app.post("/crawl_urls")
async def crawl_urls(request: ScrapeRequest):
    try:
        #print(f"Received {request.urls} urls to scrape")
        if not request.urls:
            return []
        tasks = [process_url(url) for url in request.urls]
        results = await asyncio.gather(*tasks)
        return results
    except Exception as e:
        #print(f"Error in scrape_urls: {e}")
        return []

async def process_url(url):
    try:
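        # is_pdf() is a helper defined elsewhere that detects PDF links so they can be skipped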
        if await is_pdf(url):
            return ''
        #start_time = time.time()
        result = await crawl_url(url)
        return result

    except Exception as e:
        #print(f"Error processing {url}: {e}")
        return ''

async def crawl_url(url):
    try:
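        # A new crawler (and browser session) is created for every URL in this approach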
        async with AsyncWebCrawler(verbose=False, always_by_pass_cache=True) as crawler:
            result = await crawler.arun(url=url, verbose=False, bypass_cache=True)
            #print(result.markdown)
            return result.markdown
    except Exception as e:
        print(f"error in crawl4ai {e}")
        return ''

# I'm bypassing the cache to test concurrent requests

I’m not sure if there are specific settings I can adjust to improve performance and reduce memory usage. Any advice on optimizing this setup would be greatly appreciated.

P.S.: I also tried using arun_many, but it didn’t result in any performance improvement.
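
For reference, the batch attempt looked roughly like this (from memory, so the exact parameters may differ):

# Rough sketch of the arun_many attempt: one crawler fetches the whole batch
# of URLs in a single call instead of one arun() per URL.
async def crawl_batch(urls: list[str]) -> list[str]:
    async with AsyncWebCrawler(verbose=False, always_by_pass_cache=True) as crawler:
        results = await crawler.arun_many(urls=urls, bypass_cache=True)
        return [r.markdown for r in results]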

gsogol commented 1 day ago

Similar. Would be interested in a solution