voltrondata / sqlflite

An example Flight SQL Server implementation - with DuckDB and SQLite back-ends.
Apache License 2.0
189 stars 21 forks source link

Memory Handling Issue in DuckDB flight-sql-server #39

Open rohitprg opened 2 months ago

rohitprg commented 2 months ago

DuckDB flight-sql-server appears to have an issue with its memory management, as it does not seem to reset its memory usage between each query, causing a continuous increase in memory consumption.

This issue surfaced during performance benchmarking when reading parquet files from Azure blob storage was the primary function. The benchmarking was performed using a Python script which utilized the Locust framework for load testing. This script replicated user behavior by running a sequence of SQL queries and noting the response time for each.

The problem lies in DuckDB flight-sql-server apparent failure to clear its memory usage between these queries, resulting in a steady build-up of memory usage with each query executed.

The series of tests started with 10 users and gradually escalated to 300 concurrent users. Throughout these tests, an Out of Memory error was consistently encountered. This persistent increase in memory usage indicates that DuckDB flight-sql-server is not resetting its memory use between queries. Even after a prolonged period of inactivity, exceeding eight hours, with no load being sent, the memory usage remained the same and did not decrease.

image


import os
import random
import logging
from time import time
from locust import TaskSet, task, between
from locust import HttpUser, events
from adbc_driver_flightsql import dbapi as flight_sql, DatabaseOptions

class FlightUserBehavior(TaskSet):

    # Get parameters from environment variables
    flight_password = os.environ['FLIGHT_PASSWORD']
    flight_username = os.environ['FLIGHT_USERNAME']

    def on_start(self):
        logging.debug('inside on start')
        self.uri = "grpc+tcp://duckdbflightsqlserver:31337"
        self.db_kwargs = {
            "username": self.flight_username,
            "password": self.flight_password,
            DatabaseOptions.TLS_SKIP_VERIFY.value: "false"  
        }
        # Close the old connection if there is one
        if hasattr(self, "conn"):
            self.conn.close()
        # Create a new connection
        self.conn = flight_sql.connect(uri=self.uri, db_kwargs=self.db_kwargs)

    def on_stop(self):
        self.conn.close()

    @task
    def execute_queries(self):

        queries = [

           # Queries for Reading Parquet Files from Azure Blob Storage

        ]

        # Randomize the order
        random.shuffle(queries)

        for query in queries:
            cur = self.conn.cursor()
            start_time = time()
            try:
                cur.execute(query)
                total_time = int((time() - start_time) * 1000)  # get time in milliseconds
                events.request.fire(request_type="Execute", name=query, response_time=total_time, response_length=0,
                                    context={}, exception=None)
            except Exception as e:
                total_time = int((time() - start_time) * 1000)  # get time in milliseconds
                logging.error(f'Failed to execute query: {e}')  # Log the error with stack trace
                events.request.fire(request_type="Execute", name=query, response_time=total_time, response_length=0,
                                    context={}, exception=e)
            finally:
                cur.close()

class FlightUser(HttpUser):
    tasks = [FlightUserBehavior]
    wait_time = between(1, 2)
prmoore77 commented 2 months ago

hi @rohitprg - I've observed this with DuckDB as well (I think). Could you retry your test with just DuckDB (no Flight SQL) to see if the behavior is similar?

rohitprg commented 2 months ago

Thanks @prmoore77 - I will perform my tests on DuckDB and update you with the results. Do you know of any potential solutions to this issue?