druid-io / pydruid

A Python connector for Druid

[db-api] Performance improvements #144

Closed john-bodley closed 5 years ago

john-bodley commented 5 years ago

This PR improves the performance of fetchall and fetchmany by iterating over the result set directly rather than through the cursor's own iterator. The following benchmark,

import timeit

from pydruid.db.api import Cursor

# Stub out execute() so the benchmark measures only the fetch path, not a network call.
def execute(self, operation, parameters=None):
    self._results = iter(range(10000))

Cursor.execute = execute

def test():
    cursor = Cursor(url=None)
    cursor.execute(operation=None)
    cursor.fetchall()  # drain all 10,000 rows

# Total time for 1,000 runs of test().
print(timeit.timeit("test()", number=1000, setup="from __main__ import test"))

resulted in an execution time of 6.698s and 0.331s for the current and proposed solutions respectively, i.e., the proposed solution is about 20x faster.
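
To illustrate where the difference likely comes from, here is a minimal, self-contained sketch. ToyCursor and its methods fetchall_via_cursor and fetchall_direct are hypothetical names made up for this example, not pydruid's actual implementation. The sketch assumes the slower path routes every row through the cursor's __next__ and fetchone, while the faster path hands the underlying iterator straight to list().

```python
import timeit


class ToyCursor:
    """Hypothetical stand-in for a DB-API cursor; not pydruid's actual class."""

    def __init__(self, rows):
        self._results = iter(rows)

    def fetchone(self):
        # Return the next row, or None once the result set is exhausted.
        try:
            return next(self._results)
        except StopIteration:
            return None

    def __iter__(self):
        return self

    def __next__(self):
        # Each row costs extra Python-level method calls through fetchone().
        row = self.fetchone()
        if row is None:
            raise StopIteration
        return row

    def fetchall_via_cursor(self):
        # Assumed "current" approach: iterate over the cursor itself.
        return list(self)

    def fetchall_direct(self):
        # Assumed "proposed" approach: consume the underlying result iterator.
        return list(self._results)


def run(method_name, n=10000):
    # Build a fresh cursor each time, since fetching exhausts the iterator.
    cursor = ToyCursor(range(n))
    return getattr(cursor, method_name)()


if __name__ == "__main__":
    for name in ("fetchall_via_cursor", "fetchall_direct"):
        elapsed = timeit.timeit(lambda: run(name), number=100)
        print(f"{name}: {elapsed:.3f}s")
```

Both methods return the same rows; only the per-row overhead differs. Absolute timings depend on the machine and Python version, so the ratio printed here will not necessarily match the figures quoted above.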

to: @betodealmeida @mistercrunch

mistercrunch commented 5 years ago

[GIF reaction, hosted on Tenor]