andialbrecht / sqlparse

A non-validating SQL parser module for Python
BSD 3-Clause "New" or "Revised" License
3.73k stars 695 forks

apparent N^2 reindenting of lists #592

Open justinvanwinkle opened 3 years ago

justinvanwinkle commented 3 years ago

Reindenting queries that contain long lists can take a very long time. The runtime appears to grow roughly quadratically with the number of list items:

In [11]: %timeit sqlparse.format('SELECT foo FROM foo WHERE foo IN (' + ','.join(['1'] * 3) + ')', reindent=True)
799 µs ± 666 ns per loop (mean ± std. dev. of 7 runs, 1000 loops each)

In [12]: %timeit sqlparse.format('SELECT foo FROM foo WHERE foo IN (' + ','.join(['1'] * 10) + ')', reindent=True)
1.23 ms ± 2.63 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

In [13]: %timeit sqlparse.format('SELECT foo FROM foo WHERE foo IN (' + ','.join(['1'] * 100) + ')', reindent=True)
7.26 ms ± 16.6 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [14]: %timeit sqlparse.format('SELECT foo FROM foo WHERE foo IN (' + ','.join(['1'] * 1000) + ')', reindent=True)
170 ms ± 3.75 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [15]: %timeit sqlparse.format('SELECT foo FROM foo WHERE foo IN (' + ','.join(['1'] * 2000) + ')', reindent=True)
547 ms ± 14 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [16]: %timeit sqlparse.format('SELECT foo FROM foo WHERE foo IN (' + ','.join(['1'] * 3000) + ')', reindent=True)
1.2 s ± 41.2 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [17]: %timeit sqlparse.format('SELECT foo FROM foo WHERE foo IN (' + ','.join(['1'] * 4000) + ')', reindent=True)
1.95 s ± 4.15 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [18]: %timeit sqlparse.format('SELECT foo FROM foo WHERE foo IN (' + ','.join(['1'] * 5000) + ')', reindent=True)
2.97 s ± 3.89 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [19]: %timeit sqlparse.format('SELECT foo FROM foo WHERE foo IN (' + ','.join(['1'] * 6000) + ')', reindent=True)
4.33 s ± 135 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [20]: %timeit sqlparse.format('SELECT foo FROM foo WHERE foo IN (' + ','.join(['1'] * 7000) + ')', reindent=True)
5.78 s ± 135 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
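
To check the "apparent N^2" reading of these numbers, here is a minimal sketch that fits a straight line to log(time) against log(list length). The timings are copied from the output above. The helper name `loglog_slope` and the cutoff of 1000 items are illustrative choices, not part of sqlparse. It does not call sqlparse at all; it only analyses the reported measurements.

```python
import math

# Timings copied from the %timeit output above: (list length, seconds per call).
TIMINGS = [
    (3, 799e-6),
    (10, 1.23e-3),
    (100, 7.26e-3),
    (1000, 170e-3),
    (2000, 547e-3),
    (3000, 1.2),
    (4000, 1.95),
    (5000, 2.97),
    (6000, 4.33),
    (7000, 5.78),
]


def loglog_slope(points):
    """Least-squares slope of log(time) against log(n).

    For a runtime of the form t = c * n**k, this slope estimates k.
    """
    xs = [math.log(n) for n, _ in points]
    ys = [math.log(t) for _, t in points]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


# Keep only the large inputs, where fixed per-call overhead matters less.
# The cutoff of 1000 is an arbitrary choice for this sketch.
large = [(n, t) for n, t in TIMINGS if n >= 1000]
exponent = loglog_slope(large)
print(f"estimated exponent for n >= 1000: {exponent:.2f}")
```

For the measurements above, the fitted exponent comes out at about 1.8, which is well above linear and consistent with roughly quadratic growth once the list is large.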

AndrewGrossman commented 3 years ago

👍 Also encountering this problem