andialbrecht / sqlparse

A non-validating SQL parser module for Python
BSD 3-Clause "New" or "Revised" License

Compute TokenList.value dynamically (v2) #710

Open living180 opened 1 year ago

living180 commented 1 year ago

This PR supersedes #623. The meat of the PR is the same: fix the remaining portion of issue #621 by making TokenList.value a dynamically-computed property rather than an attribute. This avoids the quadratic runtime behavior that occurred due to recomputing TokenList.value each time TokenList.group_tokens() was called with extend=True.
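To illustrate the runtime difference described above, here is a minimal toy sketch. It is not sqlparse's actual code: `EagerList`, `LazyList`, `extend`, `chars_copied`, and `build` are hypothetical names made up for this example. The eager variant re-joins all of its tokens every time it is extended, which is roughly what happens when a cached value attribute is recomputed on each extension. The lazy variant only joins when `value` is read.

```python
class EagerList:
    """Toy token list that stores its string value as a plain attribute."""

    def __init__(self):
        self.tokens = []
        self.value = ""
        self.chars_copied = 0  # total characters copied by re-joins

    def extend(self, token):
        self.tokens.append(token)
        # Full re-join on every extension: n extensions cost O(n^2) overall.
        self.value = "".join(self.tokens)
        self.chars_copied += len(self.value)


class LazyList:
    """Toy token list whose string value is computed only when read."""

    def __init__(self):
        self.tokens = []

    @property
    def value(self):
        # Joined on demand; extending the list does no string work at all.
        return "".join(self.tokens)

    def extend(self, token):
        self.tokens.append(token)


def build(cls, n):
    """Extend a fresh list n times with a one-character token."""
    lst = cls()
    for _ in range(n):
        lst.extend("x")
    return lst


if __name__ == "__main__":
    eager = build(EagerList, 1000)
    lazy = build(LazyList, 1000)
    print(eager.value == lazy.value)  # both variants yield the same string
    print(eager.chars_copied)         # grows as n * (n + 1) / 2
```

In this toy model the eager variant copies 1 + 2 + ... + n characters for n one-character extensions, while the lazy variant does a single join per read of `value`.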

The previous PR #623 had some rather awkward hacks related to stripping comments, but I found that I could avoid those by simply tweaking the comment stripping process to strip comments from a token list before stripping any sublists, making this PR much simpler.
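The ordering described above (comments are removed from a token list first, and only then from its sublists) can be sketched on a toy structure. This is an illustration under assumptions, not the filter in `sqlparse/filters/others.py`: nested Python lists stand in for token lists, strings starting with `--` stand in for comment tokens, and `is_comment`, `strip_comments`, and `render` are hypothetical helpers.

```python
def is_comment(token):
    """Toy rule: a string token starting with '--' counts as a comment."""
    return isinstance(token, str) and token.lstrip().startswith("--")


def strip_comments(tokens):
    """Strip comments from this list first, then from each sublist."""
    # Step 1: drop comment tokens that sit directly in this list.
    tokens[:] = [t for t in tokens if not is_comment(t)]
    # Step 2: only afterwards descend into the remaining sublists.
    for t in tokens:
        if isinstance(t, list):
            strip_comments(t)
    return tokens


def render(tokens):
    """Compute the string value of a nested list on demand."""
    return "".join(render(t) if isinstance(t, list) else t for t in tokens)


if __name__ == "__main__":
    tree = ["select ", ["a", "-- first column\n", ", b"], " -- trailing\n", " from t"]
    strip_comments(tree)
    print(render(tree))
```

Note that `render` derives the text from the current tokens each time it is called, so no cached value has to be patched up after tokens are removed.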

codecov[bot] commented 1 year ago

Codecov Report

Patch coverage: 100.00% and project coverage change: +0.01 :tada:

Comparison is base (fc76056) 96.95% compared to head (d2ab15c) 96.97%.

:exclamation: Current head d2ab15c differs from pull request most recent head ff4f391. Consider uploading reports for the commit ff4f391 to get more accurate results

Additional details and impacted files

```diff
@@            Coverage Diff             @@
##           master     #710      +/-   ##
==========================================
+ Coverage   96.95%   96.97%   +0.01%
==========================================
  Files          20       20
  Lines        1545     1555      +10
==========================================
+ Hits         1498     1508      +10
  Misses         47       47
```

| [Impacted Files](https://codecov.io/gh/andialbrecht/sqlparse/pull/710?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Andi+Albrecht) | Coverage Δ | |
|---|---|---|
| [sqlparse/filters/others.py](https://codecov.io/gh/andialbrecht/sqlparse/pull/710?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Andi+Albrecht#diff-c3FscGFyc2UvZmlsdGVycy9vdGhlcnMucHk=) | `98.79% <100.00%> (ø)` | |
| [sqlparse/sql.py](https://codecov.io/gh/andialbrecht/sqlparse/pull/710?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Andi+Albrecht#diff-c3FscGFyc2Uvc3FsLnB5) | `97.68% <100.00%> (+0.06%)` | :arrow_up: |


sdether commented 1 year ago

This is just the fix I need. I ran into problems parsing SQL containing IN clauses with ~50k IDs: with 0.4.4 it takes a bit over 6 minutes to parse, and with this patch it takes only 6 seconds!

rumbin commented 1 year ago

Any progress here?