littleK0i / SnowDDL

Declarative database change management tool for Snowflake
Apache License 2.0
105 stars 28 forks source link

View with column definition including "<expr> AS <alias>" (in SELECT) not parsed properly by converter #147

Open JohnL4 opened 1 week ago

JohnL4 commented 1 week ago

Describe the bug We have a view whose definition looks something like the following:

CREATE VIEW <schema>.<view> AS
SELECT DISTINCT
  MAX(RPT_DTTM) AS FILTER_RPT_DTTM
  ,'yes' as CRRNT_RPT_DTTM
FROM <tablename>
WHERE <expr>
UNION
<more horribleness>

I believe the regular expression parsing this view is greedy and treats the second AS keyword as the beginning of the view definition, thus eliding SELECT and everything that comes immediately after.

I suspect the problem is in the greediness of this regexp: https://github.com/littleK0i/SnowDDL/blame/c7ca3c48a533340bc88ac690b6664840005093ec/snowddl/converter/view.py#L11

I'm still fiddling around, but my suspicions are pretty strong at this point.

Expected behavior Full text of the view definition appears in the yaml generated by snowddl-convert.

Attach log (Later, if necessary.)

Attach YAML config (if applicable) (No minimal reproduction at this time. Maybe later.)

littleK0i commented 1 week ago

Wish we could find a way to extract VIEW text without relying on parsing. Maybe new REST API endpoints can do that, I'll take a look at some point.

Otherwise the only way to do it reliably is to introduce actual SQL parsing.

JohnL4 commented 1 week ago

Yeah, I hear you. Was thinking of that StackOverflow post on parsing HTML with regexps. (And, yes, parsing SQL is Hard.)

JohnL4 commented 1 week ago

Ok, so, for what it's worth, making the following change to the regexp solved my one particular issue for this one particular view:

view_text_re = compile(r"^.*?\sas(\n|\s)+(.*)$", DOTALL)

(Changed * operator to *?.)

No telling what other damage I've caused. :)

littleK0i commented 6 days ago

Upd in 0.36.0: applied your change, regexp looking for as keyword is now non-greedy.

Naturally, it may still fail if as is being used somewhere in the comments for individual columns or view itself.

I'll experiment with pyparsing soon and see if we can introduce more permanent solution.

JohnL4 commented 4 days ago

Fwiw, pyparsing docs:

ignore(expr) - function to specify parse expression to be ignored while matching defined patterns; can be called repeatedly to specify multiple expressions; useful to specify patterns of comment syntax, for example