airbytehq / PyAirbyte

PyAirbyte brings the power of Airbyte to every Python developer.
https://docs.airbyte.com/pyairbyte
Other
176 stars 20 forks

Feat: Handle special chars in column and table name normalization #239

Closed aaronsteers closed 1 month ago

aaronsteers commented 1 month ago

This addresses:

What this PR does, as documented in the code:

Please see the unit test for a description of the logic and examples.

aaronsteers commented 1 month ago

@edgao - I've updated the code to reflect the logic you shared in our Slack thread. Do you mind reviewing the latest and approving if everything basically matches your expectations?

(Ignore stale PR description quotes.)

aaronsteers commented 1 month ago

A very sticky bug surfaced and delayed merging this PR.

Here's the finding: we were previously running an extra "normalize()" operation on Snowflake when evaluating $1."column_name", which is Snowflake-speak for getting "column_name" from a variant column. By running that unnecessary normalization, we were inadvertently normalizing in the wrong place and converting the expression to $1._column_name_, which was incorrect and returned null values all around. (Previously this had no effect, because the quote characters were left alone in the prior iteration of the normalize code.)
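
To make the failure mode concrete, here is a minimal sketch. It is not the actual PyAirbyte code: `normalize`, `variant_field_expr_buggy`, and `variant_field_expr_fixed` are hypothetical names, and the normalizer is assumed to replace every character outside `[a-z0-9_]` with an underscore. It only illustrates how normalizing an already-quoted identifier turns the quote characters into underscores.

```python
import re


def normalize(name: str) -> str:
    """Hypothetical normalizer (an assumption, not the real implementation):
    lowercase the name and replace anything outside [a-z0-9_] with '_'."""
    return re.sub(r"[^a-z0-9_]", "_", name.lower())


def variant_field_expr_buggy(column_name: str) -> str:
    """Builds the variant lookup, but normalizes the *quoted* identifier.
    The double quotes are special characters, so they become underscores."""
    quoted = f'"{column_name}"'
    return f"$1.{normalize(quoted)}"


def variant_field_expr_fixed(column_name: str) -> str:
    """Builds the variant lookup without the extra normalization step,
    so the quoted key is passed to Snowflake unchanged."""
    return f'$1."{column_name}"'


if __name__ == "__main__":
    print(variant_field_expr_buggy("column_name"))  # key no longer matches the variant
    print(variant_field_expr_fixed("column_name"))  # key matches the variant
```

In this sketch the buggy path produces a lookup key that does not exist in the variant, which would match the null values described above.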

This 👆 is fixed. So I'll merge this PR shortly.