astronomer / astro-sdk

Astro SDK allows rapid and clean development of {Extract, Load, Transform} workflows using Python and SQL, powered by Apache Airflow.
https://astro-sdk-python.rtfd.io/
Apache License 2.0

WIP: Fix setting Snowflake query tags #2187

Open tatiana opened 1 week ago

tatiana commented 1 week ago

This PR attempts to run the pre- and post-queries as part of the same Snowflake session to address an issue the customer reported when trying to set query tags.

Problem

A customer reported that they could not set query tags in Snowflake when using an Astro SDK 1.7 alpha release (https://astronomer.zendesk.com/agent/tickets/62628).

They didn't get any errors while running the operator with query_modifier.pre_queries = ["ALTER SESSION SET QUERY_TAG=MyQueryTag"], but the queries were not annotated with the expected query tag:

The expectation is that passing the query_modifier param to the decorator, with the ALTER SESSION query tag statement in pre_queries, sets the tag before the function launches any query. Instead, two different sessions are created, and the queries launched by the function don't carry the tag.

Context

The feature was implemented to leverage per-session query tags, as described in: https://www.chaosgenius.io/blog/snowflake-query-tags/

The relevant code in Astro SDK (added in #1898 and #1962) is: https://github.com/astronomer/astro-sdk/blob/33ca6758f8d4052faba21e4579f358cda232dc98/python-sdk/src/astro/databases/base.py#L162-L168 and https://github.com/astronomer/astro-sdk/blob/33ca6758f8d4052faba21e4579f358cda232dc98/python-sdk/src/astro/databases/base.py#L106C9-L128

As the code shows, the ALTER statement runs on a different connection/session each time: https://github.com/astronomer/astro-sdk/blob/33ca6758f8d4052faba21e4579f358cda232dc98/python-sdk/src/astro/databases/base.py#L96-L99

The base implementation is used because the Snowflake class does not override the connection property: https://github.com/astronomer/astro-sdk/blob/main/python-sdk/src/astro/databases/snowflake.py
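
To make the failure mode concrete, here is a minimal, self-contained sketch of the behaviour described above. It is not the Astro SDK code: FakeSession, FreshConnectionDatabase and SharedConnectionDatabase are hypothetical stand-ins, and the only assumption carried over from the PR is that each access to the connection property yields a new session.

```python
import itertools


class FakeSession:
    """Hypothetical stand-in for a Snowflake session; it only tracks the query tag."""

    _ids = itertools.count(1)

    def __init__(self):
        self.session_id = next(self._ids)
        self.query_tag = None

    def execute(self, sql):
        """Apply ALTER SESSION to this session and report (session_id, tag in effect)."""
        prefix = "ALTER SESSION SET QUERY_TAG="
        if sql.upper().startswith(prefix):
            self.query_tag = sql[len(prefix):].strip("'\" ")
        return (self.session_id, self.query_tag)


class FreshConnectionDatabase:
    """Hypothetical wrapper whose `connection` property opens a new session per access."""

    @property
    def connection(self):
        return FakeSession()

    def run_sql(self, sql, pre_queries=()):
        for pre in pre_queries:
            self.connection.execute(pre)  # the tag is set on one session...
        return self.connection.execute(sql)  # ...but the query runs on another


class SharedConnectionDatabase(FreshConnectionDatabase):
    """Same wrapper, but pre-queries and the main query share one session."""

    def run_sql(self, sql, pre_queries=()):
        conn = self.connection  # resolve the property once and reuse it
        for pre in pre_queries:
            conn.execute(pre)
        return conn.execute(sql)


if __name__ == "__main__":
    pre = ["ALTER SESSION SET QUERY_TAG=MyQueryTag"]
    print(FreshConnectionDatabase().run_sql("SELECT 1", pre)[1])   # None
    print(SharedConnectionDatabase().run_sql("SELECT 1", pre)[1])  # MyQueryTag
```

Resolving the connection once per run_sql call is only one way to keep the statements in the same session; whether the actual fix should cache the connection on the instance or pass it explicitly is left open here.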

What is missing in this PR

We still need to confirm that this fix works in practice.

Create an integration test:

SELECT user_name, role_name, query_tag FROM SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY WHERE query_tag = 'MyQueryTag'
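
A possible shape for that integration check is sketched below. The helper names fetch_tagged_queries and assert_tag_recorded are hypothetical, the cursor is assumed to be any DB-API cursor connected to Snowflake, and the %s placeholder assumes the connector is using its pyformat parameter style. ACCOUNT_USAGE views are documented as lagging behind real time, so a real test would likely need to poll or wait before asserting.

```python
QUERY_HISTORY_SQL = (
    "SELECT user_name, role_name, query_tag "
    "FROM SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY "
    "WHERE query_tag = %s"
)


def fetch_tagged_queries(cursor, tag):
    """Return the query-history rows recorded with `tag`, using a DB-API cursor."""
    cursor.execute(QUERY_HISTORY_SQL, (tag,))
    return cursor.fetchall()


def assert_tag_recorded(cursor, tag):
    """Fail unless at least one query was recorded with `tag`; return the rows."""
    rows = fetch_tagged_queries(cursor, tag)
    assert rows, f"no queries found with query_tag={tag!r}"
    # query_tag is the third selected column.
    assert all(row[2] == tag for row in rows)
    return rows
```

In a test, assert_tag_recorded(cursor, "MyQueryTag") would be called after running the decorated function with the pre-query from the Problem section.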