entechlog / dbt-snow-mask

This repository contains source code for dbt package dbt_snow_mask.
https://hub.getdbt.com/entechlog/dbt_snow_mask/latest/
GNU General Public License v3.0
60 stars 25 forks source link

Set masking_policy_list should be outside of the loop #29

Closed IL-William closed 2 years ago

IL-William commented 2 years ago

In apply_masking_policy_list_for_sources.sql and apply_masking_policy_list_for_models.sql the variable masking_policy_list is set for each column in the model. So that means that if you got 30 columns masked, it will run 30 times the same query. The reason is that you don't want to execute this query if none masking policies are set for the model. But I think it should be run 1 time in the macro, so the variable masking_policy_list should be a few rows above.

        {% set masking_policy_list_sql %}     
            show masking policies in {{masking_policy_db}}.{{masking_policy_schema}};
            select $3||'.'||$4||'.'||$2 as masking_policy from table(result_scan(last_query_id()));
        {% endset %}

--  {% set masking_policy_list = dbt_utils.get_query_results_as_dict(masking_policy_list_sql) %}  👈 here instead

        {%- for meta_tuple in meta_columns if meta_columns | length > 0 %}
            {% set column   = meta_tuple[0] %}
            {% set masking_policy_name  = meta_tuple[1] %}
                {% if masking_policy_name is not none %}

                {% set masking_policy_list = dbt_utils.get_query_results_as_dict(masking_policy_list_sql) %} -- should be above 👆

In the screenshot below, you'll see a part of an example, where I have a table with 14 columns, so it runs an unnecessary amount of queries against the database. Screenshot 2022-05-10 at 14 48 11

What do you think?

entechlog commented 2 years ago

Hello @IL-William , I will check this and get back.

IL-William commented 2 years ago

Hey @entechlog Sorry I didn't have a chance to look at this before! Coming back on this, I think it could be a simple solution like that.

{# If there are some masking policies to be applied in this model, we should show the masking policies in the schema. Otherwise we don't need it. #}
        {% if meta_columns | length > 0 %}
            {% set masking_policy_list = dbt_utils.get_query_results_as_dict(masking_policy_list_sql) %}
        {% endif %}
Screenshot 2022-05-23 at 16 20 42
entechlog commented 2 years ago

@IL-William , Thanks for the suggestion. I have created release candidate https://github.com/entechlog/dbt-snow-mask/releases/tag/0.2.0rc1. Can you please test this and see if things are still looking good after the loop fix ?

IL-William commented 2 years ago

@entechlog Thanks for the fix! It's already what I was using on my side, using the package locally, so I can say it's working as expected indeed 🙂

entechlog commented 2 years ago

Thank You @IL-William . I will do the release in a day or so from prerelease

nadesansiva commented 2 years ago

@IL-William, I have just released a new version 0.2.0 and should be reflected in dbt hub in about an hour OR so. Closing the issue now.