entechlog / dbt-snow-mask

This repository contains source code for dbt package dbt_snow_mask.
https://hub.getdbt.com/entechlog/dbt_snow_mask/latest/
GNU General Public License v3.0
60 stars 25 forks source link

Overview

This dbt package contains macros that can be (re)used across dbt projects with snowflake. dbt_snow_mask will help to apply Dynamic Data Masking using dbt meta.

Installation Instructions

✅ Packages can be added to your project using either of above options
✅ Please refer to the release version of this repo/dbt hub for the latest revision

✅ Please refer to the release version in dbt hub for the latest revision

How to configure database and schema for the masking policy ?

By default this process creates the masking policies in same directory as the database objects. There are 2 methods for changing the default behavior by using the following parameters in your dbt_project.yml

Method 1 : Use a common database

To change the database that your masking polices are created in set the following parameters:

Example : var block in dbt_project.yml to enable using a common masking policy database

vars:
  use_common_masking_policy_db: "True"
  common_masking_policy_db: "DEMO_DB"
  common_masking_policy_schema: "COMPLIANCE"

Method 2 : Use a common schema (in the current database)

To change only the schema (so that a common masking policy schema is used in the same database as your model) set the following parameters:

Example : var block in dbt_project.yml to enable using a common masking policy schema (in the current database)

vars:
  use_common_masking_policy_schema_only: "True"
  common_masking_policy_schema: "COMPLIANCE"

✅ If both use_common_masking_policy_db and use_common_masking_policy_schema_only are set to True, then use_common_masking_policy_db will supercede use_common_masking_policy_schema_only.

Allow Custom Materializations

To enable dbt_snow_mask to apply masking policies to models generated from custom materializations in dbt, configure the following parameter:

Example : var block in dbt_project.yml to enable application of masking policies to a model generated using a custom materialiazition that ends up as a table in Snowflake.

vars:
  custom_materializations_map: '{ "custom_incremental": "table" }'

How to apply masking policy ?

vars:
  use_force_applying_masking_policy: "True"

Its good to keep the masking policy ddl organized in a directory say \macros\snow-mask-ddl

Resource Type Command
sources dbt run-operation create_masking_policy --args '{"resource_type": "sources"}'
models dbt run-operation create_masking_policy --args '{"resource_type": "models"}'
Resource Type Command
sources dbt run-operation apply_masking_policy --args '{"resource_type": "sources"}'
models dbt run -- model <model-name>
snapshots dbt snapshot --select <snapshot-name> --target <target-name>

(For models)

  models:
    post-hook: 
      - "{{ dbt_snow_mask.apply_masking_policy('models') }}"

(For snapshots)

  snapshots:
    post-hook: 
      - "{{ dbt_snow_mask.apply_masking_policy('snapshots') }}"

How to remove masking policy ?

Resource Type Command
sources dbt run-operation unapply_masking_policy --args '{"resource_type": "sources"}'
models dbt run-operation unapply_masking_policy --args '{"resource_type": "models"}'
snapshots dbt run-operation unapply_masking_policy --args '{"resource_type": "snapshots "}'

(For models)

  models:
    post-hook: 
      - "{{ dbt_snow_mask.unapply_masking_policy('models') }}"

(For snapshots)

  snapshots:
    post-hook: 
      - "{{ dbt_snow_mask.unapply_masking_policy('snapshots') }}"

How to validate masking policy ?

-- Show masking policy
SHOW MASKING POLICIES;

-- Describe masking policy
DESCRIBE MASKING POLICY <masking-policy-name>;

-- Show masking policy references
USE DATABASE <database-name>;

USE SCHEMA INFORMATION_SCHEMA;

SELECT *
  FROM TABLE(INFORMATION_SCHEMA.POLICY_REFERENCES(POLICY_NAME => '<database-name>.<schema-name>.<masking-policy-name>'));

Process flow

Create masking policy

graph TD
    A[create_masking_policy] --> |resource_type='sources',meta_key='masking_policy'| B[get_masking_policy_list_for_sources]
    A[create_masking_policy] --> |resource_type='models',meta_key='masking_policy'| C[get_masking_policy_list_for_models]
    B --> |database, schema| D[create_schema]
    C --> |database, schema| D[create_schema]
    D --> |policy_name| E[call_masking_policy_macro]

Apply masking policy

graph TD
    A[apply_masking_policy] --> |resource_type='sources',meta_key='masking_policy'| B[apply_masking_policy_list_for_sources]
    A[apply_masking_policy] --> |resource_type='models',meta_key='masking_policy'| C[apply_masking_policy_list_for_models]
    B --> |meta_key| D[confirm masking policy is avaliable in db]
    C --> |meta_key| D[confirm masking policy is avaliable in db]
    D --> E[alter statement to set masking policy]

Known Errors and Solutions

Error Solution
dict object' has no attribute 'create_masking_policy_mp_encrypt_pi' Typo in yaml for masking_policy, mp_encrypt_pi instead of mp_encrypt_pii

Credits

This package was created using examples from Serge and Matt. Please see the contributors for full list of users who have contributed to this project.

References

Contributions

Contributions to this package are welcomed. Please create issues for bugs or feature requests for enhancement ideas or PRs for any enhancement contributions.

How to do an integration test ?