The open-source Data Security Engine.
Learn more »
Slack · Website · Issues · Docs · Roadmap · Book a meeting
PACE is the Policy As Code Engine. It helps you programmatically create and apply a data policy to a processing platform (like Databricks, Snowflake or BigQuery). Through a data contract, you can apply filters, field transforms, tag-based conditions and access settings to create a view inside a data platform. With PACE, you can enforce policies on data to ensure that it is only used by those allowed to use it, and only in the way it was intended to be used.
Follow the quickstart if you want to dive right in, join us on Slack to discuss, and use issues and PRs if you want to contribute or miss a feature!
PACE is designed to remove friction and cost from using data in real-world organisational settings. In other words: define and implement a policy to "just build" with data, instead of jumping through hoop after hoop.
If (one of) these sound familiar and you're using one of the currently supported platforms, PACE is worth a try:
Once installed, PACE sits between your data definitions (often a catalog) and processing platform. The deep dive below provides more background.
PACE currently supports Collibra, DataHub and Open Data Discovery on the catalog side, and connects to Snowflake, Databricks, Google BigQuery, and PostgreSQL to create your dynamic views.
It's early for PACE (we're in alpha). The following policy methods are currently available, and when put together form "rule sets", the basis of data policies in PACE:
- "[…] email", or "nullify the phone number", including access definitions to differentiate between data consumers.
- "[…] something if data is tagged Greece".
- "[…] PII should always be masked".

These policy methods can be layered to create a powerful programmatic interface to define, implement, maintain and update policies. Create an issue if you think a valuable policy method is missing!
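To make the layering concrete, here is a minimal Python sketch of how a field transform with an access definition and a tag-based rule could combine on a single record. It is an illustration only, not PACE's implementation or policy format: the names `FieldTransform`, `TagRule`, `apply_rule_set`, the principals `fraud_team` and `marketing`, and the sample record are all assumptions made up for this example.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, FrozenSet, List, Set

Row = Dict[str, Any]


def nullify(_value: Any) -> None:
    """Field transform: replace the value with nothing."""
    return None


def mask(value: Any) -> Any:
    """Field transform: replace every character with '*'."""
    return None if value is None else "*" * len(str(value))


@dataclass(frozen=True)
class FieldTransform:
    field_name: str
    transform: Callable[[Any], Any]
    # Access definition (hypothetical): principals that see the original value.
    exempt_principals: FrozenSet[str] = frozenset()


@dataclass(frozen=True)
class TagRule:
    tag: str
    transform: Callable[[Any], Any]


def apply_rule_set(
    row: Row,
    principal: str,
    field_transforms: List[FieldTransform],
    field_tags: Dict[str, Set[str]],
    tag_rules: List[TagRule],
) -> Row:
    """Return the representation of `row` that `principal` is allowed to see."""
    result = dict(row)
    # Layer 1: per-field transforms, skipped for exempt principals.
    for ft in field_transforms:
        if ft.field_name in result and principal not in ft.exempt_principals:
            result[ft.field_name] = ft.transform(result[ft.field_name])
    # Layer 2: tag-based rules apply to every principal.
    for rule in tag_rules:
        for name, tags in field_tags.items():
            if rule.tag in tags and name in result:
                result[name] = rule.transform(result[name])
    return result


# Hypothetical sample data, only for illustration.
row = {"name": "Ada", "phone_number": "0612345678"}
field_tags = {"name": {"PII"}}
field_transforms = [FieldTransform("phone_number", nullify, frozenset({"fraud_team"}))]
tag_rules = [TagRule("PII", mask)]

marketing_view = apply_rule_set(row, "marketing", field_transforms, field_tags, tag_rules)
fraud_view = apply_rule_set(row, "fraud_team", field_transforms, field_tags, tag_rules)
```

In this sketch the `marketing` principal gets a nullified phone number while `fraud_team` keeps it, and the `PII`-tagged `name` is masked for both. In PACE itself the result is a view on the processing platform rather than a transformed in-memory record.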
To install and use PACE, you need:
Head over to the docs for more info, join us on Slack to discuss, or reach out to the STRM team to test and implement PACE together.
PACE is built to connect the world of descriptive data tools to the actual data processing platforms (where all that data stuff takes place!).
It's designed to make sure your data governance can follow this pattern:
Various data consumers (1) should only be shown a representation of data (2) that is tailored to who they are and what they're allowed to see (3), regardless of the data catalog (4) in which they explore and find data, and regardless of the data processing platform (5) on which they consume the data.
To solve this, PACE focuses on creating representations of data (e.g. by generating views), based on so-called Data Policies.
A Data Policy is a structured (human-defined but machine-readable) document that captures how source data should be presented to various principals (i.e. data accessors), and which transformations should be applied to create a representation of the source data on the data processing platform.
Data Policies are constructed by retrieving the data schema (the structure of the data) from either a data catalog or a data processing platform. Next, various rule sets can be created that determine how source data is transformed and/or filtered to create a representation of data on the processing platform. Defining rule sets is a cooperation between various teams: data consumers, data producers, and, most importantly, the legal team.
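As a rough illustration of the flow described above (schema in, rule sets added, view out), the following Python sketch models a policy as a small data structure and renders it to a `CREATE VIEW` statement. This is not PACE's policy format or its generated SQL: `DataPolicy`, `FieldRule`, `render_view`, the table and principal names are hypothetical, and `PRINCIPAL_IS(...)` is a placeholder for whatever group-membership check the target platform offers.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class FieldRule:
    field_name: str
    # Hypothetical shape: principal name -> SQL expression shown to that principal.
    per_principal: Dict[str, str]
    # SQL expression shown to every other principal.
    fallback: str = "NULL"


@dataclass
class DataPolicy:
    source: str        # table the schema was retrieved for
    target_view: str   # representation to create on the platform
    fields: List[str]  # schema, from a catalog or the platform itself
    field_rules: List[FieldRule] = field(default_factory=list)
    row_filter: str = ""  # optional SQL condition applied to all rows


def render_view(policy: DataPolicy) -> str:
    """Render the policy as a view definition (illustrative SQL only)."""
    rules = {rule.field_name: rule for rule in policy.field_rules}
    columns = []
    for name in policy.fields:
        rule = rules.get(name)
        if rule is None:
            # No rule for this field: expose it unchanged.
            columns.append(f"  {name}")
        elif not rule.per_principal:
            # Rule without exceptions: everyone gets the fallback expression.
            columns.append(f"  {rule.fallback} AS {name}")
        else:
            # PRINCIPAL_IS is a placeholder, not a real platform function.
            branches = " ".join(
                f"WHEN PRINCIPAL_IS('{principal}') THEN {expr}"
                for principal, expr in rule.per_principal.items()
            )
            columns.append(f"  CASE {branches} ELSE {rule.fallback} END AS {name}")
    sql = (
        f"CREATE VIEW {policy.target_view} AS\nSELECT\n"
        + ",\n".join(columns)
        + f"\nFROM {policy.source}"
    )
    if policy.row_filter:
        sql += f"\nWHERE {policy.row_filter}"
    return sql


# Hypothetical policy, only for illustration.
policy = DataPolicy(
    source="sales.customers",
    target_view="sales.customers_view",
    fields=["id", "email", "phone_number"],
    field_rules=[FieldRule("phone_number", {"fraud_team": "phone_number"})],
)
view_sql = render_view(policy)
print(view_sql)
```

The sketch builds SQL with plain string formatting to stay short. A real generator would need to quote identifiers and emit each platform's own dialect and membership functions.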
Want to learn more about how to facilitate this cooperation between various teams? Navigate to https://pace.getstrm.com to see how we can help you!
Looking for a visual of PACE to include somewhere (preferably the blog you're writing about that awesome PACE use case!)? Please use one of the variations of the PACE logo to match your brand style and include the link to this repo: