ethyca / fides

The Privacy Engineering & Compliance Framework
https://ethyca.com/docs
Apache License 2.0
359 stars 72 forks source link

Add a dry-run/ci check flag to evaluate the validity of the current manifest files #14

Closed ThomasLaPiana closed 3 years ago

ThomasLaPiana commented 3 years ago

One of the core components of Fides is the ability for it to run as part of CI and the greater "SDLC". For this to be possible, Fides needs to be aware of when its running on main/master and when its running in a branch. My proposed solution for this is to add a new field to all database objects called branch or something similar, and have Fidesctl inject the name of the git branch into that field when sending objects to the server and also filtering for that field when querying the server for objects.

ThomasLaPiana commented 3 years ago

Notes from standup with Steven:

  1. The main value of the feature here is for someone to check the current state of their manifest files against the server and know if they're valid or not
  2. This doesn't need to be tracked by the server, git + ci checks will show the manifest files going in and out of a valid state. This is a completely ephemeral and non-destructive command
  3. Should look something like fidesctl evaluate --dry-run <manifests_dir>

@stevenbenjamin this feels pretty closely tied to the workflow things we want to solve, and we're in agreement there should only be one command that can be used with or without a dry-run flag. feel free to put your workflow thoughts here or in another issue

stevenbenjamin commented 3 years ago

After some additional thought I believe the server should separate out different kinds of rating operations, and separate these from basic CRUD operations (i.e. create/update/delete a system, etc etc). We'd thought about having a system POST return an approval, but I think this makes the response kind of confusing , since every other CRUD endpoint should return the object in question.

Ultimately came up with this:

For consistency all rating actions will happen under /endpoint/rate. /system/rate/:id Create and return the approval for an existing system

/system/rate/dry-run Create and return the approval for the posted system as a dry-run. In this case the posted system is not stored

(proposed) /system/rate/:id/current Last seen rating for this system (i.e. the current known state)

/registry/rate/:id Create and return the approval record for an existing registry

/registry/rate/dry-run Create and return the approval for the posted registry as a dry-run. In this case the posted registry is not stored

(proposed) /registry/rate/:id/current Last seen rating for this registry (i.e. the current known state)

ThomasLaPiana commented 3 years ago

@stevenbenjamin A few thoughts here

  1. I'd prefer to use fidesKey instead of the id for lookups, otherwise it always adds another step of the CLI needing to look up the object first just to get its id
  2. for these endpoints, are you envisioning all of the manifests being sent for a dryrun or just the system? if you do a dryrun and only submit the new system, what about the changes made to other parts of the manifests? i might want to make a policy change in an MR that would affect how a system behaves, or add a new dataUse, etc. and these would all need to be sent together to evaluate an object in its entirety
  3. the rate term is being deprecated in favor or evaluate, rate evokes a numerical scoring or a spectrum, whereas we just have a pass/fail
stevenbenjamin commented 3 years ago
  1. i understand the convenience of this, but it does get into consistency issues. How do we make this consistent with the rest of the API? shouldn't there be some way of assigning/storing the api values with the manifest?

  2. There are no other manifests other than system manifests. Policy changes need to be submitted separately, since a policy is organization wide, it needs to be re-applied to everything when you update it. this is not workable if you're constantly re-submitting. Ditto for resubmitting a piece of the taxonomy.
    The model of "systems, rules, taxonomy types, etc" all stored in one place and continually reapplied is broken and will not work with the server as written. the server is built on the model of an in-place set of rules/taxonomy values that systems/registries are evaluated against. Your model will break completely if commits are coming from multiple places.

  3. Naming change is not a problem.