danielberkompas / cloak

Elixir encryption library designed for Ecto
MIT License
582 stars 55 forks source link

Modifying cloak.migrate to re-encrypt data when migrating #61

Closed snarkitect closed 6 years ago

snarkitect commented 6 years ago

Intro

Hi Daniel,

First off, my team and I love what you've built with cloak. I'm personally new to elixir, and this is my first project, so I may have be misunderstanding something. It's totally possible, even likely, that I've missed something and these changes are unnecessary. Also, we're right across the border in Portland, so maybe we'll see each other at a meet-up or ElixirConf this year.

We're using Cloak's Ecto types to seamlessly decrypt and encrypt data when we read and write to a postgres database. That all works great, but I ran into a snag when using the cloak.migrate mix task. It seams to only update the cloak version stored alongside the records. It may be that I configured cloak migrations improperly, but I set about attempting to improve that task in the interest of contributing to Cloak.

Note: there have been some updates to cloak.migrate.ex since I wrote this, but I'd like to confirm that I'm moving in the right direction before I make any further updates.

What it does

  1. These changes update cloak.migrate so that it re-encrypts fields as it migrates. This allows the migration to retire unused keys and remove them after migrating.

  2. This code is using a slower SQL query directly against the encrypted fields by comparing the version tag prepended to data by cloak with the current cloak.version and migrating/re-encrypting each field in that row. This isn't necessary if using a cloak_version field that's written alongside the data, but it simplifies managing data at the expense of query speed. In our case, we have multiple encrypted fields in one record, so it helps us avoid having to re-encrypt all fields with every row update.

  3. When configuring migrations, a user can now specify just a schema without any fields, and the migration task will find any fields that match any of the custom Ecto types added by cloak and re-encrypt them with the current cipher and key

Notes: We're using ecto.force_change to force re-encryption of otherwise unaltered data in the database update for the record.

snarkitect commented 6 years ago

Apologies again, this was written a day or two ago against master. There's some refactoring that will need to be done to fit in with some of the new changes.

danielberkompas commented 6 years ago

You raise good points with this PR. Unfortunately, as you can see, Cloak has changed quite a bit in preparation for a 0.7 prerelease.

The existing cloak.migrate task is based on the false premise that Ecto would dump all the fields back to the database even if they were unchanged. Without that, the encryption_version field isn't all that useful, because as you say, just because the row was changed doesn't mean that every field was re-encrypted.

This is going to take some work to solve. I'd like to take a crack at it first if you don't mind, as I'm the most familiar with the Cloak 0.7 architecture at this point.

snarkitect commented 6 years ago

Thanks for your quick response!

If you're happy to take a look at it, please do! What do you think about the that part of the solution I was working toward here? If we can specify encrypted fields per schema (with or without the version field), we can call Ecto.Changeset.force_change on each field in migrate_row before calling repo.update. This should behave as expected: https://github.com/snarkitect/cloak/blob/1767f4b67f6edb8a7dd00065002655afcd182231/lib/mix/tasks/cloak.migrate.ex#L184

If we want to keep configuration as it stands, we could still infer the encrypted fields using something like this: https://github.com/snarkitect/cloak/blob/1767f4b67f6edb8a7dd00065002655afcd182231/lib/mix/tasks/cloak.migrate.ex#L108

danielberkompas commented 6 years ago

Closing in favor of #62, thanks very much for the bug report though! :star: :star: :star: