riboseinc / transcryptor

Assists your everyday re-encryption needs, in Rails.
MIT License
2 stars 1 forks source link

Zero restart migration #33

Open ribose-jeffreylau opened 7 years ago

ribose-jeffreylau commented 7 years ago

Goal

  1. Seamless transition from one encryption scheme to another.
  2. Preferrably not to have to change Model code.

Implies

  1. No generated artifacts (e.g. .rb files) should break Rails.
  2. DB migrations should not break running Rails server.
  3. No need to modify Model code to add / rename / remove attr_encrypted columns.
  4. No restart of Rails server should be necessary, except for the first time when set-up code is deployed.
  5. Encryption configuration (encryption schema) changes are stored away from Model definitions. Perhaps in a config file, or even another DB table.

Potential tools

Instead of running rake tasks to run transcryptions, rails-data-migrations seems like a good alternative. It stores its migration history in the table data_migrations.

DB

Currently,

Column

Some ideas on encryption schema:

db/encryption_schema.yaml

---yaml
- class_name: User
  - column: ssn
    - config: { algo, salt, etc. }

migration:

db/transcryption/2017125_transcrypt_user_ssn.rb

class TranscryptUserSSN < Transcryption
  def old
    { algo, salt, etc. }
  end

  def new
    #
    # { algo, salt, etc. }
    #
    # or perhaps something like:
    #
    Transcryptor.schema(:user, :ssn)
  end
end

config:

app/models/user.rb

class User < AR
  attr_transcrypted :ssn
end

How to make this work? What else needs to be defined? What's the closest we can acheive?

cc: @DmitryDrobotov

ronaldtse commented 7 years ago

@DmitryDrobotov would you have time to start with this? Thanks!

DmitryDrobotov commented 7 years ago

@ribose-jeffreylau @ronaldtse I've spent yesterday in my thoughts how to implement so magic behavior and here is some point I realized.

  1. Database. To have a zero-downtime migration we exactly need to create separate columns OR have a column which indicates that encrypted attribute was re-encrypted OR create identical table and fill it with all data (anyway, too hard and heavy to sync all data through all time of migration)
  2. Model configuration. It is possible to store configuration somewhere else, but what should we do with pros or lambdas which can participate in configuration?
  3. During migration and at the end of it we need to replace old configuration with new one. How are you going to handle it?
  4. Transcryptor.schema(:user, :ssn) won't give us an versioning of data changes.
ribose-jeffreylau commented 7 years ago

@DmitryDrobotov

  1. For storing things like Procs in configuration, here's how I envision configurations could work:

Thus, any running server instances would use the latest migration version (version n) as the current config, and also read in the earliest unmigrated version (version n + 1) if exists.

A background rake task could then run migration version n + 1. The running server instance's DB would eventually pick up the new config via solution in point 1 (TBD).


  1. Would having a column indicating the migration version for a particular column work better / be easier?

  1. For a running server to pick up the latest config, the scheme detailed in point 2. could perhaps address this.

  1. So, in summary, the transcryption scheme now looks like:

db/transcryption/20171010123456_transcrypt_user_ssn.rb

class TranscryptUserSSN < Transcryption
  def old
    { algo, salt, etc. }
  end

  def new
    { algo, salt, etc. }
  end
end

and no more yaml files.

Hence, the configuration is implicitly the "head" of a chain of migration files. The "head" can be different for each row, indicated by a separate column (for each encrypted column) via the transcryption order ID (e.g. 20171010123456). The labels are usually different only during a transcryption, otherwise they should all point to the same version.

ronaldtse commented 7 years ago

I was actually thinking of a less drastic solution like this, so that the latest configuration is still located in the model, but the old one that we want to get rid of (in time) will be stored in the data migration, and still available if any server instances are still running on old code.

db/transcryption/20171010123456_transcrypt_user_ssn.rb

class TranscryptUserSSN < Transcryption
  def old
    { algo, salt, etc. }
  end

  def new
    UserModel.new_ssn_configuration
  end

  UserModel.register_old_configuration(:ssn, &old)
end
DmitryDrobotov commented 7 years ago

@ribose-jeffreylau @ronaldtse thanks guys! Now I see the picture how to implement that. We will need:

  1. Add column with version to table which has encrypted attribute (let's say it's name is ssn), then version column will have encrypted_ssn_version:decimal column and accepts values like 20171010123456
  2. Create first migration file under db/transcryption/20171010123456_transcrypt_user_ssn.rb (we can develop rake task to generate such files)
  3. Rails app fetches old config from migration file and reencrypts attribute with new configuration and updates encrypted_ssn_version column with the latest value.

Sounds great to me! What about you?

ronaldtse commented 7 years ago

@DmitryDrobotov this is actually a great solution!

This is actually something more generic than just for encrypted attributes (which may or may not be in a separate gem). This suggests that we have a versioned schema per column cell (i.e., each column cell will have a schema version).

For each schema-ed attribute column we will have an extra column to store the attribute schema version.

The flow works like this:

  1. Code is updated to use the new column schema (20171011000000) with the new model code, with link to the old column schema (20171010000000) in the migration file.

  2. Assume that all Rails instances are already updated to the new code. On read of the attribute, depending on the attribute schema, we can load the value according to old or new scheme. On write of the attribute, the value will be migrated to the new scheme.

  3. This way we support "lazy-migrate-on-write" or have a separate background process to migrate attributes to the new column schema with zero-downtime.

What do you think?

DmitryDrobotov commented 7 years ago

@ronaldtse Sounds good, that is the same what I meant except generic way like rails-data-migrations do. Let's think about it a little bit, do we really need to implement generic solution and add support in transcryptor gem? Or simply implement transcription migrations inside this gem?

ronaldtse commented 7 years ago

My concern was purely about scope -- if the test suite should test only against generic attribute schema it probably should be in/from a separate library.

Do you think these gems are useful for this functionality?

ribose-jeffreylau commented 7 years ago

Hi @DmitryDrobotov . What else do you think is needed before we can continue?

DmitryDrobotov commented 7 years ago

@ronaldtse I don't think that these gems can be useful. Because we need zero-downtime migrations of dynamicaly calculated value. For example, if row with id=1 has already been migrated it should use new configuration of attr_encrypted, but for row with id=1000, which has not migrated yet, we should still use previous configuration for re-encryption. So, I am going to implement something similar, but with some big changes, such as migration version control for separate row (not version of migration, like ActiveRecord migrations work).

nattfodd commented 6 years ago

Hey there! The only difference between zero downtime approach suggested in master branch and this one is that we have those few extra lines of code in the migrated model, which we want to get rid of now? Do I understand it right?

And what is the difference between this issue and that one: https://github.com/riboseinc/transcryptor/issues/24 ?

CC @ribose-jeffreylau

ronaldtse commented 6 years ago

@nattfodd

The major difference with #24 :

  1. Zero restart is for migrating each data cell (not a table/column) individually. The table and data already exists. The migration happens during production operation of the server and does not need to do any table modifications.

  2. There is no need for post operations on the data schema once the data migration is completed.

ribose-jeffreylau commented 6 years ago

Hi @nattfodd !

The mechanism detailed in this issue focuses on data migration. It tries not to touch db/migrations/ at all, as db/migrations/ also has other concerns such as database schema migrations.

24 concerns the bootstrapping of non-encrypted columns into encrypted columns, and vice versa, which will most probably change the DB schema.

Hope it makes things clearer :)

nattfodd commented 6 years ago

@ribose-jeffreylau sounds clearer, indeed. But I believe it should be extra temporary column anyway, as @DmitryDrobotov suggested right above. It may be done in a migration script though, something like:

# bundle exec rake transcryptor:nodowntime:migrate

add_column :users, :ssn_migrated, :boolean
# do some migrations
remove_column :users, :ssn_migrated

It will affect actual SQL DB schema, but it wont affect db/schema.rb and db/migrate/*.rb files...

ronaldtse commented 6 years ago

@nattfodd actually the solution discussed with @DmitryDrobotov is not a temporary column, but a permanent column.

The problem with a temporary column is that the database schema will be in flux -- for example, if multiple servers are reading values but suddenly the column is dropped (especially in many databases a column drop is implemented as a table adjust), it will cause havoc.

A straightforward solution is to have a separate, permanent column to manage "data cell schema", the data of which indicates the "data schema version" of the original cell. This is similar to putting the Rails schema version number in a more granular way, instead of a single version for the entire database, we have versions per value.

The benefits of a permanent column (or columns) are great:

The cost of using this is the implementation of a data layer that can switch (read/write) between data schema versions: