riboseinc / transcryptor

Assists your everyday re-encryption needs, in Rails.
MIT License
2 stars 1 forks source link

Support zero-downtime migration #5

Closed ribose-jeffreylau closed 7 years ago

ribose-jeffreylau commented 7 years ago

Roadmap #3

(foreseen to depend on #4)

http://blog.honeybadger.io/zero-downtime-migrations-of-large-databases-using-rails-postgres-and-redis/

Migrating large datasets takes time. Making key migrations not incur downtime would be crucial in time-critical production environments.

The above link outlines what's needed to facilitate this task.

GildedHonour commented 7 years ago

I'll work on this.

ribose-jeffreylau commented 7 years ago

Hi @DmitryDrobotov , any thoughts on this?

ribose-jeffreylau commented 7 years ago

Perhaps extending (forking) attr_encrypted to make it support things like the following?

  1. Specify old key config in a attr_encrypted_compat in the Model class:

    class User < ActiveRecord::Base
    attr_encrypted :ssn,
                 key: ->(u) { ENV['NEXT_USER_SSN_ENC_KEY'] || ENV['USER_SSN_ENC_KEY'] },
                 mode: :per_attribute_iv_and_salt,
                 algorithm: 'aes-256-cbc'
    
    # old configuration of attr_encrypted for :ssn column
    attr_encrypted_compat :ssn,
                        key: ->(u) { ENV['OLD_USER_SSN_ENC_KEY'] || ENV['USER_SSN_ENC_KEY'] },
                        mode: :per_attribute_iv_and_salt,
                        algorithm: 'aes-256-gcm'
    
    # ...
    end
  2. db:migrate to create new_ssn, new_ssn_salt, new_ssn_iv (to which I'll refer by new_ssn_*) accordingly:

    +---------+------+
    | new_ssn | ...  |
    +---------+------+
    | ssn     | ...  |
    +---------+------+
    | ...     | ...  |
    +---------+------+

    Rails would then be made to write to both new_ssn_* and ssn_*, while reading only from ssn_*.

  3. A background db:migrate could then run to fill the new_ssn_* columns.

    • This step is done using transcryptor.
  4. When all new_ssn_* fields are filled, db:migrate to remove ssn_* and then rename new_ssn_* to ssn_*:

    +---------+------+
    | ssn     | ...  |
    +---------+------+
    | ...     | ...  |
    +---------+------+
  5. Rails picks up that there's no more new_ssn_* columns, so it's back to writing solely to ssn_* again.

  6. Deploy next version of Rails app without the attr_encrypted_compat lines. Done.

In summary, this extension would make Rails write to field_name_* (and new_field_name_* if exist), while reading from ssn_* in all cases (i.e. unchanged).

What do you think?

cc: @DmitryDrobotov

DmitryDrobotov commented 7 years ago

@ribose-jeffreylau sounds good!

  1. We need to specify both configurations in model (old and new) 👍
  2. That's exactly should be separate DB columns 👍
  3. We need to provide mechanism to read/create/update both (new and old) columns at the same time and fill new columns if it is not filled. And all these actions should be available through old name of attribute (ssn in you example)

Absolutely great solution @ribose-jeffreylau !

ribose-jeffreylau commented 7 years ago

Awesome! Do you think this belongs in a new repo or should it remain inside transcryptor for the time being?

DmitryDrobotov commented 7 years ago

I think we can implement it as part of transcryptor.

ribose-jeffreylau commented 7 years ago

👍

ribose-jeffreylau commented 7 years ago

As discussed in #15, transcryption ordering has to be consistent and logging has to be detailed enough to ensure that interruptions to the transcryption process can be resumed in a controlled manner.

ribose-jeffreylau commented 7 years ago

Some deduplication of configs could perhaps be done here, e.g.,

class User < ActiveRecord::Base
  attr_transcryptor :ssn,
                    old: {
                      key: proc { |user|
                        ENV['NEXT_USER_SSN_ENC_KEY'] || ENV['USER_SSN_ENC_KEY']
                      },
                      mode: :per_attribute_iv_and_salt,
                      algorithm: 'aes-256-cbc'
                    },
                    new: {
                      key: proc { |user|
                        ENV['OLD_USER_SSN_ENC_KEY'] || ENV['USER_SSN_ENC_KEY']
                      },
                      mode: :per_attribute_iv,
                      algorithm: 'aes-256-gcm'
                    }

  # ...
end

attr_transcryptor could be just a wrapper to the above attr_encrypted and attr_encrypted_compat but just a bit DRYer (no repeated information on the attribute name, prefix, suffix, etc.).

Just an idea. Not very important at the moment but could help in the future when we abstract away from attr_encrypted.

ribose-jeffreylau commented 7 years ago

Hi @DmitryDrobotov , actually attr_transcryptor is a bit urgent. Would you have time to finish this by this weekend? Thanks! 👍

DmitryDrobotov commented 7 years ago

It feels like to implement attr_encrypted_compat will be easier rather than extend attr_transcryptor.

Wow, it will be really hard to implement zero-downtime migration in time. Let's try, but I can not guarantee.

ribose-jeffreylau commented 7 years ago

😄 👍 Let's go with attr_encrypted_compat as our first step, after which we can consider implementing the attr_transcryptor wrapper. 🙇 🙇

ronaldtse commented 7 years ago

@DmitryDrobotov just to confirm that you're working on this. Thanks! 👍

DmitryDrobotov commented 7 years ago

@ronaldtse it is done https://github.com/riboseinc/transcryptor/pull/29. Sorry, used wrong keyword to close issue automatically :(