department-of-veterans-affairs / caseflow-efolder

Tool for bulk download of efolder claim files

Migration failures when deploying to production #345

Closed anyakhvost closed 7 years ago

anyakhvost commented 7 years ago

We have experienced a few issues with the migration when deploying code to production. The migration consisted of three things:

Here is the initial migration: https://github.com/department-of-veterans-affairs/caseflow-efolder/commit/9f616f0f184c0dce459b1854b25b0beb7c9857a4#diff-e5f0fedca0cff59f67d219d6fdc8699e

Issues:

  1. Downloads and searches had records with a nil css_id (station_id). As a result, creating user records failed whenever css_id was nil.
  2. The migration failed when it called the email method on the Download object. When the migration was written, the email method existed in Download.rb; by the time the migration ran, the method had already been removed.
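
Both issues come down to the migration depending on things it cannot control: the shape of the data and the current state of the model code. A rough sketch of a backfill step that avoids both is below. It is not the code from either linked migration. The helper name `build_user_attributes` is hypothetical, and the `css_id`, `station_id`, and `email` keys are assumed to be plain column values on each row (for example, rows read with raw SQL via `ActiveRecord::Base.connection.select_all` instead of through `Download` objects).

```ruby
# Hypothetical helper: turns raw download/search rows into user attributes
# without calling any methods defined on the application models.
def build_user_attributes(rows)
  rows
    # Skip rows that cannot identify a user instead of failing on them.
    .reject { |row| row["css_id"].nil? || row["station_id"].nil? }
    # Read plain column values; no dependency on Download#email existing.
    .map { |row| { css_id: row["css_id"], station_id: row["station_id"], email: row["email"] } }
    # Many rows can belong to the same user, so keep one entry per user.
    .uniq { |attrs| [attrs[:css_id], attrs[:station_id]] }
end

# Made-up sample rows, including the nil css_id case that broke production.
rows = [
  { "css_id" => "USER1", "station_id" => "283", "email" => "user1@example.com" },
  { "css_id" => nil,     "station_id" => "283", "email" => nil },
  { "css_id" => "USER1", "station_id" => "283", "email" => "user1@example.com" }
]

users = build_user_attributes(rows)
puts users.length
```

Because the helper only touches hashes, it can be exercised in a unit test with nil-heavy sample data before the migration is ever run against a real database.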

Here is the resulting migration: https://github.com/department-of-veterans-affairs/caseflow-efolder/blob/master/db/migrate/20161222194523_create_users.rb

It would be great to brainstorm ideas to avoid things like that in the future:

  1. Run the migration against a sanitized copy of the production database beforehand
  2. Separate a large migration into smaller chunks

askldjd commented 7 years ago

Dumb question: how did this get past the UAT environment?

anyakhvost commented 7 years ago

The UAT database didn't have nil values for css_id, so the migration worked fine there. Also, when the migration ran in UAT, the code at that time still had the email method in Download.rb. @askldjd

joofsh commented 7 years ago

@aroltsch Thanks for putting this together!

Since we cannot expect UAT to mirror the prod dataset I agree with your assessment:

These should be our rules going forward when testing and merging migrations into the codebase

shanear commented 7 years ago

Outcomes

1) Any destructive migration should be handled with care, and we should not let them stack up.

2) Have a way to dump the prod DB and sanitize its PII. (captured here https://github.com/department-of-veterans-affairs/caseflow/issues/669)

3) In the meantime, we should have multiple reviewers closely review and approve any data migration scripts.
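
One detail worth keeping in mind for the sanitized dump: the failure here was triggered by nil values, so a sanitizer needs to leave nils as nils or the sanitized copy will hide the same problem UAT hid. A minimal sketch of that idea follows. The `sanitize_row` helper and the `PII_COLUMNS` list are hypothetical, and the truncated SHA-256 digest is only a stand-in for whatever masking approach is actually approved for production data.

```ruby
require "digest"

# Hypothetical list of columns to mask; the real list would come from a PII review.
PII_COLUMNS = %w[email css_id].freeze

# Returns a copy of the row with PII columns masked.
# nil values are left untouched so the sanitized data keeps the same
# "holes" as production and migrations hit the same edge cases.
def sanitize_row(row, pii_columns = PII_COLUMNS)
  row.each_with_object({}) do |(column, value), sanitized|
    sanitized[column] =
      if pii_columns.include?(column) && !value.nil?
        # Placeholder masking: deterministic, so equal inputs stay equal.
        Digest::SHA256.hexdigest(value.to_s)[0, 12]
      else
        value
      end
  end
end

# Made-up sample row with a nil css_id.
row = { "id" => 1, "css_id" => nil, "email" => "user1@example.com" }
sanitized = sanitize_row(row)
puts sanitized["css_id"].inspect
```

The masking is deterministic on purpose, so rows that shared a value before sanitizing still share one afterwards and grouping or uniqueness logic in a migration behaves the same way.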