DMPRoadmap / roadmap

DCC/UC3 collaboration for a data management planning tool
MIT License
102 stars 109 forks source link

Migration step 3 fails #108

Closed briri closed 7 years ago

briri commented 7 years ago

I received an error when running 'db:migrate' for the 3rd step (final_schema). It complained that devise did not exist.

related to #107

briri commented 7 years ago

got around it with:

cp config/initializers/devise.rb.example config/initializers/devise.rb

I then get the following error after it runs through a bunch of the migrations:

StandardError: An error has occurred, all later migrations canceled:

uninitialized constant Org::OrganisationType
/Users/briley/Documents/workspace/roadmap/db/migrate/20161205095624_replacing_organisation_types_with_bitflags.rb:18:in `change'
NameError: uninitialized constant Org::OrganisationType
/Users/briley/Documents/workspace/roadmap/db/migrate/20161205095624_replacing_organisation_types_with_bitflags.rb:18:in `change'
Tasks: TOP => db:migrate
(See full trace by running task with --trace)
briri commented 7 years ago

Issues still exist on my second test machine.

I think our best course of action will be to create branches from those tagged commits and make corrections in the branches. Then update the documentation to have the users checkout the branch instead of the tag.

This should work since the tweaks we need to get the migrations to run are no longer relevant in the current development/master branches.

My suggestion is to do the following for each step:

git checkout aa9aa0051e6711e42b3d7fb73d8bd78cf3db1d6a
git checkout -b migration-from-dmponline-step-1
# Make any necessary changes
git push origin migration-from-dmponline-step-1

We would then just need to update the documentation afterward and make sure that we preserve those branches (you can 'protect' branches in the Github repo settings)

I'm happy to do this tomorrow if you guys agree that its a good approach

vyruss commented 7 years ago

Every scenario I went through involving history rewriting & rebasing results in problems with our repository, in the sense that these commits have already been pushed to development and some branches (such as @jollopre's) have been based on development.

Therefore I think that I agree in that the best course of action is to fix the two branches by amending whatever code exists at the final commit of each branch.

That will be easy because we've kept them around as:

Upside is, if we push these branches to Roadmap, we can also push the exact same tag names, they will just be pointing to a different commit from a different branch.

vyruss commented 7 years ago
briri commented 7 years ago

excellent thanks @vyruss I'll run through the process again today using the updated instructions

briri commented 7 years ago

The first branch looks like it will work. The second doesn't address the 'uninitialized constant Org::OrganisationType' issue I mentioned above.

I'll run through though today now that we have a branch we can make the necessary adjusts and then just repoint the tag to the new commit.

Also, should we maybe add a final step to the instructions that puts them back on the master (development currently) branch and runs db:migrate again?

briri commented 7 years ago

Added the OrganisationType and UsersPerm models in to fix uninitialized constant errors. Now encountering a:

StandardError: An error has occurred, all later migrations canceled:

Unknown primary key for table users_perms in model UsersPerm.
briri commented 7 years ago

Further issues with this one.

Mysql2::Error: Cannot add or update a child row: a foreign key constraint fails (`roadmap`.`#sql-6dd8_2fce`, CONSTRAINT `fk_rails_7d708f6f1e` FOREIGN KEY (`theme_id`) REFERENCES `themes` (`id`)): ALTER TABLE `themes_in_guidance` ADD CONSTRAINT `fk_rails_7d708f6f1e`
FOREIGN KEY (`theme_id`)
  REFERENCES `themes` (`id`)
/apps/dmp/apps/roadmap/roadmap/db/migrate/20161206122926_add_foreign_keys.rb:71:in `change'

I am using the original db/data.yml to create the starting point for the DB. I believe this was generated around the September timeframe

Logging into the DB I can see:

mysql> select * from themes_in_guidance where theme_id not in (select id from themes);
+----------+-------------+
| theme_id | guidance_id |
+----------+-------------+
|        1 |           4 |
|        2 |           2 |
|        3 |           3 |
|        4 |           1 |
+----------+-------------+
4 rows in set (0.01 sec)

mysql> select * from themes_in_guidance where guidance_id not in (select id from guidances);
+----------+-------------+
| theme_id | guidance_id |
+----------+-------------+
|        3 |           3 |
|        4 |           1 |
+----------+-------------+
2 rows in set (0.00 sec)
briri commented 7 years ago

Added step to remove orphaned records prior to creating fkeys on themes_in_guidance table: https://github.com/CDLUC3/roadmap/commit/e2e700becbbdb93ca9955bb292d21ea5d7dcaee3

Added stubs of the OrganisationType and UsersPerm models back so that we no longer receive 'uninitialized constant' errors https://github.com/CDLUC3/roadmap/commit/60ce0eef529b390022dd5a71fdf20b2433623917

Replaced UsersPerm scrubbing section with a direct deletion of orphaned records instead: https://github.com/CDLUC3/roadmap/commit/3ef06296816e9dfa296709b6a270edc124b1d764

briri commented 7 years ago

Did some review of the DB after the migration and am seeing an issue:

vyruss commented 7 years ago

@xsrust can you help out with this? I did not encounter any of the above problems when migrating the DMPonline DB

briri commented 7 years ago

ok based on @xsrust suggestion I took a look at the 20161021 migration file and see the rails environment check. I am not running as the test environment though so it shouldn't be a factor for any of us.

The data.yml file I was using to seed the DB was from Oct. 26th and already had all of the migrations we had created from June/July through to October 26th (api tokens, etc.).

Jimmy sent me a dump of the data to use which represents the current dmponlinev4 system so I started with that and ran the following steps (after copying the data.yaml and schema.rb into my db/):

It fails here after a while on the migration step mentioned above with (note that I removed the actual user name and first part of the domain):

D, [2017-03-07T15:52:07.341987 #28070] DEBUG -- :   User Exists (0.5ms)  SELECT  1 AS one FROM `users` WHERE (`users`.`email` = BINARY 'xxxxxxx@xxx.i' AND `users`.`id` != 3151) LIMIT 1
D, [2017-03-07T15:52:07.343169 #28070] DEBUG -- :    (0.5ms)  ROLLBACK
rake aborted!

StandardError: An error has occurred, all later migrations canceled:

Validation failed: Email is not a valid email address
/apps/dmp/apps/roadmap/roadmap/db/migrate/20161021100420_single_organisation_for_users.rb:14:in `block in up'

That looks like it is indeed a bad email address. There must be at least 2 characters in that last part of the domain. The question then is should we comment out our new email validator or should we get rid of the invalid data? If opt for commenting it out during the migration and then reintroduce it afterward, the user could have trouble logging in since the devise gem records things like last_sign_in_at

I'm not sure how you guys are not running into issues like this. I'm using the codebase currently in DMPRoadmap/roadmap development and the data I received today.

vyruss commented 7 years ago

Whoops! Totally forgot to mention we disabled the email validator for the existing database as we knew we had bad email addresses in that data, sorry about that

briri commented 7 years ago

so what is your plan moving forward with those bad emails?

Do you all want me to spend some time doing some bullet-proofing of the migration process today? I can put in checks for other potential orphaned records and checks for bad emails that log the ignored/deleted information to a migration log.

vyruss commented 7 years ago

It won't be possible for users to enter bad emails from now on because of the validator.

briri commented 7 years ago

Once the validator is in place it will be problematic. Anytime user.save is called one of those records it will fail due to the email validation. My concern is that the Devise gem works behind the scenes and manages some of the fields on the users table (e.g. last_sign_in_at, etc.). It is likely calling user.save in those cases.

I'm not sure how many records we're talking about, so it might be enough to just reach out to them after the migration and ask them to provide the correct email address.

marisastrong commented 7 years ago

Is the email a required field? If not, I would say delete the email fields that are invalid and re-run. Since the emails aren’t valid, NULL is better than invalid. If the email is required, change to a valid test email unknwonemail@something.commailto:unknwonemail@something.com so they pass the validation.

From: Brian Riley notifications@github.com Reply-To: DMPRoadmap/roadmap reply@reply.github.com Date: Wednesday, March 8, 2017 at 11:54 AM To: DMPRoadmap/roadmap roadmap@noreply.github.com Cc: Subscribed subscribed@noreply.github.com Subject: Re: [DMPRoadmap/roadmap] Migration step 3 fails (#108)

Once the validator is in place it will be problematic. Anytime user.save is called one of those records it will fail due to the email validation. My concern is that the Devise gem works behind the scenes and manages some of the fields on the users table (e.g. last_sign_in_at, etc.). It is likely calling user.save in those cases.

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHubhttps://github.com/DMPRoadmap/roadmap/issues/108#issuecomment-285150139, or mute the threadhttps://github.com/notifications/unsubscribe-auth/AAxeL_7QajYzbUwHBavAXJRNsCTdA7XRks5rjweCgaJpZM4MNyq1.

briri commented 7 years ago

Started refactoring the migration processes.

Finished up through the migration of Template and Plan structures. Have yet to remove the temporary tables and finish up the other minor tasks.

I'm working in 2 new branches: Step1: https://github.com/CDLUC3/roadmap/tree/dmponline4_upgrade_start Step2: https://github.com/CDLUC3/roadmap/tree/dmponline4_upgrade_refactor

vyruss commented 7 years ago

After discussing with @briri and @xsrust, I've pushed these migration branches to Roadmap:

Step 1: https://github.com/DMPRoadmap/roadmap/tree/dmponline4_upgrade_step1 Step 2: https://github.com/DMPRoadmap/roadmap/tree/dmponline4_upgrade_step2 Step 3: https://github.com/DMPRoadmap/roadmap/tree/dmponline4_upgrade_step3

TODO:

  1. Add migrate:dataintegrity rake task as _step1's final commit (@briri)
  2. Tag these branches' last commits with v0.1.9, v.0.2.0, v.0.2.1 tags (@vyruss)
  3. Rewrite migration HOWTO to better reflect new workflow (@vyruss)
briri commented 7 years ago

Added migrate:data_integrity check which deletes any orphaned records in the following tables: answers_options, dmptemplates_guidance_groups, guidance_in_group, plan_sections, project_groups, project_guidance, questions_themes, themes_in_guidance, users_roles

It also removes invalid email addresses and replaces them with a unique replacement email. It logs those replacements in log/migration.log and lets the user know that it has written out to that log file.

W, [2017-03-10T12:24:53.132664 #24293]  WARN -- : Replacing invalid email address for name: rniemi@jyu.i, id: 3151, email: rniemi@jyu.i with a3edae15-91e5-4a7d-9ad1-98762e62d28b@replacement-email.org

Also added a default config.secret_key to config/initializers/devise.rb in step1 and step 2 due to an error I was receiving when running the rake tasks.

These are the steps needed to run the full migration:

@vyruss please run the updated migration against your instances (using postgres) to verify that they work. I needed to make 2 different queries to select invalid email addresses due to the way mysql and postgres handle regular expressions.

@xrust, if all goes well with Jimmy's tests, please go ahead and repoint the migration tags at the latest commits in these branches: https://github.com/DMPRoadmap/roadmap/tree/dmponline4_upgrade_step1 https://github.com/DMPRoadmap/roadmap/tree/dmponline4_upgrade_step2 https://github.com/DMPRoadmap/roadmap/tree/dmponline4_upgrade_step3

briri commented 7 years ago

ok @xsrust as agreed, I did not create new tags. I updated the instructions and added some notes that I thought were important for pre/post upgrade. https://github.com/DMPRoadmap/roadmap/wiki/Upgrading-from-DMPonline_v4

xsrust commented 7 years ago

I followed the instructions on the wiki and it worked exactly as expected with no errors. I think branches is the best approach as we can create modifications to the procedure if anything else turns up that we need to change.

briri commented 7 years ago

closing since there is nothing to PR. The branches are already on DMPRoadmap/roadmap development branch