Ideas - Githubissues

ankane commented 5 years ago

I'm still on the fence about many of them. If anyone would like to discuss any of the ideas below, please create a new issue.

Likely ideas

[ ] Show friendly message for lock timeout - lock_timeout_message branch
[ ] Add support for multiple databases with target_version and other options - {primary: 10, other: 12}
[ ] Change default lock timeout to 5 seconds in generator

Other ideas

[ ] Option to retry lock timeouts - lock_timeout_retries branch
[ ] Condense add_foreign_key and change_column_null suggestions into a single migration (downside: failed migration can result in db state that needs manually reconciled) - add_foreign_key may be safe to do since it can detect locks

Decided against

Helper methods - better as a separate gem - see #111
Don't require safety_assured for remove_column if ignored_columns detected (ignored_columns will likely be removed after it runs in production, so this would cause errors for other devs)
Enforce timestamptz (opt-in) - apps can change globally in initializer if needed
Deprecate safety_assured for remove_index!, execute!, etc (no good reason to change for now)
rails g commands to generate safe migrations (like gindex gem)
Enforce reversible migrations (opt-in)
Enforce reversible operations (opt-in) - remove_column with type, change_column_default with to/from
Include null constraints in schema.rb - dump_null_constraints branch (or may be better to use execute command) - this shouldn't be needed in Postgres 12+, as well as Rails 6.1+

mvz commented 4 years ago

I'd be very interested in some helper method for renaming tables.

jjb commented 4 years ago

Gitlab has these migration helpers for their own internal use. They automagically set up triggers and such in order to do things like seamless column renaming.

https://gitlab.com/gitlab-org/gitlab/issues/34292

bheemreddy181-zz commented 4 years ago

Can we add a retry mechanism if migrations fail due to a timeout?

ankane commented 4 years ago

Hey @bheemreddy181, it's an interesting idea.

For statements run outside a transaction, retrying the statement should work:

class AddSomeIndexToUsers < ActiveRecord::Migration[6.0]
  disable_ddl_transaction!

  def change
    add_index :users, :some_column, algorithm: :concurrently
  end
end

For statements run inside a transaction, the entire transaction needs to be re-run. For migrations that only use Active Record's DDL transaction and have no side effects, retrying the entire migration should work:

class SimpleMigration < ActiveRecord::Migration[6.0]
  def change
    add_column :users, :some_column, :string
  end
end

However, retrying more complex migrations can cause issues or errors (which may be an acceptable trade-off).

class ComplexMigration < ActiveRecord::Migration[6.0]
  def change
    add_column :users, :some_column, :string
    commit_db_transaction # we can avoid retry if this method is called
    begin_db_transaction
    add_column :users, :other_column, :string
  end
end

Or

class ComplexMigration2 < ActiveRecord::Migration[6.0]
  disable_ddl_transaction!

  def change
    add_column :users, :some_column, :string

    connection.transaction do
      add_column :users, :other_column, :string
    end
  end
end

I've added the outside a transaction case for Postgres on the lock_timeout_retries branch if you want to try it. Enable retries with:

StrongMigrations.lock_timeout_retries = 2

bheemreddy181-zz commented 4 years ago

@ankane I will give a try and will let you know thanks for this feature, appreciate your work. Your contribution matters a lot.

bheemreddy181-zz commented 4 years ago

@ankane I started testing the retry feature started seeing the below error, am I missing something here

bheemreddy181-zz commented 4 years ago

@ankane Can you take a quick look whenever you get some time on the above issue, let me know if I am missing anything here?

bheemreddy181-zz commented 4 years ago

@ankane Can you take a quick look whenever you get some time on the above issue, let me know if I am missing anything here?

Can I get some help on this testing @ankane

timkoopmans commented 3 years ago

I also get this undef error on strong_migrations_checker ... it was working and now it's not, feels intermittetnt (load order?)

StandardError: An error has occurred, this and all later migrations canceled:

undefined method `strong_migrations_checker' for #<ActiveRecord::MigrationProxy:0x00007fa24ed45038>
/Users/timothykoopmans/.asdf/installs/ruby/2.6.6/lib/ruby/gems/2.6.0/bundler/gems/strong_migrations-716b64e73ceb/lib/strong_migrations/migrator.rb:6:in `ddl_transaction'
/Users/timothykoopmans/.asdf/installs/ruby/2.6.6/lib/ruby/gems/2.6.0/gems/activerecord-5.2.4.4/lib/active_record/migration.rb:1291:in `execute_migration_in_transaction'
/Users/timothykoopmans/.asdf/installs/ruby/2.6.6/lib/ruby/gems/2.6.0/gems/activerecord-5.2.4.4/lib/active_record/migration.rb:1263:in `block in migrate_without_lock'

ankane commented 3 years ago

@bheemreddy181 @timkoopmans Pushed a fix for that error, but running into some weird issues when retrying entire migrations (DB timeouts aren't being applied and the Ruby process hangs), so I wouldn't use the branch in production right now.

bheemreddy181-zz commented 3 years ago

@ankane i have seen the same thing while testing this feature , ruby process hangs until i remove my exclusive lock which i am holding from other terminal

Before removing lock Screen Shot 2020-12-17 at 5 01 09 PM

After removing lock Screen Shot 2020-12-17 at 5 02 24 PM

bheemreddy181-zz commented 3 years ago

@ankane any thoughts on this retry feature?

peret commented 3 years ago

Hi @ankane,

we are running into lock timeouts from time to time and would really appreciate the retry feature. I could switch to the lock_timeout_retries branch, but if I understand correctly, this still shouldn't be used in production? Is there anything I can help you with to solve this issue? I can spend some time to create a PR, if I'm able to debug it. Are there any more pointers/information that you came across, that could help? Also, should I open a separate issue for this?

peret commented 3 years ago

I found the issue :)

StrongMigrations::Checker.set_timeouts doesn't do anything if @timeouts_set is already true. That's why no DB timeouts are set on the second try and the migration waits forever to acquire the lock. I can prepare a PR if you want, but basically, setting @timeouts_set = false again in StrongMigrations::Checker.with_lock_timeout_retries before actually retrying the block fixed it for me.

This also explains why it's not an issue if no DDL transaction is used, I think, since in that case the DB timeouts won't be rolled back by the transaction?

bheemreddy181-zz commented 3 years ago

That makes complete sense - when no DDL transaction is used we don't need a timeout as concurrent index additions will let other transactions to flow through ( will let reads and writes on the table ) . Thanks for looking into it.

tute commented 3 years ago

We are also hitting lock timeout errors, and are considering alternatives for retries. Thanks for your work, Andrew!

tute commented 3 years ago

@ankane, why does your implementation only retry when !in_transaction?? I tried rebasing your branch but get "expected 2 tries and got 1", and it stays out due to that condition.


TimeoutsTest#test_lock_timeout_retries [/Users/tute/Sites/opensource/strong_migrations/test/timeouts_test.rb:125]:
Expected: 2
  Actual: 1

90 runs, 154 assertions, 1 failures, 0 errors, 0 skips```

Thank you.

ankane commented 2 years ago

Just added experimental support for lock timeout retries 🎉

@peret That was the issue - thanks!

@tute I don't fully follow the question, but check out the link above and latest test cases for a better idea of how it works.

Also, note that lock_timeout_delay has been renamed to lock_timeout_retry_delay from the branch.

If there's any feedback, feel free to create a new issue. Thanks all!

bheemreddy181-zz commented 2 years ago

Thanks a ton @ankane for considering this feature.

benoittgt commented 2 years ago

For

Don't require safety_assured for remove_column if ignored_columns detected (ignored_columns will likely be removed after it runs in production, so this would cause errors for other devs)

I think with safety_assured we should block migration if the column removed isn't in ignored_columns. Even if it will still break (adding ignored column in the same deployment), if the deployment was not done before with the ignored_columns, maybe this warning/raise could help do to not forget to do that.

Something like this:

We detect that :activate_at column is not in User's ignored_columns. You should deploy a version with the ignored columns before, and also keep having the ignored columns when running this migration.

At least if it's not blocking the migration. We should have a big warning. I am interested to work on that PR if you are ok @ankane

ankane commented 2 years ago

Hey @benoittgt I think the current approach provides a good balance of safety and simplicity, but feel free to share if you decide to fork and change it for your applications.

For future readers: please create a new issue to discuss any ideas.

ankane / strong_migrations

Ideas #95