golang-migrate / migrate

Database migrations. CLI and Golang library.
Other
15.49k stars 1.41k forks source link

Stale Lock ID causng tool to be inoperable #712

Open rhodee opened 2 years ago

rhodee commented 2 years ago

Describe the Bug

While performing a routine migration on a CockroachDB Cloud cluster I began to encounter an error that appears to be related to the schema_lock table. I ran a few bad up and down migrations. Once I finally figured out the issue I was unable to migrate the db.

Steps to Reproduce Steps to reproduce the behavior:

Migrations

up

BEGIN;

SAVEPOINT cockroach_restart;

DROP INDEX IF EXISTS my_table@my_table_idx CASCADE;

RELEASE SAVEPOINT cockroach_restart;

COMMIT;
  1. I ran migrate with the following options '....'
migrate --path '...' --database "cockroachdb://user:pass@free-tier.gcp-us-central1.cockroachlabs.cloud:26257/dbname?sslmode=verify-full&sslrootcert=root.crt"
  1. See error

error: can't acquire lock

Expected Behavior A clear and concise description of what you expected to happen.

The schema_migrations table to be dirty and the schema_lock to have no entries, allow the user to run migrate force V and then migrate up|down.

Migrate Version e.g. v3.4.0 Obtained by running: migrate -version

dev

Built using

go install -tags 'cockroachdb' github.com/golang-migrate/migrate/v4/cmd/migrate@v4.15.1

Loaded Source Drivers e.g. s3, github, go-bindata, gcs, file Obtained by running: migrate -help

Source drivers: file

Loaded Database Drivers e.g. spanner, stub, clickhouse, cockroachdb, crdb-postgres, postgres, postgresql, pgx, redshift, cassandra, cockroach, mysql Obtained by running: migrate -help

Database drivers: cockroachdb, crdb-postgres, stub, cockroach

Go Version e.g. go version go1.11 linux/amd64 Obtained by running: go version

go version go1.17.5 darwin/amd64

Stacktrace Please provide if available

Additional context Add any other context about the problem here.

Can the lock_id row be deleted and then the migrate tool be re-run? From this https://github.com/golang-migrate/migrate/blob/v4.14.1/database/cockroachdb/cockroachdb.go#L173 it appears it would not be too much of an issue, but I am unsure how wise this is. Any guidance on what to do would be great.

SELECT *
FROM schema_lock;
lock_id
647836267
SELECT *
FROM schema_migrations;
version dirty
20220311012135 true
dhui commented 2 years ago

The lock is used to prevent multiple instance of migrate from running migrations simultaneously. If you're sure no other instances of migrate are running or will run, then it is safe to manually remove the lock. Fixing botched migrations are never fun... Good luck!!! 🤞

chanced commented 2 years ago

I ran into this as well. The lock doesn't seem to expire.

Should Run delete the lock regardless of pass/fail or does is that part of the dirty tracking / force logic?

dadebue commented 1 year ago

I just ran into the same issue as well. Deleting the schema_lock row "fixes" the issue though...

dtoubelis commented 1 month ago

Have anyone figured out the root cause? We are experiencing the same issue intermittently and we cannot rely on our automated pipelines to deploy... this even happens when no migrations are expected to be the process fails catastrophically and require a manual intervention.