scylladb / scylla-manager

The Scylla Manager
https://manager.docs.scylladb.com/stable/

Manager 3.3.0-dev fails to re-execute the backup task done by Manager 3.2.8, Scylla 2022.1 #3903

Closed mikliapko closed 6 days ago

mikliapko commented 1 week ago

Preconditions:

Steps:

  1. Upgrade the Manager to 3.3.0-dev;
  2. Re-execute the backup task from preconditions: PUT /api/v1/cluster/3b4b3574-d1a4-4fcd-8082-789108a816b8/task/backup/3603fd9f-8bb7-4056-83ef-fb1c7beabb10/start?continue=false
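
For anyone reproducing step 2 from a script, here is a minimal Go sketch that builds the same PUT request. The path and the `continue` query parameter are taken from the step above. The base URL (`http://127.0.0.1:5080`) and the helper name `newStartTaskRequest` are assumptions for illustration, not part of the Manager codebase.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
	"strconv"
)

// newStartTaskRequest builds the PUT request that (re)starts a Manager task.
// Hypothetical helper: only the URL layout comes from the issue text.
func newStartTaskRequest(baseURL, clusterID, taskType, taskID string, cont bool) (*http.Request, error) {
	u, err := url.Parse(baseURL)
	if err != nil {
		return nil, fmt.Errorf("parse base URL %q: %w", baseURL, err)
	}
	// /api/v1/cluster/<cluster-id>/task/<task-type>/<task-id>/start
	u.Path = fmt.Sprintf("/api/v1/cluster/%s/task/%s/%s/start", clusterID, taskType, taskID)

	// continue=false asks for a fresh run instead of resuming the previous one.
	q := url.Values{}
	q.Set("continue", strconv.FormatBool(cont))
	u.RawQuery = q.Encode()

	return http.NewRequest(http.MethodPut, u.String(), nil)
}
```

The returned request can be sent with `http.DefaultClient.Do(req)`; adjust the base URL to wherever the Manager API is reachable in your setup.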

Actual result: The task fails (ERROR) with error "status":"ERROR","cause":"await schema: dump schema: describe schema with internals: line 1:0 no viable alternative at input 'DESCRIBE'"

Jun 25 09:28:17 manager-upgrade-manager--monitor-node-338ecc62-1 scylla-manager[22613]: {"L":"ERROR","T":"2024-06-25T09:28:17.493Z","N":"backup.await_schema","M":"Awaiting schema agreement failed see exact errors above","duration":"42.660534ms","_trace_id":"H0bYorOfTBi0vPgmYujEJA","S":"github.com/scylladb/go-log.Logger.log\n\tgithub.com/scylladb/go-log@v0.0.7/logger.go:101\ngithub.com/scylladb/go-log.Logger.Error\n\tgithub.com/scylladb/go-log@v0.0.7/logger.go:84\ngithub.com/scylladb/scylla-manager/v3/pkg/service/backup.(*Service).Backup.func11.1\n\tgithub.com/scylladb/scylla-manager/v3/pkg/service/backup/service.go:899\ngithub.com/scylladb/scylla-manager/v3/pkg/service/backup.(*Service).Backup.func11\n\tgithub.com/scylladb/scylla-manager/v3/pkg/service/backup/service.go:907\ngithub.com/scylladb/scylla-manager/v3/pkg/service/backup.(*Service).Backup\n\tgithub.com/scylladb/scylla-manager/v3/pkg/service/backup/service.go:911\ngithub.com/scylladb/scylla-manager/v3/pkg/service/backup.Runner.Run\n\tgithub.com/scylladb/scylla-manager/v3/pkg/service/backup/runner.go:26\ngithub.com/scylladb/scylla-manager/v3/pkg/service/scheduler.PolicyRunner.Run\n\tgithub.com/scylladb/scylla-manager/v3/pkg/service/scheduler/policy.go:32\ngithub.com/scylladb/scylla-manager/v3/pkg/service/scheduler.(*Service).run\n\tgithub.com/scylladb/scylla-manager/v3/pkg/service/scheduler/service.go:451\ngithub.com/scylladb/scylla-manager/v3/pkg/scheduler.(*Scheduler[...]).asyncRun.func1\n\tgithub.com/scylladb/scylla-manager/v3/pkg/scheduler/scheduler.go:404"}
Jun 25 09:28:17 manager-upgrade-manager--monitor-node-338ecc62-1 scylla-manager[22613]: {"L":"INFO","T":"2024-06-25T09:28:17.502Z","N":"scheduler.3b4b3574","M":"Run ended with ERROR","task":"backup/3603fd9f-8bb7-4056-83ef-fb1c7beabb10","status":"ERROR","cause":"await schema: dump schema: describe schema with internals: line 1:0 no viable alternative at input 'DESCRIBE'","duration":"578.52041ms","_trace_id":"H0bYorOfTBi0vPgmYujEJA"}
Jun 25 09:28:17 manager-upgrade-manager--monitor-node-338ecc62-1 scylla-manager[22613]: {"L":"ERROR","T":"2024-06-25T09:28:17.502Z","N":"scheduler.3b4b3574","M":"OnRunError","key":"3603fd9f-8bb7-4056-83ef-fb1c7beabb10","retry":0,"error":"await schema: dump schema: describe schema with internals: line 1:0 no viable alternative at input 'DESCRIBE'","errorStack":"github.com/scylladb/scylla-manager/v3/pkg/service/backup.(*Service).Backup.func11\n\tgithub.com/scylladb/scylla-manager/v3/pkg/service/backup/service.go:907\ngithub.com/scylladb/scylla-manager/v3/pkg/service/backup.(*Service).Backup\n\tgithub.com/scylladb/scylla-manager/v3/pkg/service/backup/service.go:911\ngithub.com/scylladb/scylla-manager/v3/pkg/service/backup.Runner.Run\n\tgithub.com/scylladb/scylla-manager/v3/pkg/service/backup/runner.go:26\ngithub.com/scylladb/scylla-manager/v3/pkg/service/scheduler.PolicyRunner.Run\n\tgithub.com/scylladb/scylla-manager/v3/pkg/service/scheduler/policy.go:32\ngithub.com/scylladb/scylla-manager/v3/pkg/service/scheduler.(*Service).run\n\tgithub.com/scylladb/scylla-manager/v3/pkg/service/scheduler/service.go:451\ngithub.com/scylladb/scylla-manager/v3/pkg/scheduler.(*Scheduler[...]).asyncRun.func1\n\tgithub.com/scylladb/scylla-manager/v3/pkg/scheduler/scheduler.go:404\nruntime.goexit\n\truntime/asm_amd64.s:1695\n","S":"github.com/scylladb/go-log.Logger.log\n\tgithub.com/scylladb/go-log@v0.0.7/logger.go:101\ngithub.com/scylladb/go-log.Logger.Error\n\tgithub.com/scylladb/go-log@v0.0.7/logger.go:84\ngithub.com/scylladb/scylla-manager/v3/pkg/scheduler.errorLogListener[...].OnRunError\n\tgithub.com/scylladb/scylla-manager/v3/pkg/scheduler/listener.go:86\ngithub.com/scylladb/scylla-manager/v3/pkg/scheduler.(*Scheduler[...]).onRunEnd\n\tgithub.com/scylladb/scylla-manager/v3/pkg/scheduler/scheduler.go:420\ngithub.com/scylladb/scylla-manager/v3/pkg/scheduler.(*Scheduler[...]).asyncRun.func1\n\tgithub.com/scylladb/scylla-manager/v3/pkg/scheduler/scheduler.go:405"}
Jun 25 09:28:17 manager-upgrade-manager--monitor-node-338ecc62-1 scylla-manager[22613]: {"L":"INFO","T":"2024-06-25T09:28:17.502Z","N":"scheduler.3b4b3574","M":"Retry backoff","task":"backup/3603fd9f-8bb7-4056-83ef-fb1c7beabb10","backoff":"10m0s","retry":1}

Expected result: The task final status is DONE.

Environment:

Additional info:

karol-kokoszka commented 1 week ago

~We have the pre-check when SM dumps the schema that checks if current scylla server supports DESCRIBE SCHEMA WITH INTERNALS... For some reason it reported that it does.~

~https://github.com/scylladb/scylla-manager/blob/24999e1c46890fc3d3030e76cf0fae47579d51fd/pkg/scyllaclient/client_agent.go#L185-L204~

~Trying to reproduce with integration tests from the repo.~

karol-kokoszka commented 1 week ago

Reproduced with an integration test against 2022.1:

15:47:11.258    ERROR   backup.await_schema     Awaiting schema agreement failed see exact errors above {"duration": "30.203887ms"}
github.com/scylladb/go-log.Logger.log
        /home/karkok/dev/scylla-manager/vendor/github.com/scylladb/go-log/logger.go:101
github.com/scylladb/go-log.Logger.Error
        /home/karkok/dev/scylla-manager/vendor/github.com/scylladb/go-log/logger.go:84
github.com/scylladb/scylla-manager/v3/pkg/service/backup.(*Service).Backup.func11.1
        /home/karkok/dev/scylla-manager/pkg/service/backup/service.go:899
github.com/scylladb/scylla-manager/v3/pkg/service/backup.(*Service).Backup.func11
        /home/karkok/dev/scylla-manager/pkg/service/backup/service.go:907
github.com/scylladb/scylla-manager/v3/pkg/service/backup.(*Service).Backup
        /home/karkok/dev/scylla-manager/pkg/service/backup/service.go:911
github.com/scylladb/scylla-manager/v3/pkg/service/backup_test.TestBackupResumeIntegration.func8.2
        /home/karkok/dev/scylla-manager/pkg/service/backup/service_backup_integration_test.go:1236
    service_backup_integration_test.go:1241: Expected context error but got: line 1:0 no viable alternative at input 'DESCRIBE'
        describe schema with internals

karol-kokoszka commented 1 week ago

~version returned by given Scylla is 2022.1.0-0.20220727.55d7ad683~ ~It may be a problem with the check we have in SM.~

karol-kokoszka commented 1 week ago

We actually always call DESCRIBE SCHEMA WITH INTERNALS, so this failure is expected.

@tzach Scylla Manager 3.3 is not going to support Scylla-Enterprise 2022.x (as per release notes).

I understand that's fine?

There would be no way to restore the schema if the backup is done the old way (without DESCRIBE SCHEMA WITH INTERNALS) on a Scylla server with raft enabled. On the other hand, if someone uses Scylla Manager 3.3 on a Scylla version that doesn't support DESCRIBE SCHEMA WITH INTERNALS, then all backups will fail.

Should we bring back support for 2022.x?

karol-kokoszka commented 6 days ago

We are going to address it.

Whenever the Scylla version is < 6.0 or < 2024.2 (versions that do not correctly support DESCRIBE SCHEMA WITH INTERNALS), we are going to fall back to the previous approach: back up the schema as CQL generated through the driver (instead of executing the CQL statement directly).

This situation must be written to the Manager logs as a WARNING stating that it will not be possible to restore the schema from this backup in Scylla >= 6.0 or >= 2024.2.
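
A rough Go sketch of the version gate and warning described above, not the actual Manager implementation. All names (`supportsDescribeWithInternals`, `planSchemaDump`, `schemaDumpPlan`) are hypothetical. It also assumes that a year-like major version (>= 2000) is compared against the 2024.2 threshold and any other major version against 6.0.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// leadingInt parses the run of ASCII digits at the start of s, so that
// components such as "0~rc1" or "0-0" still yield a number.
func leadingInt(s string) (int, error) {
	end := strings.IndexFunc(s, func(r rune) bool { return r < '0' || r > '9' })
	if end == -1 {
		end = len(s)
	}
	return strconv.Atoi(s[:end])
}

// supportsDescribeWithInternals applies the thresholds from this thread.
// Assumption: majors >= 2000 are year-based versions compared against 2024.2,
// everything else is compared against 6.0.
func supportsDescribeWithInternals(version string) (bool, error) {
	parts := strings.SplitN(version, ".", 3)
	if len(parts) < 2 {
		return false, fmt.Errorf("unexpected Scylla version %q", version)
	}
	major, err := leadingInt(parts[0])
	if err != nil {
		return false, fmt.Errorf("parse major version of %q: %w", version, err)
	}
	minor, err := leadingInt(parts[1])
	if err != nil {
		return false, fmt.Errorf("parse minor version of %q: %w", version, err)
	}
	if major >= 2000 {
		return major > 2024 || (major == 2024 && minor >= 2), nil
	}
	return major >= 6, nil
}

// schemaDumpPlan describes how the schema would be captured for one backup run.
type schemaDumpPlan struct {
	UseDescribe bool   // true: run DESCRIBE SCHEMA WITH INTERNALS
	Warning     string // non-empty when falling back to the driver-based dump
}

// planSchemaDump picks the dump method and prepares the warning to log.
func planSchemaDump(version string) (schemaDumpPlan, error) {
	ok, err := supportsDescribeWithInternals(version)
	if err != nil {
		return schemaDumpPlan{}, err
	}
	if ok {
		return schemaDumpPlan{UseDescribe: true}, nil
	}
	return schemaDumpPlan{
		Warning: fmt.Sprintf("Scylla %s does not support DESCRIBE SCHEMA WITH INTERNALS; "+
			"using driver-based schema dump, which cannot be restored in Scylla >= 6.0 or >= 2024.2", version),
	}, nil
}
```

With the version reported in this issue, `planSchemaDump("2022.1.0-0.20220727.55d7ad683")` returns a plan with `UseDescribe` set to false and a non-empty `Warning`, while `"6.0.1"` or `"2024.2.0"` select the DESCRIBE path.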

It's going to be included in the upcoming Manager 3.3 release @tzach @mykaul @gmizrahi