vesoft-inc / nebula-br

Backup and restore utility for Nebula Graph
Apache License 2.0

BR failed to restore backup and reported an incorrect topology error #52

Closed kikimo closed 1 year ago

kikimo commented 1 year ago


Describe the bug (required)

We have a cluster with 7 storage nodes and a space with one replica and one partition. After backing up the cluster, we tried to restore the backup but got an unexpected error:

...
 bytes=951.","time":"2022-11-29T11:49:25.309Z"}
{"file":"github.com/vesoft-inc/nebula-agent@v0.1.1/pkg/client/client.go:47","func":"github.com/vesoft-inc/nebula-agent/pkg/client.New","level":"debug","msg":"Dialing to address store1:8888.","time":"2022-11-29T11:49:25.309Z"}
{"error":"get service status in host store1 failed: agent, get service status failed: rpc error: code = Unknown desc = get STORAGE status by daemon failed: exit status 127","file":"github.com/vesoft-inc/nebula-br/pkg/restore/fix.go:181","func":"github.com/vesoft-inc/nebula-br/pkg/restore.retry","level":"info","msg":"Get dead services failed, try times=1.","time":"2022-11-29T11:49:25.311Z"}
{"file":"github.com/vesoft-inc/nebula-agent@v0.1.1/pkg/client/client.go:47","func":"github.com/vesoft-inc/nebula-agent/pkg/client.New","level":"debug","msg":"Dialing to address store7:8888.","time":"2022-11-29T11:49:26.312Z"}
{"error":"get service status in host store7 failed: agent, get service status failed: rpc error: code = Unknown desc = get STORAGE status by daemon failed: exit status 127","file":"github.com/vesoft-inc/nebula-br/pkg/restore/fix.go:181","func":"github.com/vesoft-inc/nebula-br/pkg/restore.retry","level":"info","msg":"Get dead services failed, try times=2.","time":"2022-11-29T11:49:26.314Z"}
{"file":"github.com/vesoft-inc/nebula-agent@v0.1.1/pkg/client/client.go:47","func":"github.com/vesoft-inc/nebula-agent/pkg/client.New","level":"debug","msg":"Dialing to address graph1:8888.","time":"2022-11-29T11:49:28.315Z"}
{"error":"get service status in host graph1 failed: agent, get service status failed: rpc error: code = Unknown desc = get GRAPH status by daemon failed: exit status 127","file":"github.com/vesoft-inc/nebula-br/pkg/restore/fix.go:181","func":"github.com/vesoft-inc/nebula-br/pkg/restore.retry","level":"info","msg":"Get dead services failed, try times=3.","time":"2022-11-29T11:49:28.318Z"}
Fix failed when restore failed get service status in host graph1 failed: agent, get service status failed: rpc error: code = Unknown desc = get GRAPH status by daemon failed: exit status 127
Error: physical topology not consistent: the physical topology of storage count must be consistent, cluster: 7, backup: 1
Usage:
  br restore full [flags]

Flags:
  -h, --help   help for full

Global Flags:
      --concurrency int        Max concurrency for download data (default 5)
      --debug                  Output log in debug level or not
      --log string             Specify br detail log path (default "br.log")
      --meta string            Specify meta server
      --name string            Specify backup name
      --s3.access_key string   S3 Option: set access key id
...

The reported error, "Error: physical topology not consistent: the physical topology of storage count must be consistent, cluster: 7, backup: 1", is obviously incorrect, since we never changed the cluster topology.
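For anyone triaging a similar log, here is a minimal Python sketch that pulls out the hosts whose status probe failed and the two counts from the topology error. It is only a reading aid, not part of nebula-br: the `summarize` function, the regular expressions, and the `LOG` sample are my own, and the sample lines are copied from the log above but trimmed to the `error` and `msg` fields. Note that exit status 127 is what a POSIX shell conventionally returns when a command is not found, so the status probes may have failed for a reason unrelated to topology; that is a guess and is not confirmed in this thread.

```python
import json
import re

# Sample copied from the log above, trimmed to the "error" and "msg" fields.
LOG = r'''
{"error":"get service status in host store1 failed: agent, get service status failed: rpc error: code = Unknown desc = get STORAGE status by daemon failed: exit status 127","msg":"Get dead services failed, try times=1."}
{"error":"get service status in host store7 failed: agent, get service status failed: rpc error: code = Unknown desc = get STORAGE status by daemon failed: exit status 127","msg":"Get dead services failed, try times=2."}
{"error":"get service status in host graph1 failed: agent, get service status failed: rpc error: code = Unknown desc = get GRAPH status by daemon failed: exit status 127","msg":"Get dead services failed, try times=3."}
Error: physical topology not consistent: the physical topology of storage count must be consistent, cluster: 7, backup: 1
'''

# Hypothetical patterns, written against the message text seen in this issue only.
HOST_RE = re.compile(r"get service status in host (\S+) failed")
TOPOLOGY_RE = re.compile(r"cluster: (\d+), backup: (\d+)")


def summarize(log_text):
    """Return (hosts whose status probe failed, (cluster_count, backup_count) or None)."""
    failed_hosts = []
    topology = None
    for line in log_text.splitlines():
        line = line.strip()
        if line.startswith("{"):
            # Structured log record: look at its "error" field, if any.
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                continue  # truncated or wrapped line; skip it
            match = HOST_RE.search(record.get("error", ""))
            if match:
                failed_hosts.append(match.group(1))
        else:
            # Plain-text line: check for the final topology error.
            match = TOPOLOGY_RE.search(line)
            if match:
                topology = (int(match.group(1)), int(match.group(2)))
    return failed_hosts, topology


if __name__ == "__main__":
    print(summarize(LOG))
```

On the sample this reports store1, store7, and graph1 as the hosts whose probes failed, and 7 versus 1 as the storage counts for the cluster and the backup.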

nebula-agent version:

# /root/nebula-chaos-cluster/bin/nebula-agent version
{"file":"./agent.go:31","func":"main.main","level":"info","msg":"Start agent server...","time":"2022-11-29T11:52:10.679Z","version":"96646b8"}
{"error":"listen tcp: address auto: missing port in address","file":"./agent.go:44","func":"main.main","level":"fatal","msg":"Failed to listen: auto.","time":"2022-11-29T11:52:10.679Z"}

nebula-br version:

# ../../bin/nebula-br version
Nebula Backup And Restore Utility Tool,V-3.3.0
   GitSha: 7ea9282-dirty
   GitRef: master

kqzh commented 1 year ago

Solved by using br-ent.