zilliztech / milvus-backup

Backup and restore tool for Milvus
Apache License 2.0

[Bug]: milvus-backup backup fails #216

Open suchenghui opened 9 months ago

suchenghui commented 9 months ago

Current Behavior

./milvus-backup --version
milvus-backup version 0.4.0 (Built on 2023-09-22T10:22:24Z from Git SHA 1eb59969e3c4038db678e942b723515fb88c6271)
Milvus version: v2.3.0

Expected Behavior

No response

Steps To Reproduce

./milvus-backup create -n backup_kg_prod -d kg_prod
config:backup.yaml
[2023/10/08 11:42:06.180 +08:00] [INFO] [logutil/logutil.go:165] ["Log directory"] [configDir=]
[2023/10/08 11:42:06.180 +08:00] [INFO] [logutil/logutil.go:166] ["Set log file to "] [path=logs/backup.log]
[2023/10/08 11:42:06.180 +08:00] [INFO] [core/backup_impl_create_backup.go:25] ["receive CreateBackupRequest"] [requestId=a6b58ad5-658c-11ee-b0b6-5254000770cb] [backupName=backup_kg_prod] [collections="[]"] [databaseCollections="{\"kg_prod\":[]}"] [async=false]
[2023/10/08 11:42:06.241 +08:00] [INFO] [storage/minio_chunk_manager.go:124] ["minio chunk manager init success."] [bucketname=cos-prod-tc-ops-milvus-1302248489] [root=milvus_data]
[2023/10/08 11:42:06.306 +08:00] [ERROR] [core/backup_impl_create_backup.go:78] ["fail to get milvus version"] [error="rpc error: code = Unimplemented desc = "] [stack="github.com/zilliztech/milvus-backup/core.(*BackupContext).CreateBackup\n\t/home/runner/work/milvus-backup/milvus-backup/core/backup_impl_create_backup.go:78\ngithub.com/zilliztech/milvus-backup/cmd.glob..func2\n\t/home/runner/work/milvus-backup/milvus-backup/cmd/create.go:57\ngithub.com/spf13/cobra.(*Command).execute\n\t/home/runner/go/pkg/mod/github.com/spf13/cobra@v1.5.0/command.go:876\ngithub.com/spf13/cobra.(*Command).ExecuteC\n\t/home/runner/go/pkg/mod/github.com/spf13/cobra@v1.5.0/command.go:990\ngithub.com/spf13/cobra.(*Command).Execute\n\t/home/runner/go/pkg/mod/github.com/spf13/cobra@v1.5.0/command.go:918\ngithub.com/zilliztech/milvus-backup/cmd.Execute\n\t/home/runner/work/milvus-backup/milvus-backup/cmd/root.go:26\nmain.main\n\t/home/runner/work/milvus-backup/milvus-backup/main.go:24\nruntime.main\n\t/opt/hostedtoolcache/go/1.18.10/x64/src/runtime/proc.go:250"]
Fail 
 rpc error: code = Unimplemented desc =

Environment

centos7.9
./milvus-backup  --version
milvus-backup version 0.4.0 (Built on 2023-09-22T10:22:24Z from Git SHA 1eb59969e3c4038db678e942b723515fb88c6271)
Milvus version: v2.3.0

Anything else?

No response

wayblink commented 9 months ago

@suchenghui Hi, I can't reproduce this issue. It seems your Milvus instance does not implement the GetVersion API properly. Could you retry and share the Milvus log?
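To check that the failure really comes from the version call, the gRPC status code can be pulled out of the milvus-backup log line. The following is a minimal Python sketch, not part of milvus-backup; the helper name `grpc_status_from_log` and the regular expression are illustrative assumptions based only on the log format shown in this issue.

```python
import re

# Matches the gRPC error text as it appears in the milvus-backup log in this issue,
# e.g. 'rpc error: code = Unimplemented desc = '. The pattern is an assumption
# derived from that single sample, not an official log format.
_RPC_ERROR = re.compile(r'rpc error: code = (?P<code>\w+) desc = (?P<desc>[^"\]]*)')


def grpc_status_from_log(line):
    """Return (code, desc) from a log line, or None if no gRPC error is present."""
    match = _RPC_ERROR.search(line)
    if match is None:
        return None
    return match.group("code"), match.group("desc").strip()


sample = (
    '[2023/10/08 11:42:06.306 +08:00] [ERROR] '
    '[core/backup_impl_create_backup.go:78] ["fail to get milvus version"] '
    '[error="rpc error: code = Unimplemented desc = "]'
)

print(grpc_status_from_log(sample))
```

Unimplemented is the gRPC status returned when the server does not implement or support the requested operation, which is why the question here is what is actually answering on the configured Milvus address.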

suchenghui commented 9 months ago

[root@tc-bj-yw-dba-manager logs]# cat backup.log
[2023/10/08 16:24:00.711 +08:00] [INFO] [logutil/logutil.go:165] ["Log directory"] [configDir=]
[2023/10/08 16:24:00.713 +08:00] [INFO] [logutil/logutil.go:166] ["Set log file to "] [path=logs/backup.log]
[2023/10/08 16:24:00.714 +08:00] [INFO] [core/backup_impl_create_backup.go:25] ["receive CreateBackupRequest"] [requestId=088e8ea2-65b4-11ee-9e96-5254000770cb] [backupName=backup_kg_prod_01] [collections="[]"] [databaseCollections="{\"kg_prod\":[]}"] [async=false]
[2023/10/08 16:24:00.940 +08:00] [INFO] [storage/minio_chunk_manager.go:124] ["minio chunk manager init success."] [bucketname=cos-prod-tc-ops-milvus-1302248489] [root=milvus_data]
[2023/10/08 16:24:00.988 +08:00] [ERROR] [core/backup_impl_create_backup.go:78] ["fail to get milvus version"] [error="rpc error: code = Unimplemented desc = "] [stack="github.com/zilliztech/milvus-backup/core.(*BackupContext).CreateBackup\n\t/home/runner/work/milvus-backup/milvus-backup/core/backup_impl_create_backup.go:78\ngithub.com/zilliztech/milvus-backup/cmd.glob..func2\n\t/home/runner/work/milvus-backup/milvus-backup/cmd/create.go:57\ngithub.com/spf13/cobra.(*Command).execute\n\t/home/runner/go/pkg/mod/github.com/spf13/cobra@v1.5.0/command.go:876\ngithub.com/spf13/cobra.(*Command).ExecuteC\n\t/home/runner/go/pkg/mod/github.com/spf13/cobra@v1.5.0/command.go:990\ngithub.com/spf13/cobra.(*Command).Execute\n\t/home/runner/go/pkg/mod/github.com/spf13/cobra@v1.5.0/command.go:918\ngithub.com/zilliztech/milvus-backup/cmd.Execute\n\t/home/runner/work/milvus-backup/milvus-backup/cmd/root.go:26\nmain.main\n\t/home/runner/work/milvus-backup/milvus-backup/main.go:24\nruntime.main\n\t/opt/hostedtoolcache/go/1.18.10/x64/src/runtime/proc.go:250"]
[root@tc-bj-yw-dba-manager logs]#
@wayblink

wayblink commented 9 months ago


I need the Milvus log, not the milvus-backup log.

suchenghui commented 9 months ago

[godman@tc-bj-yw-dba-manager logs]$ tail -n 500 proxy-18.log |grep error
{"level":"WARN","time":"2023/10/09 02:28:35.253 +00:00","caller":"grpclog/grpclog.go:46","message":"[core][Channel #4 SubChannel #5] grpc: addrConn.createTransport failed to connect to {\n \"Addr\": \"localhost:2379\",\n \"ServerName\": \"localhost\",\n \"Attributes\": null,\n \"BalancerAttributes\": null,\n \"Type\": 0,\n \"Metadata\": null\n}. Err: connection error: desc = \"transport: Error while dialing: dial tcp [::1]:2379: connect: connection refused\""}
{"level":"WARN","time":"2023/10/09 02:30:35.866 +00:00","caller":"grpclog/grpclog.go:46","message":"[core][Channel #4 SubChannel #5] grpc: addrConn.createTransport failed to connect to {\n \"Addr\": \"localhost:2379\",\n \"ServerName\": \"localhost\",\n \"Attributes\": null,\n \"BalancerAttributes\": null,\n \"Type\": 0,\n \"Metadata\": null\n}. Err: connection error: desc = \"transport: Error while dialing: dial tcp [::1]:2379: connect: connection refused\""}
{"level":"WARN","time":"2023/10/09 02:32:43.986 +00:00","caller":"grpclog/grpclog.go:46","message":"[core][Channel #4 SubChannel #5] grpc: addrConn.createTransport failed to connect to {\n \"Addr\": \"localhost:2379\",\n \"ServerName\": \"localhost\",\n \"Attributes\": null,\n \"BalancerAttributes\": null,\n \"Type\": 0,\n \"Metadata\": null\n}. Err: connection error: desc = \"transport: Error while dialing: dial tcp [::1]:2379: connect: connection refused\""}
{"level":"WARN","time":"2023/10/09 02:34:56.161 +00:00","caller":"grpclog/grpclog.go:46","message":"[core][Channel #4 SubChannel #5] grpc: addrConn.createTransport failed to connect to {\n \"Addr\": \"localhost:2379\",\n \"ServerName\": \"localhost\",\n \"Attributes\": null,\n \"BalancerAttributes\": null,\n \"Type\": 0,\n \"Metadata\": null\n}. Err: connection error: desc = \"transport: Error while dialing: dial tcp [::1]:2379: connect: connection refused\""}
{"level":"WARN","time":"2023/10/09 02:36:42.635 +00:00","caller":"grpclog/grpclog.go:46","message":"[core][Channel #4 SubChannel #5] grpc: addrConn.createTransport failed to connect to {\n \"Addr\": \"localhost:2379\",\n \"ServerName\": \"localhost\",\n \"Attributes\": null,\n \"BalancerAttributes\": null,\n \"Type\": 0,\n \"Metadata\": null\n}. Err: connection error: desc = \"transport: Error while dialing: dial tcp [::1]:2379: connect: connection refused\""}
{"level":"WARN","time":"2023/10/09 02:38:42.888 +00:00","caller":"grpclog/grpclog.go:46","message":"[core][Channel #4 SubChannel #5] grpc: addrConn.createTransport failed to connect to {\n \"Addr\": \"localhost:2379\",\n \"ServerName\": \"localhost\",\n \"Attributes\": null,\n \"BalancerAttributes\": null,\n \"Type\": 0,\n \"Metadata\": null\n}. Err: connection error: desc = \"transport: Error while dialing: dial tcp [::1]:2379: connect: connection refused\""}
{"level":"WARN","time":"2023/10/09 02:40:38.555 +00:00","caller":"grpclog/grpclog.go:46","message":"[core][Channel #4 SubChannel #5] grpc: addrConn.createTransport failed to connect to {\n \"Addr\": \"localhost:2379\",\n \"ServerName\": \"localhost\",\n \"Attributes\": null,\n \"BalancerAttributes\": null,\n \"Type\": 0,\n \"Metadata\": null\n}. Err: connection error: desc = \"transport: Error while dialing: dial tcp [::1]:2379: connect: connection refused\""}
{"level":"WARN","time":"2023/10/09 02:41:33.128 +00:00","caller":"funcutil/policy.go:42","message":"GetExtension fail","error":"proto: not an extendable proto.Message"}
{"level":"INFO","time":"2023/10/09 02:41:33.128 +00:00","caller":"proxy/privilege_interceptor.go:72","message":"GetPrivilegeExtObj err","error":"proto: not an extendable proto.Message"}
{"level":"WARN","time":"2023/10/09 02:41:33.133 +00:00","caller":"funcutil/policy.go:42","message":"GetExtension fail","error":"proto: not an extendable proto.Message"}
{"level":"INFO","time":"2023/10/09 02:41:33.133 +00:00","caller":"proxy/privilege_interceptor.go:72","message":"GetPrivilegeExtObj err","error":"proto: not an extendable proto.Message"}
{"level":"WARN","time":"2023/10/09 02:41:33.260 +00:00","caller":"funcutil/policy.go:42","message":"GetExtension fail","error":"proto: not an extendable proto.Message"}
{"level":"INFO","time":"2023/10/09 02:41:33.260 +00:00","caller":"proxy/privilege_interceptor.go:72","message":"GetPrivilegeExtObj err","error":"proto: not an extendable proto.Message"}
{"level":"WARN","time":"2023/10/09 02:42:54.243 +00:00","caller":"grpclog/grpclog.go:46","message":"[core][Channel #4 SubChannel #5] grpc: addrConn.createTransport failed to connect to {\n \"Addr\": \"localhost:2379\",\n \"ServerName\": \"localhost\",\n \"Attributes\": null,\n \"BalancerAttributes\": null,\n \"Type\": 0,\n \"Metadata\": null\n}. Err: connection error: desc = \"transport: Error while dialing: dial tcp [::1]:2379: connect: connection refused\""}
{"level":"WARN","time":"2023/10/09 02:44:42.356 +00:00","caller":"grpclog/grpclog.go:46","message":"[core][Channel #4 SubChannel #5] grpc: addrConn.createTransport failed to connect to {\n \"Addr\": \"localhost:2379\",\n \"ServerName\": \"localhost\",\n \"Attributes\": null,\n \"BalancerAttributes\": null,\n \"Type\": 0,\n \"Metadata\": null\n}. Err: connection error: desc = \"transport: Error while dialing: dial tcp [::1]:2379: connect: connection refused\""}
{"level":"WARN","time":"2023/10/09 02:46:22.281 +00:00","caller":"grpclog/grpclog.go:46","message":"[core][Channel #4 SubChannel #5] grpc: addrConn.createTransport failed to connect to {\n \"Addr\": \"localhost:2379\",\n \"ServerName\": \"localhost\",\n \"Attributes\": null,\n \"BalancerAttributes\": null,\n \"Type\": 0,\n \"Metadata\": null\n}. Err: connection error: desc = \"transport: Error while dialing: dial tcp [::1]:2379: connect: connection refused\""}
{"level":"WARN","time":"2023/10/09 02:48:33.777 +00:00","caller":"grpclog/grpclog.go:46","message":"[core][Channel #4 SubChannel #5] grpc: addrConn.createTransport failed to connect to {\n \"Addr\": \"localhost:2379\",\n \"ServerName\": \"localhost\",\n \"Attributes\": null,\n \"BalancerAttributes\": null,\n \"Type\": 0,\n \"Metadata\": null\n}. Err: connection error: desc = \"transport: Error while dialing: dial tcp [::1]:2379: connect: connection refused\""}
{"level":"WARN","time":"2023/10/09 02:50:33.364 +00:00","caller":"grpclog/grpclog.go:46","message":"[core][Channel #4 SubChannel #5] grpc: addrConn.createTransport failed to connect to {\n \"Addr\": \"localhost:2379\",\n \"ServerName\": \"localhost\",\n \"Attributes\": null,\n \"BalancerAttributes\": null,\n \"Type\": 0,\n \"Metadata\": null\n}. Err: connection error: desc = \"transport: Error while dialing: dial tcp [::1]:2379: connect: connection refused\""}
{"level":"WARN","time":"2023/10/09 02:52:39.654 +00:00","caller":"grpclog/grpclog.go:46","message":"[core][Channel #4 SubChannel #5] grpc: addrConn.createTransport failed to connect to {\n \"Addr\": \"localhost:2379\",\n \"ServerName\": \"localhost\",\n \"Attributes\": null,\n \"BalancerAttributes\": null,\n \"Type\": 0,\n \"Metadata\": null\n}. Err: connection error: desc = \"transport: Error while dialing: dial tcp [::1]:2379: connect: connection refused\""}
{"level":"WARN","time":"2023/10/09 02:54:30.505 +00:00","caller":"grpclog/grpclog.go:46","message":"[core][Channel #4 SubChannel #5] grpc: addrConn.createTransport failed to connect to {\n \"Addr\": \"localhost:2379\",\n \"ServerName\": \"localhost\",\n \"Attributes\": null,\n \"BalancerAttributes\": null,\n \"Type\": 0,\n \"Metadata\": null\n}. Err: connection error: desc = \"transport: Error while dialing: dial tcp [::1]:2379: connect: connection refused\""}
{"level":"WARN","time":"2023/10/09 02:56:32.454 +00:00","caller":"grpclog/grpclog.go:46","message":"[core][Channel #4 SubChannel #5] grpc: addrConn.createTransport failed to connect to {\n \"Addr\": \"localhost:2379\",\n \"ServerName\": \"localhost\",\n \"Attributes\": null,\n \"BalancerAttributes\": null,\n \"Type\": 0,\n \"Metadata\": null\n}. Err: connection error: desc = \"transport: Error while dialing: dial tcp [::1]:2379: connect: connection refused\""}
{"level":"WARN","time":"2023/10/09 02:58:25.334 +00:00","caller":"grpclog/grpclog.go:46","message":"[core][Channel #4 SubChannel #5] grpc: addrConn.createTransport failed to connect to {\n \"Addr\": \"localhost:2379\",\n \"ServerName\": \"localhost\",\n \"Attributes\": null,\n \"BalancerAttributes\": null,\n \"Type\": 0,\n \"Metadata\": null\n}. Err: connection error: desc = \"transport: Error while dialing: dial tcp [::1]:2379: connect: connection refused\""}
{"level":"WARN","time":"2023/10/09 03:00:20.526 +00:00","caller":"grpclog/grpclog.go:46","message":"[core][Channel #4 SubChannel #5] grpc: addrConn.createTransport failed to connect to {\n \"Addr\": \"localhost:2379\",\n \"ServerName\": \"localhost\",\n \"Attributes\": null,\n \"BalancerAttributes\": null,\n \"Type\": 0,\n \"Metadata\": null\n}. Err: connection error: desc = \"transport: Error while dialing: dial tcp [::1]:2379: connect: connection refused\""}
{"level":"WARN","time":"2023/10/09 03:02:20.204 +00:00","caller":"grpclog/grpclog.go:46","message":"[core][Channel #4 SubChannel #5] grpc: addrConn.createTransport failed to connect to {\n \"Addr\": \"localhost:2379\",\n \"ServerName\": \"localhost\",\n \"Attributes\": null,\n \"BalancerAttributes\": null,\n \"Type\": 0,\n \"Metadata\": null\n}. Err: connection error: desc = \"transport: Error while dialing: dial tcp [::1]:2379: connect: connection refused\""}
{"level":"WARN","time":"2023/10/09 03:04:35.157 +00:00","caller":"grpclog/grpclog.go:46","message":"[core][Channel #4 SubChannel #5] grpc: addrConn.createTransport failed to connect to {\n \"Addr\": \"localhost:2379\",\n \"ServerName\": \"localhost\",\n \"Attributes\": null,\n \"BalancerAttributes\": null,\n \"Type\": 0,\n \"Metadata\": null\n}. Err: connection error: desc = \"transport: Error while dialing: dial tcp [::1]:2379: connect: connection refused\""}
{"level":"WARN","time":"2023/10/09 03:06:30.739 +00:00","caller":"grpclog/grpclog.go:46","message":"[core][Channel #4 SubChannel #5] grpc: addrConn.createTransport failed to connect to {\n \"Addr\": \"localhost:2379\",\n \"ServerName\": \"localhost\",\n \"Attributes\": null,\n \"BalancerAttributes\": null,\n \"Type\": 0,\n \"Metadata\": null\n}. Err: connection error: desc = \"transport: Error while dialing: dial tcp [::1]:2379: connect: connection refused\""}
@wayblink
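Since most of the grep output above repeats the same warning, grouping entries by level and caller makes the distinct messages easier to spot. The following is a minimal Python sketch, assuming the proxy log file holds one JSON object per line with `level` and `caller` keys as in the entries shown; `summarize` is a hypothetical helper, not a Milvus tool.

```python
import json
from collections import Counter


def summarize(lines):
    """Count log entries by (level, caller), skipping lines that are not JSON objects."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line.startswith("{"):
            continue  # shell prompts, mentions, and other non-log text
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # truncated or malformed entry
        counts[(entry.get("level"), entry.get("caller"))] += 1
    return counts


# Three entries copied from the proxy log in this issue, plus one non-log line.
sample = [
    '{"level":"WARN","time":"2023/10/09 02:41:33.128 +00:00","caller":"funcutil/policy.go:42","message":"GetExtension fail","error":"proto: not an extendable proto.Message"}',
    '{"level":"INFO","time":"2023/10/09 02:41:33.128 +00:00","caller":"proxy/privilege_interceptor.go:72","message":"GetPrivilegeExtObj err","error":"proto: not an extendable proto.Message"}',
    '{"level":"WARN","time":"2023/10/09 02:41:33.133 +00:00","caller":"funcutil/policy.go:42","message":"GetExtension fail","error":"proto: not an extendable proto.Message"}',
    "@wayblink",
]

for (level, caller), n in summarize(sample).most_common():
    print(level, caller, n)
```

In practice the function could be fed the file directly, for example `summarize(open("proxy-18.log"))`, instead of grepping for the word "error".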

suchenghui commented 9 months ago

milvus - log, extraction code: milv @wayblink

wayblink commented 9 months ago

@suchenghui There is no clue in the log. I tried again with Milvus v2.3.0 and milvus-backup 0.4.0 and could not reproduce the issue. How did you deploy your Milvus cluster? Is it the official v2.3.0 release, https://github.com/milvus-io/milvus/releases/tag/v2.3.0 ?

suchenghui commented 9 months ago

I installed Milvus v2.3.0 offline following https://milvus.io/docs/install_offline-helm.md @wayblink Kubernetes version: 1.22