Closed chenjacken closed 8 months ago
The log from the host pod is:
[info 2024-01-30 10:12:43 remotefile.(*SRemoteFile).downloadInternal.func1(remotefile.go:263)] written file /opt/cloud/workspace/disks/image_cache/118f76a6-cfb2-49a8-892c-eee6136b234c.tmp rate: 10.34 MiB p/s percent: 99.78%
[warning 2024-01-30 10:12:43 appsrv.do_worker_watchdog(workers_watchdog.go:64)] WorkerManager RequestWorker has been busy for 49 cycles...
[info 2024-01-30 10:12:44 remotefile.(*SRemoteFile).downloadInternal.func1(remotefile.go:263)] written file /opt/cloud/workspace/disks/image_cache/118f76a6-cfb2-49a8-892c-eee6136b234c.tmp rate: 10.47 MiB p/s percent: 99.85%
[info 2024-01-30 10:12:45 remotefile.(*SRemoteFile).downloadInternal.func1(remotefile.go:263)] written file /opt/cloud/workspace/disks/image_cache/118f76a6-cfb2-49a8-892c-eee6136b234c.tmp rate: 10.52 MiB p/s percent: 99.92%
[info 2024-01-30 10:12:46 remotefile.(*SRemoteFile).downloadInternal.func1(remotefile.go:263)] written file /opt/cloud/workspace/disks/image_cache/118f76a6-cfb2-49a8-892c-eee6136b234c.tmp rate: 10.01 MiB p/s percent: 99.99%
[warning 2024-01-30 10:13:13 appsrv.do_worker_watchdog(workers_watchdog.go:64)] WorkerManager RequestWorker has been busy for 50 cycles...
[info 2024-01-30 10:13:25 storageman.(*SRbdImageCache).Acquire(imagecache_rbd.go:86)] convert local image 118f76a6-cfb2-49a8-892c-eee6136b234c to rbd pool nvmepool
[warning 2024-01-30 10:13:43 appsrv.do_worker_watchdog(workers_watchdog.go:64)] WorkerManager RequestWorker has been busy for 51 cycles...
[warning 2024-01-30 10:14:13 appsrv.do_worker_watchdog(workers_watchdog.go:64)] WorkerManager RequestWorker has been busy for 52 cycles...
[warning 2024-01-30 10:14:43 appsrv.do_worker_watchdog(workers_watchdog.go:64)] WorkerManager RequestWorker has been busy for 53 cycles...
[warning 2024-01-30 10:15:13 appsrv.do_worker_watchdog(workers_watchdog.go:64)] WorkerManager RequestWorker has been busy for 54 cycles...
[warning 2024-01-30 10:15:43 appsrv.do_worker_watchdog(workers_watchdog.go:64)] WorkerManager RequestWorker has been busy for 55 cycles...
[warning 2024-01-30 10:16:13 appsrv.do_worker_watchdog(workers_watchdog.go:64)] WorkerManager RequestWorker has been busy for 56 cycles...
[info 2024-01-30 10:16:35 workmanager.(*workerTask).Run(manager.go:95)] DelayTask complete: {"image_id":"118f76a6-cfb2-49a8-892c-eee6136b234c","name":"yudao1-2","path":"rbd:nvmepool/image_cache_118f76a6-cfb2-49a8-892c-eee6136b234c:mon_host=172.16.1.216\\;172.16.1.218\\;172.16.1.217:key=AQBz5pZlFX41OBAAJqPwV73/Zxc0nKEjdGb0uw\\=\\=:rados_mon_op_timeout=5:rados_osd_op_timeout=1200:client_mount_timeout=120","size":204800}
[info 2024-01-30 10:16:35 modules.TaskComplete(task.go:34)] Sync task 2a91051f-78c6-4eab-867a-16017fa34e73 complete succ
[info 2024-01-30 12:52:41 appsrv.(*Application).ServeHTTP(appsrv.go:288)] S-ox38NQdlc88DjrcA5pl0ksZmU= 200 77d344-5b692f-1c4972 GET /servers/289287d0-4d9c-4605-89f7-69dd878143a9/status (172.16.1.211:1842:compute_v2) 59.03ms
[info 2024-01-30 12:52:41 modules.TaskComplete(task.go:34)] Sync task 27bae067-18ca-4259-88bb-25e35dcb4674 complete succ
[info 2024-01-30 12:52:44 appsrv.(*Application).ServeHTTP(appsrv.go:288)] S-ox38NQdlc88DjrcA5pl0ksZmU= 200 ed08cd-591031-7631dd GET /servers/289287d0-4d9c-4605-89f7-69dd878143a9/status (172.16.1.211:60917:compute_v2) 0.20ms
[info 2024-01-30 12:52:44 modules.TaskComplete(task.go:34)] Sync task cc842b60-bac9-4147-8059-335a69fb9509 complete succ
[info 2024-01-30 12:52:50 appsrv.(*Application).ServeHTTP(appsrv.go:288)] S-ox38NQdlc88DjrcA5pl0ksZmU= 200 498edf-efab3f-3424b3 GET /servers/289287d0-4d9c-4605-89f7-69dd878143a9/status (172.16.1.211:18294:compute_v2) 0.11ms
[info 2024-01-30 12:52:50 modules.TaskComplete(task.go:34)] Sync task 23eec0f6-991d-4e63-8aa3-f2ee77d6b8ef complete succ
[info 2024-01-30 12:52:51 appsrv.(*Application).ServeHTTP(appsrv.go:288)] S-ox38NQdlc88DjrcA5pl0ksZmU= 200 3ded0b-7a0546-3475df GET /servers/289287d0-4d9c-4605-89f7-69dd878143a9/status (172.16.1.211:55389:compute_v2) 0.21ms
[info 2024-01-30 12:52:51 modules.TaskComplete(task.go:34)] Sync task aeaeb207-bdcc-400f-8746-3e9d11e91e27 complete succ
[info 2024-01-30 12:52:56 appsrv.(*Application).ServeHTTP(appsrv.go:288)] S-ox38NQdlc88DjrcA5pl0ksZmU= 200 a054b6-acf548-35de33 GET /servers/289287d0-4d9c-4605-89f7-69dd878143a9/status (172.16.1.211:48940:compute_v2) 0.22ms
[info 2024-01-30 12:52:56 modules.TaskComplete(task.go:34)] Sync task 1d6c8b78-17e6-4ef8-8636-f9fb048296b9 complete succ
[info 2024-01-30 12:52:59 appsrv.(*Application).ServeHTTP(appsrv.go:288)] S-ox38NQdlc88DjrcA5pl0ksZmU= 200 c0a834-436505-59c89f POST /servers/289287d0-4d9c-4605-89f7-69dd878143a9/start (172.16.1.211:20318:compute_v2) 3.46ms
[info 2024-01-30 12:52:59 guestman.(*SKVMGuestInstance).asyncScriptStart(qemu-kvm.go:580)] Use vnc port 1
[error 2024-01-30 12:53:01 guestman.(*SKVMGuestInstance).StartMonitor(qemu-kvm.go:824)] Guest 289287d0-4d9c-4605-89f7-69dd878143a9 start monitor failed, can't get qmp monitor port or monitor path
[error 2024-01-30 12:53:01 guestman.(*SKVMGuestInstance).StartMonitor(qemu-kvm.go:824)] Guest 289287d0-4d9c-4605-89f7-69dd878143a9 start monitor failed, can't get qmp monitor port or monitor path
[info 2024-01-30 12:53:01 monitor.(*SBaseMonitor).connect(monitor.go:298)] Connect tcp 127.0.0.1:56101 success
[info 2024-01-30 12:53:01 guestman.(*SKVMGuestInstance).asyncScriptStart(qemu-kvm.go:605)] VM started yudao2(289287d0-4d9c-4605-89f7-69dd878143a9) ...
[info 2024-01-30 12:53:01 guestman.(*SKVMGuestInstance).asyncScriptStart(qemu-kvm.go:611)] Async start server yudao2(289287d0-4d9c-4605-89f7-69dd878143a9) success!
[info 2024-01-30 12:53:03 monitor.(*QmpMonitor).read(qmp.go:182)] QMP Read yudao2(289287d0-4d9c-4605-89f7-69dd878143a9): {"QMP": {"version": {"qemu": {"micro": 0, "minor": 2, "major": 4}, "package": "2022-12-15_14:23:05@buildkitsandbox@e2220a9"}, "capabilities": ["oob"]}}
[info 2024-01-30 12:53:03 guestman.(*SKVMGuestInstance).onMonitorConnected(qemu-kvm.go:1016)] Monitor connected ...
[info 2024-01-30 12:53:03 monitor.(*QmpMonitor).write(qmp.go:260)] QMP Write yudao2(289287d0-4d9c-4605-89f7-69dd878143a9): {"execute":"qmp_capabilities"}
[info 2024-01-30 12:53:03 monitor.(*QmpMonitor).write(qmp.go:260)] QMP Write yudao2(289287d0-4d9c-4605-89f7-69dd878143a9): {"execute":"query-version"}
[info 2024-01-30 12:53:03 monitor.(*QmpMonitor).read(qmp.go:182)] QMP Read yudao2(289287d0-4d9c-4605-89f7-69dd878143a9): {"return": {}}
[info 2024-01-30 12:53:03 monitor.(*QmpMonitor).read(qmp.go:182)] QMP Read yudao2(289287d0-4d9c-4605-89f7-69dd878143a9): {"return": {"qemu": {"micro": 0, "minor": 2, "major": 4}, "package": "2022-12-15_14:23:05@buildkitsandbox@e2220a9"}}
[info 2024-01-30 12:53:03 guestman.(*SKVMGuestInstance).onGetQemuVersion(qemu-kvm.go:1086)] Guest(289287d0-4d9c-4605-89f7-69dd878143a9) qemu version 4.2.0
[info 2024-01-30 12:53:03 monitor.(*QmpMonitor).write(qmp.go:260)] QMP Write yudao2(289287d0-4d9c-4605-89f7-69dd878143a9): {"execute":"human-monitor-command","arguments":{"command-line":"info status"}}
[info 2024-01-30 12:53:03 monitor.(*QmpMonitor).read(qmp.go:182)] QMP Read yudao2(289287d0-4d9c-4605-89f7-69dd878143a9): {"return": "VM status: paused (prelaunch)\r\n"}
[info 2024-01-30 12:53:03 guestman.(*SGuestResumeTask).onConfirmRunning(guesttasks.go:1536)]289287d0-4d9c-4605-89f7-69dd878143a9: onConfirmRunning status paused (prelaunch)
[error 2024-01-30 12:53:03 cgrouputils.(*CGroupTask).createTask(cgrouputils.go:236)] mkdir /sys/fs/cgroup/memory/cloudpods.hostagent/server_289287d0-4d9c-4605-89f7-69dd878143a9_20338: no such file or directory
[info 2024-01-30 12:53:03 monitor.(*QmpMonitor).write(qmp.go:260)] QMP Write yudao2(289287d0-4d9c-4605-89f7-69dd878143a9): {"execute":"cont"}
[info 2024-01-30 12:53:03 monitor.(*QmpMonitor).write(qmp.go:260)] QMP Write yudao2(289287d0-4d9c-4605-89f7-69dd878143a9): {"execute":"human-monitor-command","arguments":{"command-line":"block_set_io_throttle drive_0 0 0 0 0 0 0"}}
[info 2024-01-30 12:53:03 monitor.(*QmpMonitor).read(qmp.go:182)] QMP Read yudao2(289287d0-4d9c-4605-89f7-69dd878143a9): {"timestamp": {"seconds": 1706619183, "microseconds": 247259}, "event": "RESUME"}
[info 2024-01-30 12:53:03 monitor.(*QmpMonitor).watchEvent(qmp.go:252)] QMP event yudao2(289287d0-4d9c-4605-89f7-69dd878143a9): QMP Event result: &monitor.Event{Event:"\"RESUME\"", Data:map[string]interface {}{}, Timestamp:(*monitor.Timestamp)(0xc001abbbe0)}
[info 2024-01-30 12:53:03 monitor.(*QmpMonitor).read(qmp.go:182)] QMP Read yudao2(289287d0-4d9c-4605-89f7-69dd878143a9): {"return": {}}
[info 2024-01-30 12:53:03 monitor.(*QmpMonitor).write(qmp.go:260)] QMP Write yudao2(289287d0-4d9c-4605-89f7-69dd878143a9): {"execute":"human-monitor-command","arguments":{"command-line":"info status"}}
[info 2024-01-30 12:53:03 monitor.(*QmpMonitor).read(qmp.go:182)] QMP Read yudao2(289287d0-4d9c-4605-89f7-69dd878143a9): {"return": ""}
[info 2024-01-30 12:53:03 monitor.(*QmpMonitor).read(qmp.go:182)] QMP Read yudao2(289287d0-4d9c-4605-89f7-69dd878143a9): {"return": "VM status: running\r\n"}
[info 2024-01-30 12:53:03 guestman.(*SGuestResumeTask).onConfirmRunning(guesttasks.go:1536)]289287d0-4d9c-4605-89f7-69dd878143a9: onConfirmRunning status running
[info 2024-01-30 12:53:03 modules.TaskComplete(task.go:34)] Sync task dd3140a1-4e5c-4447-8001-b58a3dca8824 complete succ
[info 2024-01-30 12:53:03 guestman.(*SKVMGuestInstance).detachStartupTask(qemu-kvm.go:1511)]289287d0-4d9c-4605-89f7-69dd878143a9: detachStartupTask
[info 2024-01-30 12:53:03 monitor.(*QmpMonitor).write(qmp.go:260)] QMP Write yudao2(289287d0-4d9c-4605-89f7-69dd878143a9): {"execute":"query-block-jobs"}
[info 2024-01-30 12:53:03 monitor.(*QmpMonitor).read(qmp.go:182)] QMP Read yudao2(289287d0-4d9c-4605-89f7-69dd878143a9): {"return": []}
[info 2024-01-30 12:53:06 monitor.(*QmpMonitor).write(qmp.go:260)] QMP Write yudao2(289287d0-4d9c-4605-89f7-69dd878143a9): {"execute":"set_password","arguments":{"password":"tSTd2z8b","protocol":"vnc"}}
[info 2024-01-30 12:53:06 monitor.(*QmpMonitor).read(qmp.go:182)] QMP Read yudao2(289287d0-4d9c-4605-89f7-69dd878143a9): {"return": {}}
[info 2024-01-30 12:53:32 monitor.(*QmpMonitor).read(qmp.go:182)] QMP Read yudao2(289287d0-4d9c-4605-89f7-69dd878143a9): {"timestamp": {"seconds": 1706619212, "microseconds": 599381}, "event": "NIC_RX_FILTER_CHANGED", "data": {"name": "netdev-static-92", "path": "/machine/peripheral/netdev-static-92/virtio-backend"}}
[info 2024-01-30 12:53:32 monitor.(*QmpMonitor).watchEvent(qmp.go:252)] QMP event yudao2(289287d0-4d9c-4605-89f7-69dd878143a9): QMP Event result: &monitor.Event{Event:"\"NIC_RX_FILTER_CHANGED\"", Data:map[string]interface {}{"name":"netdev-static-92", "path":"/machine/peripheral/netdev-static-92/virtio-backend"}, Timestamp:(*monitor.Timestamp)(0xc0020b0040)}
[info 2024-01-30 12:53:32 hostdhcp.(*SGuestDHCPServer).serveDHCPInternal(dhcpserver.go:278)] Make DHCP Reply 172.16.1.92 TO 00:22:a3:f2:b5:5d
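The download progress lines in the log above carry a rate and a percentage, so the effective download speed can be read off mechanically. The following is a minimal sketch, assuming the message format stays exactly as shown ("rate: N MiB p/s percent: N%"); the helper name `download_progress` is made up for illustration and is not part of Cloudpods.

```python
import re
from statistics import mean

# Matches the tail of the "written file ..." progress messages shown in the log.
PROGRESS = re.compile(r"rate: (?P<rate>[\d.]+) MiB p/s percent: (?P<pct>[\d.]+)%")


def download_progress(lines):
    """Return (rate in MiB/s, percent done) pairs for every progress line."""
    pairs = []
    for line in lines:
        m = PROGRESS.search(line)
        if m:
            pairs.append((float(m.group("rate")), float(m.group("pct"))))
    return pairs


# Message parts copied from the log above; the watchdog line is skipped.
TMP = "/opt/cloud/workspace/disks/image_cache/118f76a6-cfb2-49a8-892c-eee6136b234c.tmp"
sample = [
    f"written file {TMP} rate: 10.34 MiB p/s percent: 99.78%",
    "WorkerManager RequestWorker has been busy for 49 cycles...",
    f"written file {TMP} rate: 10.47 MiB p/s percent: 99.85%",
    f"written file {TMP} rate: 10.52 MiB p/s percent: 99.92%",
    f"written file {TMP} rate: 10.01 MiB p/s percent: 99.99%",
]

pairs = download_progress(sample)
rates = [rate for rate, _ in pairs]
print(f"{len(pairs)} progress lines, mean rate {mean(rates):.2f} MiB/s")
```

On these four lines the rate stays close to 10 MiB/s throughout, which suggests (but does not prove) that the image download itself is bandwidth-limited.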
[root@master1 ~]# kubectl logs default-host-ctsxr -c host -n onecloud |grep error
[error 2024-01-29 13:34:45 fileutils2.GetAllBlkdevsIoSchedulers(fileutils.go:170)] no block device avaiable
[error 2024-01-29 13:34:54 guestman.(*SGuestManager).OnVerifyExistingGuestsSucc(guestman.go:295)] Server CDN-Node-GZ02(b00d5297-0b52-4bcb-8d68-d722ab1ec713) not found on this host
[error 2024-01-29 13:34:54 guestman.(*SGuestManager).OnVerifyExistingGuestsSucc(guestman.go:295)] Server kubernetes-node-mrgt-3(b654eb65-7fa2-4eba-8d72-2bb56a72a3d3) not found on this host
[error 2024-01-29 13:34:54 guestman.(*SGuestManager).OnVerifyExistingGuestsSucc(guestman.go:295)] Server makers-2(13a296d5-4629-4ff1-8114-0f72bfe683f2) not found on this host
[error 2024-01-29 13:34:54 hostinfo.(*SHostInfo).PutHostOnline(hostinfo.go:1552)] Host sys error: map[isolated_devices:[{isolated_devices GPU 03:00.0 use kernel driver ast, skip it 2024-01-29 13:34:49.750245895 +0000 UTC m=+8.038013814}]]
[error 2024-01-29 13:34:54 httperrors.HTTPError(httperrors.go:110)] Send error Guest 13a296d5-4629-4ff1-8114-0f72bfe683f2 not found
[error 2024-01-29 13:34:54 httperrors.HTTPError(httperrors.go:110)] Send error Guest b654eb65-7fa2-4eba-8d72-2bb56a72a3d3 not found
[error 2024-01-29 13:34:54 httperrors.HTTPError(httperrors.go:110)] Send error Guest b00d5297-0b52-4bcb-8d68-d722ab1ec713 not found
[error 2024-01-30 07:23:21 httperrors.HTTPError(httperrors.go:110)] Send error Guest 20ad2895-45e6-41b1-8e87-3842445990e8 not found
[error 2024-01-30 07:23:21 httperrors.HTTPError(httperrors.go:110)] Send error Guest 20ad2895-45e6-41b1-8e87-3842445990e8 not found
[error 2024-01-30 07:23:30 httperrors.HTTPError(httperrors.go:110)] Send error Guest 20ad2895-45e6-41b1-8e87-3842445990e8 not found
[error 2024-01-30 07:23:30 httperrors.HTTPError(httperrors.go:110)] Send error Guest 20ad2895-45e6-41b1-8e87-3842445990e8 not found
[error 2024-01-30 07:23:39 httperrors.HTTPError(httperrors.go:110)] Send error Not found
[info 2024-01-30 09:13:20 workmanager.(*workerTask).Run(manager.go:92)] DelayTask failed: Deploy guest fs: request deploy guest fs: rpc error: code = Unknown desc = run deploy_guest_fs failed []: "/opt/yunion/bin/host-deployer --common-config-file /opt/yunion/common.conf --config /opt/yunion/host.conf --deploy-action deploy_guest_fs --deploy-params '{\"disk_info\":{\"path\":\"rbd:nvmepool/058e45ea-71d9-4338-8629-12b21389f028:mon_host=172.16.1.216\\\\;172.16.1.218\\\\;172.16.1.217:key=AQBz5pZlFX41OBAAJqPwV73/Zxc0nKEjdGb0uw\\\\=\\\\=:rados_osd_op_timeout=1200:client_mount_timeout=120:rados_mon_op_timeout=5\"},\"guest_desc\":{\"name\":\"yudao2\",\"uuid\":\"289287d0-4d9c-4605-89f7-69dd878143a9\",\"domain\":\"cloud.onecloud.io\",\"nics\":[{\"mac\":\"00:22:a3:f2:b5:5d\",\"ip\":\"172.16.1.92\",\"net\":\"static\",\"net_id\":\"ce676c71-febf-4ca1-8ecf-6add3aa5215e\",\"gateway\":\"172.16.1.1\",\"dns\":\"172.16.1.200\",\"domain\":\"cloud.onecloud.io\",\"ifname\":\"static-92\",\"masklen\":24,\"driver\":\"virtio\",\"bridge\":\"br1\",\"wire_id\":\"2a4e5367-e4c5-4410-81dd-217698d99ff2\",\"vlan\":1,\"interface\":\"bond0\",\"bw\":1000,\"mtu\":1500}],\"disks\":[{\"disk_id\":\"058e45ea-71d9-4338-8629-12b21389f028\",\"driver\":\"scsi\",\"cache_mode\":\"none\",\"aio_mode\":\"native\",\"size\":51200,\"template_id\":\"27ccd685-aab0-4498-8042-368c3d6f8d7b\",\"storage_id\":\"1b298235-a82f-4579-8b7a-e6dd2d9916d3\",\"path\":\"rbd:nvmepool/058e45ea-71d9-4338-8629-12b21389f028\",\"format\":\"raw\"}],\"Hypervisor\":\"kvm\",\"hostname\":\"yudao2\"},\"deploy_info\":{\"public_key\":{\"admin_public_key\":\"ssh-rsa 
AAAAB3NzaC1yc2EAAAADAQABAAABAQDG7U+zsDlTXjbDWg4/C0NElAGPJ2CXrs8dh89ftJFjPbB5W9ghrVoen4UTBBm6GqXc4hl5zGVM2zL2H31n85HfYgBo47uKFEKu9c4DpSdiTBf15zBEvhNZziOJ0FEhwglZ1WRvSKDd2+3AH23WMp++btcz/ruhbib2mdUW9nwfQj783Sl+WfJ9Ss6p3RthRtolDxrpSXAIP5KH41jwYvCLPMLBndh5sz3fHuB6AfpbjYgG++pBrhf0rtemj5f1ZtgbvQ5IlYs5L1QUcctA6BbzwlRPbaNvSaM6+hjiU3g7Fm68qmT+4uNBRVKqip0hBkMBJSW8A8ZUSLIvP4G4DDXF\\n\",\"project_public_key\":\"ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDQxBHbbAyqBKf71sa4+xLV/9gTkZe7kIJgSyU+9ViGqfzN9B0TjBqL4pnZujHUl4Gch4EK9TGg3FtQNWTBHETRMaB4JVrjSpu4uXEYRj3EVVqJKCwwWNOoy4hj7eHmEaAFkw8CVNvBlJAPFXVXUIcPZplQQQI/Da5gUfZ8beGIlrhBWtz2Julw/5sxPiaENm2PPItiw6iZnPZ88/bZCvSHy0Cx2odZE3TJrN3H5Zob/3O09n8wCqPUrvMz9ibKb9z5iT0ANLnKtSCQW1xxIml5JlSFLEPPKFEyCdrE2mTsfPp7Gc+BUD9/KZy+8hih6gfS+dL1kK6OPOVfJLxDcjNJ\\n\"},\"is_init\":true,\"default_root_user\":true,\"windows_default_admin_user\":true,\"telegraf\":{\"telegraf_conf\":\"### MANAGED BY ansible-telegraf ANSIBLE ROLE ###\\n\\n[global_tags]\\n\\n host = \\\"node6-172-16-1-219\\\"\\n vm_id = \\\"289287d0-4d9c-4605-89f7-69dd878143a9\\\"\\n zone = \\\"华南-广州\\\"\\n tenant_id = \\\"2e152fe0619046a38081d7e487028358\\\"\\n scaling_group_id = \\\"\\\"\\n host_id = \\\"3605ec3c-d819-4bd4-8c7f-3f1188e808ac\\\"\\n vm_name = \\\"yudao2\\\"\\n zone_ext_id = \\\"\\\"\\n tenant = \\\"system\\\"\\n project_domain = \\\"Default\\\"\\n os_type = \\\"Linux\\\"\\n status = \\\"start_deploy\\\"\\n cloudregion = \\\"Default\\\"\\n cloudregion_id = \\\"default\\\"\\n region_ext_id = \\\"\\\"\\n vm_ip = \\\"172.16.1.92\\\"\\n zone_id = \\\"7b6ae896-1b3d-40e5-879f-cfd00799200b\\\"\\n brand = \\\"OneCloud\\\"\\n domain_id = \\\"default\\\"\\n\\n# Configuration for telegraf agent\\n[agent]\\n interval = \\\"60s\\\"\\n debug = false\\n hostname = \\\"\\\"\\n round_interval = true\\n flush_interval = \\\"60s\\\"\\n flush_jitter = \\\"0s\\\"\\n collection_jitter = \\\"0s\\\"\\n metric_batch_size = 1000\\n metric_buffer_limit = 10000\\n quiet = false\\n logfile = 
\\\"/var/log/telegraf.log\\\"\\n logfile_rotation_max_size = \\\"10MB\\\"\\n logfile_rotation_max_archives = 1\\n omit_hostname = true\\n\\n###############################################################################\\n# OUTPUTS #\\n###############################################################################\\n\\n[[outputs.influxdb]]\\n urls = [\\\"http://169.254.169.254/monitor\\\"]\\n database = \\\"telegraf\\\"\\n insecure_skip_verify = true\\n\\n###############################################################################\\n# INPUTS #\\n###############################################################################\\n[[inputs.cpu]]\\n name_prefix = \\\"agent_\\\"\\n percpu = true\\n totalcpu = true\\n collect_cpu_time = false\\n report_active = true\\n[[inputs.disk]]\\n name_prefix = \\\"agent_\\\"\\n ignore_fs = [\\\"tmpfs\\\", \\\"devtmpfs\\\", \\\"overlay\\\", \\\"squashfs\\\", \\\"iso9660\\\"]\\n[[inputs.diskio]]\\n name_prefix = \\\"agent_\\\"\\n skip_serial_number = false\\n[[inputs.kernel]]\\n name_prefix = \\\"agent_\\\"\\n[[inputs.kernel_vmstat]]\\n name_prefix = \\\"agent_\\\"\\n[[inputs.mem]]\\n name_prefix = \\\"agent_\\\"\\n[[inputs.processes]]\\n name_prefix = \\\"agent_\\\"\\n[[inputs.swap]]\\n name_prefix = \\\"agent_\\\"\\n[[inputs.system]]\\n name_prefix = \\\"agent_\\\"\\n[[inputs.net]]\\n name_prefix = \\\"agent_\\\"\\n[[inputs.netstat]]\\n name_prefix = \\\"agent_\\\"\\n[[inputs.nstat]]\\n name_prefix = \\\"agent_\\\"\\n[[inputs.internal]]\\n name_prefix = \\\"agent_\\\"\\n collect_memstats = false\\n\"}}}'" error: Process exited with status 2, cmd error: [info 240130 09:13:11 options.parseOptions(options.go:331)] Use configuration file: /opt/yunion/host.conf
fatal error: sync: unlock of unlocked mutex
[error 2024-01-30 12:53:01 guestman.(*SKVMGuestInstance).StartMonitor(qemu-kvm.go:824)] Guest 289287d0-4d9c-4605-89f7-69dd878143a9 start monitor failed, can't get qmp monitor port or monitor path
[error 2024-01-30 12:53:01 guestman.(*SKVMGuestInstance).StartMonitor(qemu-kvm.go:824)] Guest 289287d0-4d9c-4605-89f7-69dd878143a9 start monitor failed, can't get qmp monitor port or monitor path
[error 2024-01-30 12:53:03 cgrouputils.(*CGroupTask).createTask(cgrouputils.go:236)] mkdir /sys/fs/cgroup/memory/cloudpods.hostagent/server_289287d0-4d9c-4605-89f7-69dd878143a9_20338: no such file or directory
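To see which call sites produce the errors in the grep output above, the lines can be grouped by their logging source. This is a rough sketch that assumes the bracketed prefix format seen in this log ("[level YYYY-MM-DD HH:MM:SS source] message"); lines using the shorter deployer timestamp format (for example "240130 09:13:11") are simply skipped. The function name `count_errors_by_source` is hypothetical.

```python
import re
from collections import Counter

# [error 2024-01-30 12:53:01 pkg.Func(file.go:123)] message
LOG_LINE = re.compile(
    r"^\[(?P<level>\w+) (?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"(?P<source>[^\]]+)\]\s*(?P<msg>.*)$"
)


def count_errors_by_source(lines):
    """Count error-level lines per logging call site."""
    counts = Counter()
    for line in lines:
        m = LOG_LINE.match(line.strip())
        if m and m.group("level") == "error":
            counts[m.group("source")] += 1
    return counts


# Lines copied from the log in this issue.
GUEST = "289287d0-4d9c-4605-89f7-69dd878143a9"
sample = [
    "[error 2024-01-29 13:34:45 fileutils2.GetAllBlkdevsIoSchedulers(fileutils.go:170)] "
    "no block device avaiable",
    "[error 2024-01-30 12:53:01 guestman.(*SKVMGuestInstance).StartMonitor(qemu-kvm.go:824)] "
    f"Guest {GUEST} start monitor failed, can't get qmp monitor port or monitor path",
    "[error 2024-01-30 12:53:01 guestman.(*SKVMGuestInstance).StartMonitor(qemu-kvm.go:824)] "
    f"Guest {GUEST} start monitor failed, can't get qmp monitor port or monitor path",
    "[error 2024-01-30 12:53:03 cgrouputils.(*CGroupTask).createTask(cgrouputils.go:236)] "
    f"mkdir /sys/fs/cgroup/memory/cloudpods.hostagent/server_{GUEST}_20338: "
    "no such file or directory",
    "[info 2024-01-30 12:53:03 modules.TaskComplete(task.go:34)] "
    "Sync task dd3140a1-4e5c-4447-8001-b58a3dca8824 complete succ",
]

counts = count_errors_by_source(sample)
for source, n in counts.most_common():
    print(n, source)
```

In practice the input would be the output of the kubectl logs command shown above, read line by line from standard input or a file.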
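@chenjacken Thanks for the feedback. We will look into the deployment failure. Uploading an image to the image management service triggers a qemu-img convert; this step can be very slow, and the conversion is I/O-heavy.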
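To get a feel for how long such a conversion takes on a given machine, one option is to time a manual qemu-img convert on a test image. The sketch below is only an illustration: the file paths and formats are placeholders, the helper names are hypothetical, and it does not reproduce the exact arguments the image service uses, which are not shown in this thread. Only the standard `-p` (progress), `-f` (input format) and `-O` (output format) options are used.

```python
import shlex
import subprocess
import time


def build_convert_cmd(src, dst, src_fmt="qcow2", dst_fmt="qcow2"):
    """Build a qemu-img convert command line (argument list, no shell)."""
    return ["qemu-img", "convert", "-p", "-f", src_fmt, "-O", dst_fmt, src, dst]


def timed_convert(src, dst, **formats):
    """Run the conversion and return the elapsed wall-clock time in seconds."""
    cmd = build_convert_cmd(src, dst, **formats)
    start = time.monotonic()
    subprocess.run(cmd, check=True)  # raises CalledProcessError on failure
    return time.monotonic() - start


# Only print the command here; call timed_convert() on a host with qemu-img.
# The paths are placeholders, not paths from this deployment.
cmd = build_convert_cmd("/tmp/example.raw", "/tmp/example.qcow2", src_fmt="raw")
print(shlex.join(cmd))
```

Watching disk utilisation while the conversion runs would show whether it is I/O-bound on that host.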
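OK, thanks!!
Also, the two steps that stay in their status for a long time are "caching image" and "allocating disk":
1. Caching image: does this step transfer the image file from MinIO to Ceph? How can it be sped up?
2. Allocating disk: the image is already cached in Ceph, so allocation should be fast, yet this status also lasts a long time.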
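The image file is 50 GB in size.
Caching the image took 1 hour.
Allocating the disk took more than 1 hour 30 minutes, and the result showed that the deployment had failed.
After I synced the status, the VM showed as powered off; after I started it, it ran normally.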
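As a rough sanity check on these timings, the expected transfer time can be estimated from the image size and a sustained rate. This is a back-of-the-envelope sketch: it assumes the 50 GB figure means 50 GiB and that the rate of about 10.34 MiB/s seen in the host log is representative of the whole download, neither of which is confirmed in this thread.

```python
def transfer_minutes(size_gib, rate_mib_per_s):
    """Minutes needed to move size_gib GiB at a constant rate in MiB/s."""
    seconds = size_gib * 1024 / rate_mib_per_s
    return seconds / 60


# Assumed inputs: 50 GiB image, 10.34 MiB/s as logged by the host agent.
minutes = transfer_minutes(50, 10.34)
print(f"estimated download time: {minutes:.0f} minutes")
```

The estimate comes out at a bit over 80 minutes, the same order of magnitude as the reported hour for caching, so the download rate is a plausible bottleneck for that step.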
The log shown in the web UI is:
deploying=>deploy_fail: {"__reason__":"Deploy guest fs: request deploy guest fs: rpc error: code = Unknown desc = run deploy_guest_fs failed []: \"/opt/yunion/bin/host-deployer --common-config-file /opt/yunion/common.conf --config /opt/yunion/host.conf --deploy-action deploy_guest_fs --deploy-params '{\\\"disk_info\\\":{\\\"path\\\":\\\"rbd:nvmepool/2c87c588-0f36-477b-8ee7-4818c8d585f9:mon_host=172.16.1.216\\\\\\\\;172.16.1.218\\\\\\\\;172.16.1.217:key=AQBz5pZlFX41OBAAJqPwV73/Zxc0nKEjdGb0uw\\\\\\\\=\\\\\\\\=:rados_mon_op_timeout=5:rados_osd_op_timeout=1200:client_mount_timeout=120\\\"},\\\"guest_desc\\\":{\\\"name\\\":\\\"yudao4\\\",\\\"uuid\\\":\\\"584e3a8d-6780-4578-831e-44dcbcd99ca6\\\",\\\"domain\\\":\\\"cloud.onecloud.io\\\",\\\"nics\\\":[{\\\"mac\\\":\\\"00:22:fe:af:dc:53\\\",\\\"ip\\\":\\\"172.16.1.94\\\",\\\"net\\\":\\\"static\\\",\\\"net_id\\\":\\\"ce676c71-febf-4ca1-8ecf-6add3aa5215e\\\",\\\"gateway\\\":\\\"172.16.1.1\\\",\\\"dns\\\":\\\"172.16.1.200\\\",\\\"domain\\\":\\\"cloud.onecloud.io\\\",\\\"ifname\\\":\\\"static-94\\\",\\\"masklen\\\":24,\\\"driver\\\":\\\"virtio\\\",\\\"bridge\\\":\\\"br1\\\",\\\"wire_id\\\":\\\"2a4e5367-e4c5-4410-81dd-217698d99ff2\\\",\\\"vlan\\\":1,\\\"interface\\\":\\\"bond0\\\",\\\"bw\\\":1000,\\\"mtu\\\":1500}],\\\"disks\\\":[{\\\"disk_id\\\":\\\"2c87c588-0f36-477b-8ee7-4818c8d585f9\\\",\\\"driver\\\":\\\"scsi\\\",\\\"cache_mode\\\":\\\"none\\\",\\\"aio_mode\\\":\\\"native\\\",\\\"size\\\":51200,\\\"template_id\\\":\\\"5fd11cb5-ad0a-419e-8ba0-a77d009d60d6\\\",\\\"storage_id\\\":\\\"1b298235-a82f-4579-8b7a-e6dd2d9916d3\\\",\\\"path\\\":\\\"rbd:nvmepool/2c87c588-0f36-477b-8ee7-4818c8d585f9\\\",\\\"format\\\":\\\"raw\\\"}],\\\"Hypervisor\\\":\\\"kvm\\\",\\\"hostname\\\":\\\"yudao4\\\"},\\\"deploy_info\\\":{\\\"public_key\\\":{\\\"admin_public_key\\\":\\\"ssh-rsa 
AAAAB3NzaC1yc2EAAAADAQABAAABAQDG7U+zsDlTXjbDWg4/C0NElAGPJ2CXrs8dh89ftJFjPbB5W9ghrVoen4UTBBm6GqXc4hl5zGVM2zL2H31n85HfYgBo47uKFEKu9c4DpSdiTBf15zBEvhNZziOJ0FEhwglZ1WRvSKDd2+3AH23WMp++btcz/ruhbib2mdUW9nwfQj783Sl+WfJ9Ss6p3RthRtolDxrpSXAIP5KH41jwYvCLPMLBndh5sz3fHuB6AfpbjYgG++pBrhf0rtemj5f1ZtgbvQ5IlYs5L1QUcctA6BbzwlRPbaNvSaM6+hjiU3g7Fm68qmT+4uNBRVKqip0hBkMBJSW8A8ZUSLIvP4G4DDXF\\\\n\\\",\\\"project_public_key\\\":\\\"ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDQxBHbbAyqBKf71sa4+xLV/9gTkZe7kIJgSyU+9ViGqfzN9B0TjBqL4pnZujHUl4Gch4EK9TGg3FtQNWTBHETRMaB4JVrjSpu4uXEYRj3EVVqJKCwwWNOoy4hj7eHmEaAFkw8CVNvBlJAPFXVXUIcPZplQQQI/Da5gUfZ8beGIlrhBWtz2Julw/5sxPiaENm2PPItiw6iZnPZ88/bZCvSHy0Cx2odZE3TJrN3H5Zob/3O09n8wCqPUrvMz9ibKb9z5iT0ANLnKtSCQW1xxIml5JlSFLEPPKFEyCdrE2mTsfPp7Gc+BUD9/KZy+8hih6gfS+dL1kK6OPOVfJLxDcjNJ\\\\n\\\"},\\\"is_init\\\":true,\\\"default_root_user\\\":true,\\\"windows_default_admin_user\\\":true,\\\"telegraf\\\":{\\\"telegraf_conf\\\":\\\"### MANAGED BY ansible-telegraf ANSIBLE ROLE ###\\\\n\\\\n[global_tags]\\\\n\\\\n os_type = \\\\\\\"Linux\\\\\\\"\\\\n status = \\\\\\\"start_deploy\\\\\\\"\\\\n tenant_id = \\\\\\\"2e152fe0619046a38081d7e487028358\\\\\\\"\\\\n scaling_group_id = \\\\\\\"\\\\\\\"\\\\n domain_id = \\\\\\\"default\\\\\\\"\\\\n vm_name = \\\\\\\"yudao4\\\\\\\"\\\\n zone = \\\\\\\"华南-广州\\\\\\\"\\\\n zone_id = \\\\\\\"7b6ae896-1b3d-40e5-879f-cfd00799200b\\\\\\\"\\\\n tenant = \\\\\\\"system\\\\\\\"\\\\n host = \\\\\\\"node5-172-16-1-218\\\\\\\"\\\\n host_id = \\\\\\\"a97714c5-543d-40ce-8098-414f4fbb9e25\\\\\\\"\\\\n vm_ip = \\\\\\\"172.16.1.94\\\\\\\"\\\\n region_ext_id = \\\\\\\"\\\\\\\"\\\\n brand = \\\\\\\"OneCloud\\\\\\\"\\\\n project_domain = \\\\\\\"Default\\\\\\\"\\\\n vm_id = \\\\\\\"584e3a8d-6780-4578-831e-44dcbcd99ca6\\\\\\\"\\\\n zone_ext_id = \\\\\\\"\\\\\\\"\\\\n cloudregion = \\\\\\\"Default\\\\\\\"\\\\n cloudregion_id = \\\\\\\"default\\\\\\\"\\\\n\\\\n# Configuration for telegraf agent\\\\n[agent]\\\\n interval = \\\\\\\"60s\\\\\\\"\\\\n debug = 
false\\\\n hostname = \\\\\\\"\\\\\\\"\\\\n round_interval = true\\\\n flush_interval = \\\\\\\"60s\\\\\\\"\\\\n flush_jitter = \\\\\\\"0s\\\\\\\"\\\\n collection_jitter = \\\\\\\"0s\\\\\\\"\\\\n metric_batch_size = 1000\\\\n metric_buffer_limit = 10000\\\\n quiet = false\\\\n logfile = \\\\\\\"/var/log/telegraf.log\\\\\\\"\\\\n logfile_rotation_max_size = \\\\\\\"10MB\\\\\\\"\\\\n logfile_rotation_max_archives = 1\\\\n omit_hostname = true\\\\n\\\\n###############################################################################\\\\n# OUTPUTS #\\\\n###############################################################################\\\\n\\\\n[[outputs.influxdb]]\\\\n urls = [\\\\\\\"http://169.254.169.254/monitor\\\\\\\"]\\\\n database = \\\\\\\"telegraf\\\\\\\"\\\\n insecure_skip_verify = true\\\\n\\\\n###############################################################################\\\\n# INPUTS #\\\\n###############################################################################\\\\n[[inputs.cpu]]\\\\n name_prefix = \\\\\\\"agent_\\\\\\\"\\\\n percpu = true\\\\n totalcpu = true\\\\n collect_cpu_time = false\\\\n report_active = true\\\\n[[inputs.disk]]\\\\n name_prefix = \\\\\\\"agent_\\\\\\\"\\\\n ignore_fs = [\\\\\\\"tmpfs\\\\\\\", \\\\\\\"devtmpfs\\\\\\\", \\\\\\\"overlay\\\\\\\", \\\\\\\"squashfs\\\\\\\", \\\\\\\"iso9660\\\\\\\"]\\\\n[[inputs.diskio]]\\\\n name_prefix = \\\\\\\"agent_\\\\\\\"\\\\n skip_serial_number = false\\\\n[[inputs.kernel]]\\\\n name_prefix = \\\\\\\"agent_\\\\\\\"\\\\n[[inputs.kernel_vmstat]]\\\\n name_prefix = \\\\\\\"agent_\\\\\\\"\\\\n[[inputs.mem]]\\\\n name_prefix = \\\\\\\"agent_\\\\\\\"\\\\n[[inputs.processes]]\\\\n name_prefix = \\\\\\\"agent_\\\\\\\"\\\\n[[inputs.swap]]\\\\n name_prefix = \\\\\\\"agent_\\\\\\\"\\\\n[[inputs.system]]\\\\n name_prefix = \\\\\\\"agent_\\\\\\\"\\\\n[[inputs.net]]\\\\n name_prefix = \\\\\\\"agent_\\\\\\\"\\\\n[[inputs.netstat]]\\\\n name_prefix = \\\\\\\"agent_\\\\\\\"\\\\n[[inputs.nstat]]\\\\n name_prefix = 
\\\\\\\"agent_\\\\\\\"\\\\n[[inputs.internal]]\\\\n name_prefix = \\\\\\\"agent_\\\\\\\"\\\\n collect_memstats = false\\\\n\\\"}}}'\" error: Process exited with status 2, cmd error: [info 240131 03:58:36 options.parseOptions(options.go:331)] Use configuration file: /opt/yunion/host.conf\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument ovn-mapped-bridge\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument ovn-underlay-mtu\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument auto-merge-delay-seconds\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument cdrom-count\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument live-migrate-cpu-throttle-max\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument min-migrate-timeout-seconds\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument host-health-timeout\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument max-reserved-memory\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument set-vnc-password\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument zero-clean-disk-data\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument allow-switch-vms\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument disable-kvm\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument 
recycle-diskfile-keep-days\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument sdn-allow-conntrack-invalid\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument sdn-enable-tap-man\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument sdn-socket-path\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument tunnel-padding-bytes\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument local-backup-temp-path\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument bridge-driver\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument default-image-save-format\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument dhcp-renewal-time\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument disable-custom-device\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument floppy-count\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument agent-temp-limit\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument recycle-diskfile\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument default-write-iops-per-cpu\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument local-config-file\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find 
argument enable-qemu-debug-log\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument image-cache-path\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument windows-default-admin-user\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument bw-download-bandwidth\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument etcd-use-tls\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument report-interval\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument ethtool-enable-gso\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument ping-region-interval\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument host-lease-timeout\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument bandwidth-limit\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument enable-cpu-binding\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument manage-ntp-configuration\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument sdn-enable-eip-man\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument fetcherfs-path\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument servers-path\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument 
block-io-scheduler\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument default-read-iops-per-cpu\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument disable-gpu\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument enable-telegraf\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument always-recycle-diskfile\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument default-read-bps-per-cpu\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument disable-usb\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument enable-vm-uuid\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument snapshot-recycle-day\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument enable-openflow-controller\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument fetcherfs-block-size\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument kubelet-run-directory\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument migrate-expect-rate\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument ovn-encap-ip\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument slots\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument 
disable-set-cgroup\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument allow-router-vms\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument binary-memclean-path\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument disable-local-vpc\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument host-type\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument default-write-bps-per-cpu\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument host-cpu-passthrough\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument ovn-integration-bridge\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument pcie-root-port-count\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument disable-probe-kubelet\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument enable-monitor\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument enable-template-backing\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument ovn-south-database\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument rack\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument linux-default-root-user\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument 
max-hotplug-vcpu-count\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument sdn-enable-guest-man\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument ovmf-path\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument dhcp-lease-time\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument no-hpet\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument restrict-qemu-img-convert-worker\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument tap-bridge-name\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument dhcp-server-port\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument local-image-path\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument memory-snapshots-path\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument sdn-pid-file\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument use-boot-vga\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument default-live-migrate-downtime\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument enable-ksm\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument ovn-eip-bridge\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument snapshot-dir-suffix\n[warning 240131 
03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument check-system-services\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument enable-virtio-rng-device\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument local-backup-storage-path\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument default-request-worker-count\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument sdn-enable-tc-man\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument disable-security-group\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument sync-storage-info-duration-second\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument agent-temp-path\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument auto-merge-backing-template\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument enable-fallocate-disk\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument etcd-skip-tls-verify\n[info 240131 03:58:36 options.parseOptions(options.go:354)] Set log level to \"info\"\n[info 2024-01-31 03:58:36 options.parseOptions(options.go:331)] Use configuration file: /opt/yunion/common.conf\n[warning 2024-01-31 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument floppy-count\n[warning 2024-01-31 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument disable-local-vpc\n[warning 2024-01-31 03:58:36 
structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument default-qemu-version\n[warning 2024-01-31 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument host-cpu-passthrough\n[warning 2024-01-31 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument cdrom-count\n[warning 2024-01-31 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument disable-security-group\n[warning 2024-01-31 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument live-migrate-cpu-throttle-max\n[warning 2024-01-31 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument manage-ntp-configuration\n[warning 2024-01-31 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument no-hpet\n[info 2024-01-31 03:58:36 options.parseOptions(options.go:354)] Set log level to \"info\"\n[info 2024-01-31 03:58:36 procutils.WaitZombieLoop(zombie_others.go:36)] My pid is not 1 and no need to wait zombies\n[info 2024-01-31 03:58:36 deployserver.(*SDeployService).InitService(deployserver.go:454)] exec socket path: /var/run/onecloud/exec.sock\nfatal error: sync: unlock of unlocked mutex\n\ngoroutine 1 [running]:\nruntime.throw({0x111a4d0?, 0xc0001d8380?})\n\t/opt/go/src/runtime/panic.go:992 +0x71 fp=0xc0008e5928 sp=0xc0008e58f8 pc=0x4379d1\nsync.throw({0x111a4d0?, 0xf19900?})\n\t/opt/go/src/runtime/panic.go:978 +0x1e fp=0xc0008e5948 sp=0xc0008e5928 pc=0x4656de\nsync.(*Mutex).unlockSlow(0xc0003b3a70, 0xffffffff)\n\t/opt/go/src/sync/mutex.go:220 +0x3c fp=0xc0008e5970 sp=0xc0008e5948 pc=0x474c1c\nsync.(*Mutex).Unlock(...)\n\t/opt/go/src/sync/mutex.go:214\nyunion.io/x/onecloud/pkg/util/xfsutils.UnlockXfsPartition({0xc00073e2b1, 0x24})\n\t/root/go/src/yunion.io/x/onecloud/pkg/util/xfsutils/lock.go:48 +0xf4 fp=0xc0008e59d0 sp=0xc0008e5970 
pc=0xd1a8d4\nyunion.io/x/onecloud/pkg/hostman/guestfs/kvmpart.(*SKVMGuestDiskPartition).Umount.func1()\n\t/root/go/src/yunion.io/x/onecloud/pkg/hostman/guestfs/kvmpart/kvmpart.go:283 +0x45 fp=0xc0008e59f0 sp=0xc0008e59d0 pc=0xd27945\nyunion.io/x/onecloud/pkg/hostman/guestfs/kvmpart.(*SKVMGuestDiskPartition).Umount(0xc00007ac60)\n\t/root/go/src/yunion.io/x/onecloud/pkg/hostman/guestfs/kvmpart/kvmpart.go:303 +0x699 fp=0xc0008e5b88 sp=0xc0008e59f0 pc=0xd277d9\nyunion.io/x/onecloud/pkg/hostman/diskutils/qemu_kvm.(*LocalDiskDriver).UmountRootfs(0xc00023f180?, {0x12c14a0?, 0xc000010120?})\n\t/root/go/src/yunion.io/x/onecloud/pkg/hostman/diskutils/qemu_kvm/local_driver.go:117 +0x3b fp=0xc0008e5ba0 sp=0xc0008e5b88 pc=0xe7a7fb\nyunion.io/x/onecloud/pkg/hostman/diskutils/fsutils.DeployGuestfs.func1()\n\t/root/go/src/yunion.io/x/onecloud/pkg/hostman/diskutils/fsutils/fsutils.go:475 +0x36 fp=0xc0008e5bc8 sp=0xc0008e5ba0 pc=0xd20a16\nyunion.io/x/onecloud/pkg/hostman/diskutils/fsutils.DeployGuestfs({0x12bdd48, 0xc0006ce840}, 0xc0007000f0)\n\t/root/go/src/yunion.io/x/onecloud/pkg/hostman/diskutils/fsutils/fsutils.go:485 +0x366 fp=0xc0008e5c90 sp=0xc0008e5bc8 pc=0xd20906\nyunion.io/x/onecloud/pkg/hostman/diskutils/qemu_kvm.(*LocalDiskDriver).DeployGuestfs(0xc0006ce840?, 0x0?)\n\t/root/go/src/yunion.io/x/onecloud/pkg/hostman/diskutils/qemu_kvm/local_driver.go:123 +0x26 fp=0xc0008e5cb8 sp=0xc0008e5c90 pc=0xe7a866\nyunion.io/x/onecloud/pkg/hostman/diskutils.(*SKVMGuestDisk).DeployGuestfs(...)\n\t/root/go/src/yunion.io/x/onecloud/pkg/hostman/diskutils/kvm.go:144\nyunion.io/x/onecloud/pkg/hostman/hostdeployer/deployserver.(*LocalDeploy).DeployGuestFs(0xc000718000?, 0xc0007000f0)\n\t/root/go/src/yunion.io/x/onecloud/pkg/hostman/hostdeployer/deployserver/localdeploy.go:46 +0x184 fp=0xc0008e5d88 sp=0xc0008e5cb8 pc=0xe85c84\nyunion.io/x/onecloud/pkg/hostman/hostdeployer/deployserver.StartLocalDeploy({0x7ffc852bbcba?, 
0x4?})\n\t/root/go/src/yunion.io/x/onecloud/pkg/hostman/hostdeployer/deployserver/localdeploy.go:130 +0x2a8 fp=0xc0008e5de8 sp=0xc0008e5d88 pc=0xe86dc8\nyunion.io/x/onecloud/pkg/hostman/hostdeployer/deployserver.(*SDeployService).RunService(0xc00017d000?)\n\t/root/go/src/yunion.io/x/onecloud/pkg/hostman/hostdeployer/deployserver/deployserver.go:266 +0x5b fp=0xc0008e5ed0 sp=0xc0008e5de8 pc=0xe83cfb\nyunion.io/x/onecloud/pkg/cloudcommon/service.(*SServiceBase).StartService(0xc0000af458)\n\t/root/go/src/yunion.io/x/onecloud/pkg/cloudcommon/service/services.go:58 +0xfa fp=0xc0008e5f50 sp=0xc0008e5ed0 pc=0xb70e5a\nmain.main()\n\t/root/go/src/yunion.io/x/onecloud/cmd/host-deployer/main.go:28 +0xe5 fp=0xc0008e5f80 sp=0xc0008e5f50 pc=0xe87625\nruntime.main()\n\t/opt/go/src/runtime/proc.go:250 +0x212 fp=0xc0008e5fe0 sp=0xc0008e5f80 pc=0x43a0f2\nruntime.goexit()\n\t/opt/go/src/runtime/asm_amd64.s:1571 +0x1 fp=0xc0008e5fe8 sp=0xc0008e5fe0 pc=0x46aa61\n\ngoroutine 6 [chan receive, 1 minutes]:\nyunion.io/x/pkg/util/signalutils.StartTrap.func1()\n\t/root/go/src/yunion.io/x/onecloud/vendor/yunion.io/x/pkg/util/signalutils/signalutils.go:72 +0xa7\ncreated by yunion.io/x/pkg/util/signalutils.StartTrap\n\t/root/go/src/yunion.io/x/onecloud/vendor/yunion.io/x/pkg/util/signalutils/signalutils.go:62 +0xd4\n\ngoroutine 23 [syscall, 1 minutes]:\nos/signal.signal_recv()\n\t/opt/go/src/runtime/sigqueue.go:151 +0x2f\nos/signal.loop()\n\t/opt/go/src/os/signal/signal_unix.go:23 +0x19\ncreated by os/signal.Notify.func1.1\n\t/opt/go/src/os/signal/signal.go:151 +0x2a\n\ngoroutine 15 [chan send, 1 minutes]:\nyunion.io/x/onecloud/pkg/hostman/guestfs/kvmpart.(*SKVMGuestDiskPartition).mount.func2.1()\n\t/root/go/src/yunion.io/x/onecloud/pkg/hostman/guestfs/kvmpart/kvmpart.go:189 +0x24b\ncreated by yunion.io/x/onecloud/pkg/hostman/guestfs/kvmpart.(*SKVMGuestDiskPartition).mount.func2\n\t/root/go/src/yunion.io/x/onecloud/pkg/hostman/guestfs/kvmpart/kvmpart.go:186 
+0xe5\n","__stage__":"OnDeployGuestComplete","__status__":"error"}
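For context on the trace above: `fatal error: sync: unlock of unlocked mutex` is thrown by the Go runtime when `Unlock` is called on a `sync.Mutex` that is not currently locked. It cannot be caught with `recover`, which is consistent with host-deployer exiting with status 2 right after `UnlockXfsPartition` runs from the deferred function in `Umount`. Below is a minimal sketch of an unlock path that returns an error instead of crashing. It assumes a hypothetical per-key lock table; `KeyedLocks`, `NewKeyedLocks` and `ErrNotLocked` are illustrative names, not the actual `xfsutils` code.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ErrNotLocked is returned when Unlock is called for a key that is not held.
// Hypothetical name, used only for this sketch.
var ErrNotLocked = errors.New("lock is not held")

// KeyedLocks is a hypothetical per-key lock table. Each key owns a
// one-slot channel: a full slot means "locked", an empty slot means "free".
type KeyedLocks struct {
	mu    sync.Mutex
	slots map[string]chan struct{}
}

func NewKeyedLocks() *KeyedLocks {
	return &KeyedLocks{slots: make(map[string]chan struct{})}
}

// slot returns the channel for key, creating it on first use.
func (k *KeyedLocks) slot(key string) chan struct{} {
	k.mu.Lock()
	defer k.mu.Unlock()
	ch, ok := k.slots[key]
	if !ok {
		ch = make(chan struct{}, 1)
		k.slots[key] = ch
	}
	return ch
}

// Lock blocks until the key is free, then marks it as held.
func (k *KeyedLocks) Lock(key string) {
	k.slot(key) <- struct{}{}
}

// Unlock releases the key. If the key is not held it returns ErrNotLocked
// instead of triggering an unrecoverable runtime error.
func (k *KeyedLocks) Unlock(key string) error {
	select {
	case <-k.slot(key):
		return nil
	default:
		return ErrNotLocked
	}
}

func main() {
	locks := NewKeyedLocks()
	locks.Lock("partition-uuid")
	fmt.Println(locks.Unlock("partition-uuid")) // first unlock succeeds
	fmt.Println(locks.Unlock("partition-uuid")) // second unlock reports ErrNotLocked
}
```

With a guard like this, a deferred cleanup that runs when the lock was never taken (for example, when the mount step failed or timed out) would log an error rather than kill the process. Whether that matches the real cause here is an assumption that would need checking against `lock.go` and `kvmpart.go`.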
{
"__reason__": "Deploy guest fs: request deploy guest fs: rpc error: code = Unknown desc = run deploy_guest_fs failed []: \"/opt/yunion/bin/host-deployer --common-config-file /opt/yunion/common.conf --config /opt/yunion/host.conf --deploy-action deploy_guest_fs --deploy-params '{\\\"disk_info\\\":{\\\"path\\\":\\\"rbd:nvmepool/2c87c588-0f36-477b-8ee7-4818c8d585f9:mon_host=172.16.1.216\\\\\\\\;172.16.1.218\\\\\\\\;172.16.1.217:key=AQBz5pZlFX41OBAAJqPwV73/Zxc0nKEjdGb0uw\\\\\\\\=\\\\\\\\=:rados_mon_op_timeout=5:rados_osd_op_timeout=1200:client_mount_timeout=120\\\"},\\\"guest_desc\\\":{\\\"name\\\":\\\"yudao4\\\",\\\"uuid\\\":\\\"584e3a8d-6780-4578-831e-44dcbcd99ca6\\\",\\\"domain\\\":\\\"cloud.onecloud.io\\\",\\\"nics\\\":[{\\\"mac\\\":\\\"00:22:fe:af:dc:53\\\",\\\"ip\\\":\\\"172.16.1.94\\\",\\\"net\\\":\\\"static\\\",\\\"net_id\\\":\\\"ce676c71-febf-4ca1-8ecf-6add3aa5215e\\\",\\\"gateway\\\":\\\"172.16.1.1\\\",\\\"dns\\\":\\\"172.16.1.200\\\",\\\"domain\\\":\\\"cloud.onecloud.io\\\",\\\"ifname\\\":\\\"static-94\\\",\\\"masklen\\\":24,\\\"driver\\\":\\\"virtio\\\",\\\"bridge\\\":\\\"br1\\\",\\\"wire_id\\\":\\\"2a4e5367-e4c5-4410-81dd-217698d99ff2\\\",\\\"vlan\\\":1,\\\"interface\\\":\\\"bond0\\\",\\\"bw\\\":1000,\\\"mtu\\\":1500}],\\\"disks\\\":[{\\\"disk_id\\\":\\\"2c87c588-0f36-477b-8ee7-4818c8d585f9\\\",\\\"driver\\\":\\\"scsi\\\",\\\"cache_mode\\\":\\\"none\\\",\\\"aio_mode\\\":\\\"native\\\",\\\"size\\\":51200,\\\"template_id\\\":\\\"5fd11cb5-ad0a-419e-8ba0-a77d009d60d6\\\",\\\"storage_id\\\":\\\"1b298235-a82f-4579-8b7a-e6dd2d9916d3\\\",\\\"path\\\":\\\"rbd:nvmepool/2c87c588-0f36-477b-8ee7-4818c8d585f9\\\",\\\"format\\\":\\\"raw\\\"}],\\\"Hypervisor\\\":\\\"kvm\\\",\\\"hostname\\\":\\\"yudao4\\\"},\\\"deploy_info\\\":{\\\"public_key\\\":{\\\"admin_public_key\\\":\\\"ssh-rsa 
AAAAB3NzaC1yc2EAAAADAQABAAABAQDG7U+zsDlTXjbDWg4/C0NElAGPJ2CXrs8dh89ftJFjPbB5W9ghrVoen4UTBBm6GqXc4hl5zGVM2zL2H31n85HfYgBo47uKFEKu9c4DpSdiTBf15zBEvhNZziOJ0FEhwglZ1WRvSKDd2+3AH23WMp++btcz/ruhbib2mdUW9nwfQj783Sl+WfJ9Ss6p3RthRtolDxrpSXAIP5KH41jwYvCLPMLBndh5sz3fHuB6AfpbjYgG++pBrhf0rtemj5f1ZtgbvQ5IlYs5L1QUcctA6BbzwlRPbaNvSaM6+hjiU3g7Fm68qmT+4uNBRVKqip0hBkMBJSW8A8ZUSLIvP4G4DDXF\\\\n\\\",\\\"project_public_key\\\":\\\"ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDQxBHbbAyqBKf71sa4+xLV/9gTkZe7kIJgSyU+9ViGqfzN9B0TjBqL4pnZujHUl4Gch4EK9TGg3FtQNWTBHETRMaB4JVrjSpu4uXEYRj3EVVqJKCwwWNOoy4hj7eHmEaAFkw8CVNvBlJAPFXVXUIcPZplQQQI/Da5gUfZ8beGIlrhBWtz2Julw/5sxPiaENm2PPItiw6iZnPZ88/bZCvSHy0Cx2odZE3TJrN3H5Zob/3O09n8wCqPUrvMz9ibKb9z5iT0ANLnKtSCQW1xxIml5JlSFLEPPKFEyCdrE2mTsfPp7Gc+BUD9/KZy+8hih6gfS+dL1kK6OPOVfJLxDcjNJ\\\\n\\\"},\\\"is_init\\\":true,\\\"default_root_user\\\":true,\\\"windows_default_admin_user\\\":true,\\\"telegraf\\\":{\\\"telegraf_conf\\\":\\\"### MANAGED BY ansible-telegraf ANSIBLE ROLE ###\\\\n\\\\n[global_tags]\\\\n\\\\n os_type = \\\\\\\"Linux\\\\\\\"\\\\n status = \\\\\\\"start_deploy\\\\\\\"\\\\n tenant_id = \\\\\\\"2e152fe0619046a38081d7e487028358\\\\\\\"\\\\n scaling_group_id = \\\\\\\"\\\\\\\"\\\\n domain_id = \\\\\\\"default\\\\\\\"\\\\n vm_name = \\\\\\\"yudao4\\\\\\\"\\\\n zone = \\\\\\\"华南-广州\\\\\\\"\\\\n zone_id = \\\\\\\"7b6ae896-1b3d-40e5-879f-cfd00799200b\\\\\\\"\\\\n tenant = \\\\\\\"system\\\\\\\"\\\\n host = \\\\\\\"node5-172-16-1-218\\\\\\\"\\\\n host_id = \\\\\\\"a97714c5-543d-40ce-8098-414f4fbb9e25\\\\\\\"\\\\n vm_ip = \\\\\\\"172.16.1.94\\\\\\\"\\\\n region_ext_id = \\\\\\\"\\\\\\\"\\\\n brand = \\\\\\\"OneCloud\\\\\\\"\\\\n project_domain = \\\\\\\"Default\\\\\\\"\\\\n vm_id = \\\\\\\"584e3a8d-6780-4578-831e-44dcbcd99ca6\\\\\\\"\\\\n zone_ext_id = \\\\\\\"\\\\\\\"\\\\n cloudregion = \\\\\\\"Default\\\\\\\"\\\\n cloudregion_id = \\\\\\\"default\\\\\\\"\\\\n\\\\n# Configuration for telegraf agent\\\\n[agent]\\\\n interval = \\\\\\\"60s\\\\\\\"\\\\n debug = 
false\\\\n hostname = \\\\\\\"\\\\\\\"\\\\n round_interval = true\\\\n flush_interval = \\\\\\\"60s\\\\\\\"\\\\n flush_jitter = \\\\\\\"0s\\\\\\\"\\\\n collection_jitter = \\\\\\\"0s\\\\\\\"\\\\n metric_batch_size = 1000\\\\n metric_buffer_limit = 10000\\\\n quiet = false\\\\n logfile = \\\\\\\"/var/log/telegraf.log\\\\\\\"\\\\n logfile_rotation_max_size = \\\\\\\"10MB\\\\\\\"\\\\n logfile_rotation_max_archives = 1\\\\n omit_hostname = true\\\\n\\\\n###############################################################################\\\\n# OUTPUTS #\\\\n###############################################################################\\\\n\\\\n[[outputs.influxdb]]\\\\n urls = [\\\\\\\"http://169.254.169.254/monitor\\\\\\\"]\\\\n database = \\\\\\\"telegraf\\\\\\\"\\\\n insecure_skip_verify = true\\\\n\\\\n###############################################################################\\\\n# INPUTS #\\\\n###############################################################################\\\\n[[inputs.cpu]]\\\\n name_prefix = \\\\\\\"agent_\\\\\\\"\\\\n percpu = true\\\\n totalcpu = true\\\\n collect_cpu_time = false\\\\n report_active = true\\\\n[[inputs.disk]]\\\\n name_prefix = \\\\\\\"agent_\\\\\\\"\\\\n ignore_fs = [\\\\\\\"tmpfs\\\\\\\", \\\\\\\"devtmpfs\\\\\\\", \\\\\\\"overlay\\\\\\\", \\\\\\\"squashfs\\\\\\\", \\\\\\\"iso9660\\\\\\\"]\\\\n[[inputs.diskio]]\\\\n name_prefix = \\\\\\\"agent_\\\\\\\"\\\\n skip_serial_number = false\\\\n[[inputs.kernel]]\\\\n name_prefix = \\\\\\\"agent_\\\\\\\"\\\\n[[inputs.kernel_vmstat]]\\\\n name_prefix = \\\\\\\"agent_\\\\\\\"\\\\n[[inputs.mem]]\\\\n name_prefix = \\\\\\\"agent_\\\\\\\"\\\\n[[inputs.processes]]\\\\n name_prefix = \\\\\\\"agent_\\\\\\\"\\\\n[[inputs.swap]]\\\\n name_prefix = \\\\\\\"agent_\\\\\\\"\\\\n[[inputs.system]]\\\\n name_prefix = \\\\\\\"agent_\\\\\\\"\\\\n[[inputs.net]]\\\\n name_prefix = \\\\\\\"agent_\\\\\\\"\\\\n[[inputs.netstat]]\\\\n name_prefix = \\\\\\\"agent_\\\\\\\"\\\\n[[inputs.nstat]]\\\\n name_prefix = 
\\\\\\\"agent_\\\\\\\"\\\\n[[inputs.internal]]\\\\n name_prefix = \\\\\\\"agent_\\\\\\\"\\\\n collect_memstats = false\\\\n\\\"}}}'\" error: Process exited with status 2, cmd error: [info 240131 03:58:36 options.parseOptions(options.go:331)] Use configuration file: /opt/yunion/host.conf\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument ovn-mapped-bridge\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument ovn-underlay-mtu\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument auto-merge-delay-seconds\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument cdrom-count\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument live-migrate-cpu-throttle-max\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument min-migrate-timeout-seconds\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument host-health-timeout\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument max-reserved-memory\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument set-vnc-password\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument zero-clean-disk-data\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument allow-switch-vms\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument disable-kvm\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument 
recycle-diskfile-keep-days\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument sdn-allow-conntrack-invalid\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument sdn-enable-tap-man\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument sdn-socket-path\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument tunnel-padding-bytes\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument local-backup-temp-path\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument bridge-driver\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument default-image-save-format\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument dhcp-renewal-time\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument disable-custom-device\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument floppy-count\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument agent-temp-limit\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument recycle-diskfile\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument default-write-iops-per-cpu\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument local-config-file\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find 
argument enable-qemu-debug-log\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument image-cache-path\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument windows-default-admin-user\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument bw-download-bandwidth\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument etcd-use-tls\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument report-interval\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument ethtool-enable-gso\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument ping-region-interval\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument host-lease-timeout\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument bandwidth-limit\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument enable-cpu-binding\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument manage-ntp-configuration\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument sdn-enable-eip-man\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument fetcherfs-path\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument servers-path\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument 
block-io-scheduler\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument default-read-iops-per-cpu\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument disable-gpu\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument enable-telegraf\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument always-recycle-diskfile\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument default-read-bps-per-cpu\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument disable-usb\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument enable-vm-uuid\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument snapshot-recycle-day\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument enable-openflow-controller\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument fetcherfs-block-size\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument kubelet-run-directory\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument migrate-expect-rate\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument ovn-encap-ip\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument slots\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument 
disable-set-cgroup\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument allow-router-vms\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument binary-memclean-path\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument disable-local-vpc\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument host-type\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument default-write-bps-per-cpu\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument host-cpu-passthrough\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument ovn-integration-bridge\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument pcie-root-port-count\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument disable-probe-kubelet\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument enable-monitor\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument enable-template-backing\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument ovn-south-database\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument rack\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument linux-default-root-user\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument 
max-hotplug-vcpu-count\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument sdn-enable-guest-man\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument ovmf-path\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument dhcp-lease-time\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument no-hpet\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument restrict-qemu-img-convert-worker\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument tap-bridge-name\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument dhcp-server-port\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument local-image-path\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument memory-snapshots-path\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument sdn-pid-file\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument use-boot-vga\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument default-live-migrate-downtime\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument enable-ksm\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument ovn-eip-bridge\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument snapshot-dir-suffix\n[warning 240131 
03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument check-system-services\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument enable-virtio-rng-device\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument local-backup-storage-path\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument default-request-worker-count\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument sdn-enable-tc-man\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument disable-security-group\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument sync-storage-info-duration-second\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument agent-temp-path\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument auto-merge-backing-template\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument enable-fallocate-disk\n[warning 240131 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument etcd-skip-tls-verify\n[info 240131 03:58:36 options.parseOptions(options.go:354)] Set log level to \"info\"\n[info 2024-01-31 03:58:36 options.parseOptions(options.go:331)] Use configuration file: /opt/yunion/common.conf\n[warning 2024-01-31 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument floppy-count\n[warning 2024-01-31 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument disable-local-vpc\n[warning 2024-01-31 03:58:36 
structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument default-qemu-version\n[warning 2024-01-31 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument host-cpu-passthrough\n[warning 2024-01-31 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument cdrom-count\n[warning 2024-01-31 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument disable-security-group\n[warning 2024-01-31 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument live-migrate-cpu-throttle-max\n[warning 2024-01-31 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument manage-ntp-configuration\n[warning 2024-01-31 03:58:36 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument no-hpet\n[info 2024-01-31 03:58:36 options.parseOptions(options.go:354)] Set log level to \"info\"\n[info 2024-01-31 03:58:36 procutils.WaitZombieLoop(zombie_others.go:36)] My pid is not 1 and no need to wait zombies\n[info 2024-01-31 03:58:36 deployserver.(*SDeployService).InitService(deployserver.go:454)] exec socket path: /var/run/onecloud/exec.sock\nfatal error: sync: unlock of unlocked mutex\n\ngoroutine 1 [running]:\nruntime.throw({0x111a4d0?, 0xc0001d8380?})\n\t/opt/go/src/runtime/panic.go:992 +0x71 fp=0xc0008e5928 sp=0xc0008e58f8 pc=0x4379d1\nsync.throw({0x111a4d0?, 0xf19900?})\n\t/opt/go/src/runtime/panic.go:978 +0x1e fp=0xc0008e5948 sp=0xc0008e5928 pc=0x4656de\nsync.(*Mutex).unlockSlow(0xc0003b3a70, 0xffffffff)\n\t/opt/go/src/sync/mutex.go:220 +0x3c fp=0xc0008e5970 sp=0xc0008e5948 pc=0x474c1c\nsync.(*Mutex).Unlock(...)\n\t/opt/go/src/sync/mutex.go:214\nyunion.io/x/onecloud/pkg/util/xfsutils.UnlockXfsPartition({0xc00073e2b1, 0x24})\n\t/root/go/src/yunion.io/x/onecloud/pkg/util/xfsutils/lock.go:48 +0xf4 fp=0xc0008e59d0 sp=0xc0008e5970 
pc=0xd1a8d4\nyunion.io/x/onecloud/pkg/hostman/guestfs/kvmpart.(*SKVMGuestDiskPartition).Umount.func1()\n\t/root/go/src/yunion.io/x/onecloud/pkg/hostman/guestfs/kvmpart/kvmpart.go:283 +0x45 fp=0xc0008e59f0 sp=0xc0008e59d0 pc=0xd27945\nyunion.io/x/onecloud/pkg/hostman/guestfs/kvmpart.(*SKVMGuestDiskPartition).Umount(0xc00007ac60)\n\t/root/go/src/yunion.io/x/onecloud/pkg/hostman/guestfs/kvmpart/kvmpart.go:303 +0x699 fp=0xc0008e5b88 sp=0xc0008e59f0 pc=0xd277d9\nyunion.io/x/onecloud/pkg/hostman/diskutils/qemu_kvm.(*LocalDiskDriver).UmountRootfs(0xc00023f180?, {0x12c14a0?, 0xc000010120?})\n\t/root/go/src/yunion.io/x/onecloud/pkg/hostman/diskutils/qemu_kvm/local_driver.go:117 +0x3b fp=0xc0008e5ba0 sp=0xc0008e5b88 pc=0xe7a7fb\nyunion.io/x/onecloud/pkg/hostman/diskutils/fsutils.DeployGuestfs.func1()\n\t/root/go/src/yunion.io/x/onecloud/pkg/hostman/diskutils/fsutils/fsutils.go:475 +0x36 fp=0xc0008e5bc8 sp=0xc0008e5ba0 pc=0xd20a16\nyunion.io/x/onecloud/pkg/hostman/diskutils/fsutils.DeployGuestfs({0x12bdd48, 0xc0006ce840}, 0xc0007000f0)\n\t/root/go/src/yunion.io/x/onecloud/pkg/hostman/diskutils/fsutils/fsutils.go:485 +0x366 fp=0xc0008e5c90 sp=0xc0008e5bc8 pc=0xd20906\nyunion.io/x/onecloud/pkg/hostman/diskutils/qemu_kvm.(*LocalDiskDriver).DeployGuestfs(0xc0006ce840?, 0x0?)\n\t/root/go/src/yunion.io/x/onecloud/pkg/hostman/diskutils/qemu_kvm/local_driver.go:123 +0x26 fp=0xc0008e5cb8 sp=0xc0008e5c90 pc=0xe7a866\nyunion.io/x/onecloud/pkg/hostman/diskutils.(*SKVMGuestDisk).DeployGuestfs(...)\n\t/root/go/src/yunion.io/x/onecloud/pkg/hostman/diskutils/kvm.go:144\nyunion.io/x/onecloud/pkg/hostman/hostdeployer/deployserver.(*LocalDeploy).DeployGuestFs(0xc000718000?, 0xc0007000f0)\n\t/root/go/src/yunion.io/x/onecloud/pkg/hostman/hostdeployer/deployserver/localdeploy.go:46 +0x184 fp=0xc0008e5d88 sp=0xc0008e5cb8 pc=0xe85c84\nyunion.io/x/onecloud/pkg/hostman/hostdeployer/deployserver.StartLocalDeploy({0x7ffc852bbcba?, 
0x4?})\n\t/root/go/src/yunion.io/x/onecloud/pkg/hostman/hostdeployer/deployserver/localdeploy.go:130 +0x2a8 fp=0xc0008e5de8 sp=0xc0008e5d88 pc=0xe86dc8\nyunion.io/x/onecloud/pkg/hostman/hostdeployer/deployserver.(*SDeployService).RunService(0xc00017d000?)\n\t/root/go/src/yunion.io/x/onecloud/pkg/hostman/hostdeployer/deployserver/deployserver.go:266 +0x5b fp=0xc0008e5ed0 sp=0xc0008e5de8 pc=0xe83cfb\nyunion.io/x/onecloud/pkg/cloudcommon/service.(*SServiceBase).StartService(0xc0000af458)\n\t/root/go/src/yunion.io/x/onecloud/pkg/cloudcommon/service/services.go:58 +0xfa fp=0xc0008e5f50 sp=0xc0008e5ed0 pc=0xb70e5a\nmain.main()\n\t/root/go/src/yunion.io/x/onecloud/cmd/host-deployer/main.go:28 +0xe5 fp=0xc0008e5f80 sp=0xc0008e5f50 pc=0xe87625\nruntime.main()\n\t/opt/go/src/runtime/proc.go:250 +0x212 fp=0xc0008e5fe0 sp=0xc0008e5f80 pc=0x43a0f2\nruntime.goexit()\n\t/opt/go/src/runtime/asm_amd64.s:1571 +0x1 fp=0xc0008e5fe8 sp=0xc0008e5fe0 pc=0x46aa61\n\ngoroutine 6 [chan receive, 1 minutes]:\nyunion.io/x/pkg/util/signalutils.StartTrap.func1()\n\t/root/go/src/yunion.io/x/onecloud/vendor/yunion.io/x/pkg/util/signalutils/signalutils.go:72 +0xa7\ncreated by yunion.io/x/pkg/util/signalutils.StartTrap\n\t/root/go/src/yunion.io/x/onecloud/vendor/yunion.io/x/pkg/util/signalutils/signalutils.go:62 +0xd4\n\ngoroutine 23 [syscall, 1 minutes]:\nos/signal.signal_recv()\n\t/opt/go/src/runtime/sigqueue.go:151 +0x2f\nos/signal.loop()\n\t/opt/go/src/os/signal/signal_unix.go:23 +0x19\ncreated by os/signal.Notify.func1.1\n\t/opt/go/src/os/signal/signal.go:151 +0x2a\n\ngoroutine 15 [chan send, 1 minutes]:\nyunion.io/x/onecloud/pkg/hostman/guestfs/kvmpart.(*SKVMGuestDiskPartition).mount.func2.1()\n\t/root/go/src/yunion.io/x/onecloud/pkg/hostman/guestfs/kvmpart/kvmpart.go:189 +0x24b\ncreated by yunion.io/x/onecloud/pkg/hostman/guestfs/kvmpart.(*SKVMGuestDiskPartition).mount.func2\n\t/root/go/src/yunion.io/x/onecloud/pkg/hostman/guestfs/kvmpart/kvmpart.go:186 +0xe5\n",
"__stage__": "OnDeployGuestComplete",
"__status__": "error"
}
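@chenjacken Thanks for the feedback; we'll look into the deployment failure. Uploading an image to the image management service runs qemu-img convert once. That step can be very slow, and the conversion is I/O-heavy.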
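OK, thanks! Also, the two steps whose status stays slow are "caching image" (缓存镜像) and "allocating disk" (分配磁盘):
1. Caching image: does this step transfer the image file from MinIO to Ceph? How can it be sped up?
2. Allocating disk: the image is already cached in Ceph, so allocation should be fast, yet this status also lasts a long time.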
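@chenjacken Transferring the image to Ceph does take a long time; how long depends on the I/O speed on both sides. It is also a one-time operation, so later creations should not hit this. Once the image is cached in Ceph, creation should be fast, so this looks like a deployment failure. It seems to be an XFS-related problem; I'll try to reproduce it in our environment.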
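For readers following the trace: the deployer dies with `fatal error: sync: unlock of unlocked mutex`, raised under `xfsutils.UnlockXfsPartition` while `Umount` runs. In Go, calling `Unlock` on a `sync.Mutex` that is not locked is a fatal runtime error that `recover` cannot catch, so the whole process exits. The sketch below is not the onecloud code; `keyedLocks`, `TryLock`, and `Unlock` are hypothetical names showing one way a per-partition lock table could report an unmatched unlock as an ordinary error instead of crashing.

```go
package main

import (
	"fmt"
	"sync"
)

// keyedLocks is a hypothetical per-key lock table (for example keyed by a
// filesystem UUID). It is an illustration, not the xfsutils implementation.
type keyedLocks struct {
	mu   sync.Mutex      // guards held
	held map[string]bool // keys currently locked
}

func newKeyedLocks() *keyedLocks {
	return &keyedLocks{held: make(map[string]bool)}
}

// TryLock marks key as held and reports whether this call acquired it.
func (k *keyedLocks) TryLock(key string) bool {
	k.mu.Lock()
	defer k.mu.Unlock()
	if k.held[key] {
		return false
	}
	k.held[key] = true
	return true
}

// Unlock releases key. Unlocking a key that is not held returns an error
// rather than triggering a fatal runtime error.
func (k *keyedLocks) Unlock(key string) error {
	k.mu.Lock()
	defer k.mu.Unlock()
	if !k.held[key] {
		return fmt.Errorf("unlock of unlocked key %q", key)
	}
	delete(k.held, key)
	return nil
}

func main() {
	locks := newKeyedLocks()
	uuid := "example-fs-uuid" // placeholder value, not taken from the logs

	fmt.Println("acquired:", locks.TryLock(uuid))
	fmt.Println("first unlock error:", locks.Unlock(uuid))
	fmt.Println("second unlock error:", locks.Unlock(uuid))
}
```

With a table like this, a cleanup path that runs after a failed or skipped lock step gets an error it can log, rather than taking down the deploy process.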
Thanks for your hard work! 🌹
Automatic migration after a host goes down also fails:
start_migrate=>migrate_failed: "{\"error\":{\"class\":\"ClientError\",\"code\":499,\"details\":\"Post \\\"https://172.16.1.218:8885/servers/b6e37fdf-7301-495f-8eb2-268bbdaa5b79/dest-prepare-migrate\\\": context deadline exceeded (Client.Timeout exceeded while awaiting headers)\",\"request\":{\"body\":\"{\\\"desc\\\":{\\\"bios\\\":\\\"BIOS\\\",\\\"boot_order\\\":\\\"cdn\\\",\\\"cpu\\\":4,\\\"disks\\\":[{\\\"aio_mode\\\":\\\"native\\\",\\\"boot_index\\\":-1,\\\"bps...src_memory_snapshots\\\":[]}\",\"headers\":{\"Content-Length\":\"6678\",\"Content-Type\":\"application/json\",\"User-Agent\":\"yunioncloud-go/201708\",\"X-Auth-Token\":\"*\",\"X-Region-Version\":\"v2\",\"X-Task-Id\":\"e1defafc-849a-44f2-886d-9cfc16870d4d\",\"X-Task-Notify-Url\":\"https://default-region:30888/tasks/e1defafc-849a-44f2-886d-9cfc16870d4d\"},\"method\":\"POST\",\"url\":\"https://172.16.1.218:8885/servers/b6e37fdf-7301-495f-8eb2-268bbdaa5b79/dest-prepare-migrate\"}}}"
Is this an error or a timeout while running some command (Client.Timeout exceeded while awaiting headers)?
@chenjacken This error indicates that the connection to the migration destination host failed. Is the destination host working normally?
I'll look into this further. Thanks!
Creating a new VM from a new image hits the same problem:
{
"__reason__": "Deploy guest fs: request deploy guest fs: rpc error: code = Unknown desc = run deploy_guest_fs failed []: \"/opt/yunion/bin/host-deployer --common-config-file /opt/yunion/common.conf --config /opt/yunion/host.conf --deploy-action deploy_guest_fs --deploy-params '{\\\"disk_info\\\":{\\\"path\\\":\\\"rbd:nvmepool/9c380d1a-dcf8-443c-8646-9bf67a592158:mon_host=172.16.1.216\\\\\\\\;172.16.1.218\\\\\\\\;172.16.1.217:key=AQBz5pZlFX41OBAAJqPwV73/Zxc0nKEjdGb0uw\\\\\\\\=\\\\\\\\=:rados_mon_op_timeout=5:rados_osd_op_timeout=1200:client_mount_timeout=120\\\"},\\\"guest_desc\\\":{\\\"name\\\":\\\"HWSaaS\\\",\\\"uuid\\\":\\\"14333d0c-6241-4f78-8912-7834ac74d4d7\\\",\\\"domain\\\":\\\"cloud.onecloud.io\\\",\\\"nics\\\":[{\\\"mac\\\":\\\"00:22:31:ad:35:2d\\\",\\\"ip\\\":\\\"172.16.1.198\\\",\\\"net\\\":\\\"vm-static-net\\\",\\\"net_id\\\":\\\"e532366c-2ba4-4fed-895b-efd402812149\\\",\\\"gateway\\\":\\\"172.16.1.1\\\",\\\"dns\\\":\\\"172.16.1.200\\\",\\\"domain\\\":\\\"cloud.onecloud.io\\\",\\\"ifname\\\":\\\"dhcp1-dsf\\\",\\\"masklen\\\":24,\\\"driver\\\":\\\"virtio\\\",\\\"bridge\\\":\\\"br1\\\",\\\"wire_id\\\":\\\"2a4e5367-e4c5-4410-81dd-217698d99ff2\\\",\\\"vlan\\\":1,\\\"interface\\\":\\\"bond0\\\",\\\"bw\\\":1000,\\\"mtu\\\":1500}],\\\"disks\\\":[{\\\"disk_id\\\":\\\"9c380d1a-dcf8-443c-8646-9bf67a592158\\\",\\\"driver\\\":\\\"scsi\\\",\\\"cache_mode\\\":\\\"none\\\",\\\"aio_mode\\\":\\\"native\\\",\\\"size\\\":102400,\\\"template_id\\\":\\\"d4d0b10b-89d3-49c4-88c8-d3528312d5c1\\\",\\\"storage_id\\\":\\\"1b298235-a82f-4579-8b7a-e6dd2d9916d3\\\",\\\"path\\\":\\\"rbd:nvmepool/9c380d1a-dcf8-443c-8646-9bf67a592158\\\",\\\"format\\\":\\\"raw\\\"}],\\\"Hypervisor\\\":\\\"kvm\\\",\\\"hostname\\\":\\\"HWSaaS\\\"},\\\"deploy_info\\\":{\\\"public_key\\\":{\\\"admin_public_key\\\":\\\"ssh-rsa 
AAAAB3NzaC1yc2EAAAADAQABAAABAQDG7U+zsDlTXjbDWg4/C0NElAGPJ2CXrs8dh89ftJFjPbB5W9ghrVoen4UTBBm6GqXc4hl5zGVM2zL2H31n85HfYgBo47uKFEKu9c4DpSdiTBf15zBEvhNZziOJ0FEhwglZ1WRvSKDd2+3AH23WMp++btcz/ruhbib2mdUW9nwfQj783Sl+WfJ9Ss6p3RthRtolDxrpSXAIP5KH41jwYvCLPMLBndh5sz3fHuB6AfpbjYgG++pBrhf0rtemj5f1ZtgbvQ5IlYs5L1QUcctA6BbzwlRPbaNvSaM6+hjiU3g7Fm68qmT+4uNBRVKqip0hBkMBJSW8A8ZUSLIvP4G4DDXF\\\\n\\\",\\\"project_public_key\\\":\\\"ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDQxBHbbAyqBKf71sa4+xLV/9gTkZe7kIJgSyU+9ViGqfzN9B0TjBqL4pnZujHUl4Gch4EK9TGg3FtQNWTBHETRMaB4JVrjSpu4uXEYRj3EVVqJKCwwWNOoy4hj7eHmEaAFkw8CVNvBlJAPFXVXUIcPZplQQQI/Da5gUfZ8beGIlrhBWtz2Julw/5sxPiaENm2PPItiw6iZnPZ88/bZCvSHy0Cx2odZE3TJrN3H5Zob/3O09n8wCqPUrvMz9ibKb9z5iT0ANLnKtSCQW1xxIml5JlSFLEPPKFEyCdrE2mTsfPp7Gc+BUD9/KZy+8hih6gfS+dL1kK6OPOVfJLxDcjNJ\\\\n\\\"},\\\"is_init\\\":true,\\\"default_root_user\\\":true,\\\"windows_default_admin_user\\\":true,\\\"telegraf\\\":{\\\"telegraf_conf\\\":\\\"### MANAGED BY ansible-telegraf ANSIBLE ROLE ###\\\\n\\\\n[global_tags]\\\\n\\\\n vm_ip = \\\\\\\"172.16.1.198\\\\\\\"\\\\n vm_name = \\\\\\\"HWSaaS\\\\\\\"\\\\n status = \\\\\\\"start_deploy\\\\\\\"\\\\n tenant = \\\\\\\"system\\\\\\\"\\\\n brand = \\\\\\\"OneCloud\\\\\\\"\\\\n scaling_group_id = \\\\\\\"\\\\\\\"\\\\n project_domain = \\\\\\\"Default\\\\\\\"\\\\n host = \\\\\\\"node9-172-16-1-233\\\\\\\"\\\\n os_type = \\\\\\\"Linux\\\\\\\"\\\\n cloudregion = \\\\\\\"Default\\\\\\\"\\\\n region_ext_id = \\\\\\\"\\\\\\\"\\\\n vm_id = \\\\\\\"14333d0c-6241-4f78-8912-7834ac74d4d7\\\\\\\"\\\\n zone = \\\\\\\"华南-广州\\\\\\\"\\\\n zone_id = \\\\\\\"7b6ae896-1b3d-40e5-879f-cfd00799200b\\\\\\\"\\\\n cloudregion_id = \\\\\\\"default\\\\\\\"\\\\n tenant_id = \\\\\\\"2e152fe0619046a38081d7e487028358\\\\\\\"\\\\n host_id = \\\\\\\"33184fbe-77c0-4aad-8460-f3b27f8648fc\\\\\\\"\\\\n zone_ext_id = \\\\\\\"\\\\\\\"\\\\n domain_id = \\\\\\\"default\\\\\\\"\\\\n\\\\n# Configuration for telegraf agent\\\\n[agent]\\\\n interval = \\\\\\\"60s\\\\\\\"\\\\n debug = 
false\\\\n hostname = \\\\\\\"\\\\\\\"\\\\n round_interval = true\\\\n flush_interval = \\\\\\\"60s\\\\\\\"\\\\n flush_jitter = \\\\\\\"0s\\\\\\\"\\\\n collection_jitter = \\\\\\\"0s\\\\\\\"\\\\n metric_batch_size = 1000\\\\n metric_buffer_limit = 10000\\\\n quiet = false\\\\n logfile = \\\\\\\"/var/log/telegraf.log\\\\\\\"\\\\n logfile_rotation_max_size = \\\\\\\"10MB\\\\\\\"\\\\n logfile_rotation_max_archives = 1\\\\n omit_hostname = true\\\\n\\\\n###############################################################################\\\\n# OUTPUTS #\\\\n###############################################################################\\\\n\\\\n[[outputs.influxdb]]\\\\n urls = [\\\\\\\"http://169.254.169.254/monitor\\\\\\\"]\\\\n database = \\\\\\\"telegraf\\\\\\\"\\\\n insecure_skip_verify = true\\\\n\\\\n###############################################################################\\\\n# INPUTS #\\\\n###############################################################################\\\\n[[inputs.cpu]]\\\\n name_prefix = \\\\\\\"agent_\\\\\\\"\\\\n percpu = true\\\\n totalcpu = true\\\\n collect_cpu_time = false\\\\n report_active = true\\\\n[[inputs.disk]]\\\\n name_prefix = \\\\\\\"agent_\\\\\\\"\\\\n ignore_fs = [\\\\\\\"tmpfs\\\\\\\", \\\\\\\"devtmpfs\\\\\\\", \\\\\\\"overlay\\\\\\\", \\\\\\\"squashfs\\\\\\\", \\\\\\\"iso9660\\\\\\\"]\\\\n[[inputs.diskio]]\\\\n name_prefix = \\\\\\\"agent_\\\\\\\"\\\\n skip_serial_number = false\\\\n[[inputs.kernel]]\\\\n name_prefix = \\\\\\\"agent_\\\\\\\"\\\\n[[inputs.kernel_vmstat]]\\\\n name_prefix = \\\\\\\"agent_\\\\\\\"\\\\n[[inputs.mem]]\\\\n name_prefix = \\\\\\\"agent_\\\\\\\"\\\\n[[inputs.processes]]\\\\n name_prefix = \\\\\\\"agent_\\\\\\\"\\\\n[[inputs.swap]]\\\\n name_prefix = \\\\\\\"agent_\\\\\\\"\\\\n[[inputs.system]]\\\\n name_prefix = \\\\\\\"agent_\\\\\\\"\\\\n[[inputs.net]]\\\\n name_prefix = \\\\\\\"agent_\\\\\\\"\\\\n[[inputs.netstat]]\\\\n name_prefix = \\\\\\\"agent_\\\\\\\"\\\\n[[inputs.nstat]]\\\\n name_prefix = 
\\\\\\\"agent_\\\\\\\"\\\\n[[inputs.internal]]\\\\n name_prefix = \\\\\\\"agent_\\\\\\\"\\\\n collect_memstats = false\\\\n\\\"}}}'\" error: Process exited with status 2, cmd error: [info 240202 03:34:22 options.parseOptions(options.go:331)] Use configuration file: /opt/yunion/host.conf\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument sdn-enable-tc-man\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument agent-temp-path\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument max-reserved-memory\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument floppy-count\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument host-cpu-passthrough\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument default-live-migrate-downtime\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument windows-default-admin-user\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument use-boot-vga\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument servers-path\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument slots\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument auto-merge-delay-seconds\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument bandwidth-limit\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument 
default-read-iops-per-cpu\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument disable-kvm\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument enable-cpu-binding\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument local-config-file\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument tunnel-padding-bytes\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument allow-router-vms\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument default-write-iops-per-cpu\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument fetcherfs-block-size\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument migrate-expect-rate\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument check-system-services\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument dhcp-server-port\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument manage-ntp-configuration\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument allow-switch-vms\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument enable-qemu-debug-log\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument fetcherfs-path\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument 
ovn-south-database\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument sdn-enable-guest-man\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument disable-probe-kubelet\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument report-interval\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument sdn-socket-path\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument disable-security-group\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument recycle-diskfile\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument binary-memclean-path\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument min-migrate-timeout-seconds\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument snapshot-dir-suffix\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument snapshot-recycle-day\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument enable-ksm\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument local-backup-storage-path\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument enable-monitor\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument live-migrate-cpu-throttle-max\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find 
argument sdn-allow-conntrack-invalid\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument bridge-driver\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument default-request-worker-count\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument ovn-eip-bridge\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument pcie-root-port-count\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument default-read-bps-per-cpu\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument enable-openflow-controller\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument cdrom-count\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument sdn-pid-file\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument local-backup-temp-path\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument host-type\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument ovn-integration-bridge\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument enable-fallocate-disk\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument ethtool-enable-gso\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument sdn-enable-eip-man\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find 
argument auto-merge-backing-template\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument host-lease-timeout\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument dhcp-lease-time\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument no-hpet\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument recycle-diskfile-keep-days\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument tap-bridge-name\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument always-recycle-diskfile\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument disable-custom-device\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument block-io-scheduler\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument ovmf-path\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument sync-storage-info-duration-second\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument dhcp-renewal-time\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument enable-vm-uuid\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument disable-usb\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument enable-telegraf\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument 
default-image-save-format\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument disable-gpu\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument local-image-path\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument ovn-mapped-bridge\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument ovn-underlay-mtu\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument disable-set-cgroup\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument host-health-timeout\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument image-cache-path\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument ovn-encap-ip\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument agent-temp-limit\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument rack\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument sdn-enable-tap-man\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument enable-virtio-rng-device\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument max-hotplug-vcpu-count\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument restrict-qemu-img-convert-worker\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument 
disable-local-vpc\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument linux-default-root-user\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument zero-clean-disk-data\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument ping-region-interval\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument etcd-skip-tls-verify\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument set-vnc-password\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument enable-template-backing\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument etcd-use-tls\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument kubelet-run-directory\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument memory-snapshots-path\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument bw-download-bandwidth\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument default-write-bps-per-cpu\n[info 240202 03:34:22 options.parseOptions(options.go:354)] Set log level to \"info\"\n[info 2024-02-02 03:34:22 options.parseOptions(options.go:331)] Use configuration file: /opt/yunion/common.conf\n[warning 2024-02-02 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument default-qemu-version\n[warning 2024-02-02 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument floppy-count\n[warning 2024-02-02 03:34:22 
structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument cdrom-count\n[warning 2024-02-02 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument disable-local-vpc\n[warning 2024-02-02 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument manage-ntp-configuration\n[warning 2024-02-02 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument no-hpet\n[warning 2024-02-02 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument disable-security-group\n[warning 2024-02-02 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument host-cpu-passthrough\n[warning 2024-02-02 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument live-migrate-cpu-throttle-max\n[info 2024-02-02 03:34:22 options.parseOptions(options.go:354)] Set log level to \"info\"\n[info 2024-02-02 03:34:22 procutils.WaitZombieLoop(zombie_others.go:36)] My pid is not 1 and no need to wait zombies\n[info 2024-02-02 03:34:22 deployserver.(*SDeployService).InitService(deployserver.go:454)] exec socket path: /var/run/onecloud/exec.sock\nfatal error: sync: unlock of unlocked mutex\n\ngoroutine 1 [running]:\nruntime.throw({0x111a4d0?, 0xc000320380?})\n\t/opt/go/src/runtime/panic.go:992 +0x71 fp=0xc000687928 sp=0xc0006878f8 pc=0x4379d1\nsync.throw({0x111a4d0?, 0xf19900?})\n\t/opt/go/src/runtime/panic.go:978 +0x1e fp=0xc000687948 sp=0xc000687928 pc=0x4656de\nsync.(*Mutex).unlockSlow(0xc00034fa70, 0xffffffff)\n\t/opt/go/src/sync/mutex.go:220 +0x3c fp=0xc000687970 sp=0xc000687948 pc=0x474c1c\nsync.(*Mutex).Unlock(...)\n\t/opt/go/src/sync/mutex.go:214\nyunion.io/x/onecloud/pkg/util/xfsutils.UnlockXfsPartition({0xc0007382b1, 0x24})\n\t/root/go/src/yunion.io/x/onecloud/pkg/util/xfsutils/lock.go:48 +0xf4 fp=0xc0006879d0 sp=0xc000687970 
pc=0xd1a8d4\nyunion.io/x/onecloud/pkg/hostman/guestfs/kvmpart.(*SKVMGuestDiskPartition).Umount.func1()\n\t/root/go/src/yunion.io/x/onecloud/pkg/hostman/guestfs/kvmpart/kvmpart.go:283 +0x45 fp=0xc0006879f0 sp=0xc0006879d0 pc=0xd27945\nyunion.io/x/onecloud/pkg/hostman/guestfs/kvmpart.(*SKVMGuestDiskPartition).Umount(0xc0003b4960)\n\t/root/go/src/yunion.io/x/onecloud/pkg/hostman/guestfs/kvmpart/kvmpart.go:303 +0x699 fp=0xc000687b88 sp=0xc0006879f0 pc=0xd277d9\nyunion.io/x/onecloud/pkg/hostman/diskutils/qemu_kvm.(*LocalDiskDriver).UmountRootfs(0xc0000bd380?, {0x12c14a0?, 0xc0000103b0?})\n\t/root/go/src/yunion.io/x/onecloud/pkg/hostman/diskutils/qemu_kvm/local_driver.go:117 +0x3b fp=0xc000687ba0 sp=0xc000687b88 pc=0xe7a7fb\nyunion.io/x/onecloud/pkg/hostman/diskutils/fsutils.DeployGuestfs.func1()\n\t/root/go/src/yunion.io/x/onecloud/pkg/hostman/diskutils/fsutils/fsutils.go:475 +0x36 fp=0xc000687bc8 sp=0xc000687ba0 pc=0xd20a16\nyunion.io/x/onecloud/pkg/hostman/diskutils/fsutils.DeployGuestfs({0x12bdd48, 0xc000346a50}, 0xc0003580f0)\n\t/root/go/src/yunion.io/x/onecloud/pkg/hostman/diskutils/fsutils/fsutils.go:485 +0x366 fp=0xc000687c90 sp=0xc000687bc8 pc=0xd20906\nyunion.io/x/onecloud/pkg/hostman/diskutils/qemu_kvm.(*LocalDiskDriver).DeployGuestfs(0xc000346a50?, 0x0?)\n\t/root/go/src/yunion.io/x/onecloud/pkg/hostman/diskutils/qemu_kvm/local_driver.go:123 +0x26 fp=0xc000687cb8 sp=0xc000687c90 pc=0xe7a866\nyunion.io/x/onecloud/pkg/hostman/diskutils.(*SKVMGuestDisk).DeployGuestfs(...)\n\t/root/go/src/yunion.io/x/onecloud/pkg/hostman/diskutils/kvm.go:144\nyunion.io/x/onecloud/pkg/hostman/hostdeployer/deployserver.(*LocalDeploy).DeployGuestFs(0xc00070e000?, 0xc0003580f0)\n\t/root/go/src/yunion.io/x/onecloud/pkg/hostman/hostdeployer/deployserver/localdeploy.go:46 +0x184 fp=0xc000687d88 sp=0xc000687cb8 pc=0xe85c84\nyunion.io/x/onecloud/pkg/hostman/hostdeployer/deployserver.StartLocalDeploy({0x7ffd24a7fcb0?, 
0x4?})\n\t/root/go/src/yunion.io/x/onecloud/pkg/hostman/hostdeployer/deployserver/localdeploy.go:130 +0x2a8 fp=0xc000687de8 sp=0xc000687d88 pc=0xe86dc8\nyunion.io/x/onecloud/pkg/hostman/hostdeployer/deployserver.(*SDeployService).RunService(0xc0002fb2a0?)\n\t/root/go/src/yunion.io/x/onecloud/pkg/hostman/hostdeployer/deployserver/deployserver.go:266 +0x5b fp=0xc000687ed0 sp=0xc000687de8 pc=0xe83cfb\nyunion.io/x/onecloud/pkg/cloudcommon/service.(*SServiceBase).StartService(0xc00000e6f0)\n\t/root/go/src/yunion.io/x/onecloud/pkg/cloudcommon/service/services.go:58 +0xfa fp=0xc000687f50 sp=0xc000687ed0 pc=0xb70e5a\nmain.main()\n\t/root/go/src/yunion.io/x/onecloud/cmd/host-deployer/main.go:28 +0xe5 fp=0xc000687f80 sp=0xc000687f50 pc=0xe87625\nruntime.main()\n\t/opt/go/src/runtime/proc.go:250 +0x212 fp=0xc000687fe0 sp=0xc000687f80 pc=0x43a0f2\nruntime.goexit()\n\t/opt/go/src/runtime/asm_amd64.s:1571 +0x1 fp=0xc000687fe8 sp=0xc000687fe0 pc=0x46aa61\n\ngoroutine 9 [syscall]:\nos/signal.signal_recv()\n\t/opt/go/src/runtime/sigqueue.go:151 +0x2f\nos/signal.loop()\n\t/opt/go/src/os/signal/signal_unix.go:23 +0x19\ncreated by os/signal.Notify.func1.1\n\t/opt/go/src/os/signal/signal.go:151 +0x2a\n\ngoroutine 10 [chan receive]:\nyunion.io/x/pkg/util/signalutils.StartTrap.func1()\n\t/root/go/src/yunion.io/x/onecloud/vendor/yunion.io/x/pkg/util/signalutils/signalutils.go:72 +0xa7\ncreated by yunion.io/x/pkg/util/signalutils.StartTrap\n\t/root/go/src/yunion.io/x/onecloud/vendor/yunion.io/x/pkg/util/signalutils/signalutils.go:62 +0xd4\n\ngoroutine 11 [chan send]:\nyunion.io/x/onecloud/pkg/hostman/guestfs/kvmpart.(*SKVMGuestDiskPartition).mount.func2.1()\n\t/root/go/src/yunion.io/x/onecloud/pkg/hostman/guestfs/kvmpart/kvmpart.go:189 +0x24b\ncreated by yunion.io/x/onecloud/pkg/hostman/guestfs/kvmpart.(*SKVMGuestDiskPartition).mount.func2\n\t/root/go/src/yunion.io/x/onecloud/pkg/hostman/guestfs/kvmpart/kvmpart.go:186 +0xe5\n\ngoroutine 28 [chan 
send]:\nyunion.io/x/onecloud/pkg/hostman/guestfs/kvmpart.(*SKVMGuestDiskPartition).mount.func2.1()\n\t/root/go/src/yunion.io/x/onecloud/pkg/hostman/guestfs/kvmpart/kvmpart.go:193 +0x238\ncreated by yunion.io/x/onecloud/pkg/hostman/guestfs/kvmpart.(*SKVMGuestDiskPartition).mount.func2\n\t/root/go/src/yunion.io/x/onecloud/pkg/hostman/guestfs/kvmpart/kvmpart.go:186 +0xe5\n\ngoroutine 30 [chan send]:\nyunion.io/x/onecloud/pkg/hostman/guestfs/kvmpart.(*SKVMGuestDiskPartition).mount.func2.1()\n\t/root/go/src/yunion.io/x/onecloud/pkg/hostman/guestfs/kvmpart/kvmpart.go:193 +0x238\ncreated by yunion.io/x/onecloud/pkg/hostman/guestfs/kvmpart.(*SKVMGuestDiskPartition).mount.func2\n\t/root/go/src/yunion.io/x/onecloud/pkg/hostman/guestfs/kvmpart/kvmpart.go:186 +0xe5\n",
"__stage__": "OnDeployGuestComplete",
"__status__": "error"
}
Creating a new VM from a new image hits the same problem:
{ "__reason__": "Deploy guest fs: request deploy guest fs: rpc error: code = Unknown desc = run deploy_guest_fs failed []: \"/opt/yunion/bin/host-deployer --common-config-file /opt/yunion/common.conf --config /opt/yunion/host.conf --deploy-action deploy_guest_fs --deploy-params '{\\\"disk_info\\\":{\\\"path\\\":\\\"rbd:nvmepool/9c380d1a-dcf8-443c-8646-9bf67a592158:mon_host=172.16.1.216\\\\\\\\;172.16.1.218\\\\\\\\;172.16.1.217:key=AQBz5pZlFX41OBAAJqPwV73/Zxc0nKEjdGb0uw\\\\\\\\=\\\\\\\\=:rados_mon_op_timeout=5:rados_osd_op_timeout=1200:client_mount_timeout=120\\\"},\\\"guest_desc\\\":{\\\"name\\\":\\\"HWSaaS\\\",\\\"uuid\\\":\\\"14333d0c-6241-4f78-8912-7834ac74d4d7\\\",\\\"domain\\\":\\\"cloud.onecloud.io\\\",\\\"nics\\\":[{\\\"mac\\\":\\\"00:22:31:ad:35:2d\\\",\\\"ip\\\":\\\"172.16.1.198\\\",\\\"net\\\":\\\"vm-static-net\\\",\\\"net_id\\\":\\\"e532366c-2ba4-4fed-895b-efd402812149\\\",\\\"gateway\\\":\\\"172.16.1.1\\\",\\\"dns\\\":\\\"172.16.1.200\\\",\\\"domain\\\":\\\"cloud.onecloud.io\\\",\\\"ifname\\\":\\\"dhcp1-dsf\\\",\\\"masklen\\\":24,\\\"driver\\\":\\\"virtio\\\",\\\"bridge\\\":\\\"br1\\\",\\\"wire_id\\\":\\\"2a4e5367-e4c5-4410-81dd-217698d99ff2\\\",\\\"vlan\\\":1,\\\"interface\\\":\\\"bond0\\\",\\\"bw\\\":1000,\\\"mtu\\\":1500}],\\\"disks\\\":[{\\\"disk_id\\\":\\\"9c380d1a-dcf8-443c-8646-9bf67a592158\\\",\\\"driver\\\":\\\"scsi\\\",\\\"cache_mode\\\":\\\"none\\\",\\\"aio_mode\\\":\\\"native\\\",\\\"size\\\":102400,\\\"template_id\\\":\\\"d4d0b10b-89d3-49c4-88c8-d3528312d5c1\\\",\\\"storage_id\\\":\\\"1b298235-a82f-4579-8b7a-e6dd2d9916d3\\\",\\\"path\\\":\\\"rbd:nvmepool/9c380d1a-dcf8-443c-8646-9bf67a592158\\\",\\\"format\\\":\\\"raw\\\"}],\\\"Hypervisor\\\":\\\"kvm\\\",\\\"hostname\\\":\\\"HWSaaS\\\"},\\\"deploy_info\\\":{\\\"public_key\\\":{\\\"admin_public_key\\\":\\\"ssh-rsa 
AAAAB3NzaC1yc2EAAAADAQABAAABAQDG7U+zsDlTXjbDWg4/C0NElAGPJ2CXrs8dh89ftJFjPbB5W9ghrVoen4UTBBm6GqXc4hl5zGVM2zL2H31n85HfYgBo47uKFEKu9c4DpSdiTBf15zBEvhNZziOJ0FEhwglZ1WRvSKDd2+3AH23WMp++btcz/ruhbib2mdUW9nwfQj783Sl+WfJ9Ss6p3RthRtolDxrpSXAIP5KH41jwYvCLPMLBndh5sz3fHuB6AfpbjYgG++pBrhf0rtemj5f1ZtgbvQ5IlYs5L1QUcctA6BbzwlRPbaNvSaM6+hjiU3g7Fm68qmT+4uNBRVKqip0hBkMBJSW8A8ZUSLIvP4G4DDXF\\\\n\\\",\\\"project_public_key\\\":\\\"ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDQxBHbbAyqBKf71sa4+xLV/9gTkZe7kIJgSyU+9ViGqfzN9B0TjBqL4pnZujHUl4Gch4EK9TGg3FtQNWTBHETRMaB4JVrjSpu4uXEYRj3EVVqJKCwwWNOoy4hj7eHmEaAFkw8CVNvBlJAPFXVXUIcPZplQQQI/Da5gUfZ8beGIlrhBWtz2Julw/5sxPiaENm2PPItiw6iZnPZ88/bZCvSHy0Cx2odZE3TJrN3H5Zob/3O09n8wCqPUrvMz9ibKb9z5iT0ANLnKtSCQW1xxIml5JlSFLEPPKFEyCdrE2mTsfPp7Gc+BUD9/KZy+8hih6gfS+dL1kK6OPOVfJLxDcjNJ\\\\n\\\"},\\\"is_init\\\":true,\\\"default_root_user\\\":true,\\\"windows_default_admin_user\\\":true,\\\"telegraf\\\":{\\\"telegraf_conf\\\":\\\"### MANAGED BY ansible-telegraf ANSIBLE ROLE ###\\\\n\\\\n[global_tags]\\\\n\\\\n vm_ip = \\\\\\\"172.16.1.198\\\\\\\"\\\\n vm_name = \\\\\\\"HWSaaS\\\\\\\"\\\\n status = \\\\\\\"start_deploy\\\\\\\"\\\\n tenant = \\\\\\\"system\\\\\\\"\\\\n brand = \\\\\\\"OneCloud\\\\\\\"\\\\n scaling_group_id = \\\\\\\"\\\\\\\"\\\\n project_domain = \\\\\\\"Default\\\\\\\"\\\\n host = \\\\\\\"node9-172-16-1-233\\\\\\\"\\\\n os_type = \\\\\\\"Linux\\\\\\\"\\\\n cloudregion = \\\\\\\"Default\\\\\\\"\\\\n region_ext_id = \\\\\\\"\\\\\\\"\\\\n vm_id = \\\\\\\"14333d0c-6241-4f78-8912-7834ac74d4d7\\\\\\\"\\\\n zone = \\\\\\\"华南-广州\\\\\\\"\\\\n zone_id = \\\\\\\"7b6ae896-1b3d-40e5-879f-cfd00799200b\\\\\\\"\\\\n cloudregion_id = \\\\\\\"default\\\\\\\"\\\\n tenant_id = \\\\\\\"2e152fe0619046a38081d7e487028358\\\\\\\"\\\\n host_id = \\\\\\\"33184fbe-77c0-4aad-8460-f3b27f8648fc\\\\\\\"\\\\n zone_ext_id = \\\\\\\"\\\\\\\"\\\\n domain_id = \\\\\\\"default\\\\\\\"\\\\n\\\\n# Configuration for telegraf agent\\\\n[agent]\\\\n interval = \\\\\\\"60s\\\\\\\"\\\\n debug = 
false\\\\n hostname = \\\\\\\"\\\\\\\"\\\\n round_interval = true\\\\n flush_interval = \\\\\\\"60s\\\\\\\"\\\\n flush_jitter = \\\\\\\"0s\\\\\\\"\\\\n collection_jitter = \\\\\\\"0s\\\\\\\"\\\\n metric_batch_size = 1000\\\\n metric_buffer_limit = 10000\\\\n quiet = false\\\\n logfile = \\\\\\\"/var/log/telegraf.log\\\\\\\"\\\\n logfile_rotation_max_size = \\\\\\\"10MB\\\\\\\"\\\\n logfile_rotation_max_archives = 1\\\\n omit_hostname = true\\\\n\\\\n###############################################################################\\\\n# OUTPUTS #\\\\n###############################################################################\\\\n\\\\n[[outputs.influxdb]]\\\\n urls = [\\\\\\\"http://169.254.169.254/monitor\\\\\\\"]\\\\n database = \\\\\\\"telegraf\\\\\\\"\\\\n insecure_skip_verify = true\\\\n\\\\n###############################################################################\\\\n# INPUTS #\\\\n###############################################################################\\\\n[[inputs.cpu]]\\\\n name_prefix = \\\\\\\"agent_\\\\\\\"\\\\n percpu = true\\\\n totalcpu = true\\\\n collect_cpu_time = false\\\\n report_active = true\\\\n[[inputs.disk]]\\\\n name_prefix = \\\\\\\"agent_\\\\\\\"\\\\n ignore_fs = [\\\\\\\"tmpfs\\\\\\\", \\\\\\\"devtmpfs\\\\\\\", \\\\\\\"overlay\\\\\\\", \\\\\\\"squashfs\\\\\\\", \\\\\\\"iso9660\\\\\\\"]\\\\n[[inputs.diskio]]\\\\n name_prefix = \\\\\\\"agent_\\\\\\\"\\\\n skip_serial_number = false\\\\n[[inputs.kernel]]\\\\n name_prefix = \\\\\\\"agent_\\\\\\\"\\\\n[[inputs.kernel_vmstat]]\\\\n name_prefix = \\\\\\\"agent_\\\\\\\"\\\\n[[inputs.mem]]\\\\n name_prefix = \\\\\\\"agent_\\\\\\\"\\\\n[[inputs.processes]]\\\\n name_prefix = \\\\\\\"agent_\\\\\\\"\\\\n[[inputs.swap]]\\\\n name_prefix = \\\\\\\"agent_\\\\\\\"\\\\n[[inputs.system]]\\\\n name_prefix = \\\\\\\"agent_\\\\\\\"\\\\n[[inputs.net]]\\\\n name_prefix = \\\\\\\"agent_\\\\\\\"\\\\n[[inputs.netstat]]\\\\n name_prefix = \\\\\\\"agent_\\\\\\\"\\\\n[[inputs.nstat]]\\\\n name_prefix = 
\\\\\\\"agent_\\\\\\\"\\\\n[[inputs.internal]]\\\\n name_prefix = \\\\\\\"agent_\\\\\\\"\\\\n collect_memstats = false\\\\n\\\"}}}'\" error: Process exited with status 2, cmd error: [info 240202 03:34:22 options.parseOptions(options.go:331)] Use configuration file: /opt/yunion/host.conf\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument sdn-enable-tc-man\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument agent-temp-path\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument max-reserved-memory\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument floppy-count\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument host-cpu-passthrough\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument default-live-migrate-downtime\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument windows-default-admin-user\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument use-boot-vga\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument servers-path\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument slots\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument auto-merge-delay-seconds\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument bandwidth-limit\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument 
default-read-iops-per-cpu\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument disable-kvm\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument enable-cpu-binding\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument local-config-file\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument tunnel-padding-bytes\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument allow-router-vms\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument default-write-iops-per-cpu\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument fetcherfs-block-size\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument migrate-expect-rate\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument check-system-services\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument dhcp-server-port\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument manage-ntp-configuration\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument allow-switch-vms\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument enable-qemu-debug-log\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument fetcherfs-path\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument 
ovn-south-database\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument sdn-enable-guest-man\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument disable-probe-kubelet\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument report-interval\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument sdn-socket-path\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument disable-security-group\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument recycle-diskfile\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument binary-memclean-path\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument min-migrate-timeout-seconds\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument snapshot-dir-suffix\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument snapshot-recycle-day\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument enable-ksm\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument local-backup-storage-path\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument enable-monitor\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument live-migrate-cpu-throttle-max\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find 
argument sdn-allow-conntrack-invalid\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument bridge-driver\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument default-request-worker-count\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument ovn-eip-bridge\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument pcie-root-port-count\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument default-read-bps-per-cpu\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument enable-openflow-controller\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument cdrom-count\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument sdn-pid-file\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument local-backup-temp-path\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument host-type\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument ovn-integration-bridge\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument enable-fallocate-disk\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument ethtool-enable-gso\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument sdn-enable-eip-man\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find 
argument auto-merge-backing-template\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument host-lease-timeout\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument dhcp-lease-time\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument no-hpet\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument recycle-diskfile-keep-days\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument tap-bridge-name\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument always-recycle-diskfile\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument disable-custom-device\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument block-io-scheduler\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument ovmf-path\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument sync-storage-info-duration-second\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument dhcp-renewal-time\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument enable-vm-uuid\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument disable-usb\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument enable-telegraf\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument 
default-image-save-format\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument disable-gpu\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument local-image-path\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument ovn-mapped-bridge\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument ovn-underlay-mtu\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument disable-set-cgroup\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument host-health-timeout\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument image-cache-path\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument ovn-encap-ip\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument agent-temp-limit\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument rack\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument sdn-enable-tap-man\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument enable-virtio-rng-device\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument max-hotplug-vcpu-count\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument restrict-qemu-img-convert-worker\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument 
disable-local-vpc\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument linux-default-root-user\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument zero-clean-disk-data\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument ping-region-interval\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument etcd-skip-tls-verify\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument set-vnc-password\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument enable-template-backing\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument etcd-use-tls\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument kubelet-run-directory\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument memory-snapshots-path\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument bw-download-bandwidth\n[warning 240202 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument default-write-bps-per-cpu\n[info 240202 03:34:22 options.parseOptions(options.go:354)] Set log level to \"info\"\n[info 2024-02-02 03:34:22 options.parseOptions(options.go:331)] Use configuration file: /opt/yunion/common.conf\n[warning 2024-02-02 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument default-qemu-version\n[warning 2024-02-02 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument floppy-count\n[warning 2024-02-02 03:34:22 
structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument cdrom-count\n[warning 2024-02-02 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument disable-local-vpc\n[warning 2024-02-02 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument manage-ntp-configuration\n[warning 2024-02-02 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument no-hpet\n[warning 2024-02-02 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument disable-security-group\n[warning 2024-02-02 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument host-cpu-passthrough\n[warning 2024-02-02 03:34:22 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1215)] Cannot find argument live-migrate-cpu-throttle-max\n[info 2024-02-02 03:34:22 options.parseOptions(options.go:354)] Set log level to \"info\"\n[info 2024-02-02 03:34:22 procutils.WaitZombieLoop(zombie_others.go:36)] My pid is not 1 and no need to wait zombies\n[info 2024-02-02 03:34:22 deployserver.(*SDeployService).InitService(deployserver.go:454)] exec socket path: /var/run/onecloud/exec.sock\nfatal error: sync: unlock of unlocked mutex\n\ngoroutine 1 [running]:\nruntime.throw({0x111a4d0?, 0xc000320380?})\n\t/opt/go/src/runtime/panic.go:992 +0x71 fp=0xc000687928 sp=0xc0006878f8 pc=0x4379d1\nsync.throw({0x111a4d0?, 0xf19900?})\n\t/opt/go/src/runtime/panic.go:978 +0x1e fp=0xc000687948 sp=0xc000687928 pc=0x4656de\nsync.(*Mutex).unlockSlow(0xc00034fa70, 0xffffffff)\n\t/opt/go/src/sync/mutex.go:220 +0x3c fp=0xc000687970 sp=0xc000687948 pc=0x474c1c\nsync.(*Mutex).Unlock(...)\n\t/opt/go/src/sync/mutex.go:214\nyunion.io/x/onecloud/pkg/util/xfsutils.UnlockXfsPartition({0xc0007382b1, 0x24})\n\t/root/go/src/yunion.io/x/onecloud/pkg/util/xfsutils/lock.go:48 +0xf4 fp=0xc0006879d0 sp=0xc000687970 
pc=0xd1a8d4\nyunion.io/x/onecloud/pkg/hostman/guestfs/kvmpart.(*SKVMGuestDiskPartition).Umount.func1()\n\t/root/go/src/yunion.io/x/onecloud/pkg/hostman/guestfs/kvmpart/kvmpart.go:283 +0x45 fp=0xc0006879f0 sp=0xc0006879d0 pc=0xd27945\nyunion.io/x/onecloud/pkg/hostman/guestfs/kvmpart.(*SKVMGuestDiskPartition).Umount(0xc0003b4960)\n\t/root/go/src/yunion.io/x/onecloud/pkg/hostman/guestfs/kvmpart/kvmpart.go:303 +0x699 fp=0xc000687b88 sp=0xc0006879f0 pc=0xd277d9\nyunion.io/x/onecloud/pkg/hostman/diskutils/qemu_kvm.(*LocalDiskDriver).UmountRootfs(0xc0000bd380?, {0x12c14a0?, 0xc0000103b0?})\n\t/root/go/src/yunion.io/x/onecloud/pkg/hostman/diskutils/qemu_kvm/local_driver.go:117 +0x3b fp=0xc000687ba0 sp=0xc000687b88 pc=0xe7a7fb\nyunion.io/x/onecloud/pkg/hostman/diskutils/fsutils.DeployGuestfs.func1()\n\t/root/go/src/yunion.io/x/onecloud/pkg/hostman/diskutils/fsutils/fsutils.go:475 +0x36 fp=0xc000687bc8 sp=0xc000687ba0 pc=0xd20a16\nyunion.io/x/onecloud/pkg/hostman/diskutils/fsutils.DeployGuestfs({0x12bdd48, 0xc000346a50}, 0xc0003580f0)\n\t/root/go/src/yunion.io/x/onecloud/pkg/hostman/diskutils/fsutils/fsutils.go:485 +0x366 fp=0xc000687c90 sp=0xc000687bc8 pc=0xd20906\nyunion.io/x/onecloud/pkg/hostman/diskutils/qemu_kvm.(*LocalDiskDriver).DeployGuestfs(0xc000346a50?, 0x0?)\n\t/root/go/src/yunion.io/x/onecloud/pkg/hostman/diskutils/qemu_kvm/local_driver.go:123 +0x26 fp=0xc000687cb8 sp=0xc000687c90 pc=0xe7a866\nyunion.io/x/onecloud/pkg/hostman/diskutils.(*SKVMGuestDisk).DeployGuestfs(...)\n\t/root/go/src/yunion.io/x/onecloud/pkg/hostman/diskutils/kvm.go:144\nyunion.io/x/onecloud/pkg/hostman/hostdeployer/deployserver.(*LocalDeploy).DeployGuestFs(0xc00070e000?, 0xc0003580f0)\n\t/root/go/src/yunion.io/x/onecloud/pkg/hostman/hostdeployer/deployserver/localdeploy.go:46 +0x184 fp=0xc000687d88 sp=0xc000687cb8 pc=0xe85c84\nyunion.io/x/onecloud/pkg/hostman/hostdeployer/deployserver.StartLocalDeploy({0x7ffd24a7fcb0?, 
0x4?})\n\t/root/go/src/yunion.io/x/onecloud/pkg/hostman/hostdeployer/deployserver/localdeploy.go:130 +0x2a8 fp=0xc000687de8 sp=0xc000687d88 pc=0xe86dc8\nyunion.io/x/onecloud/pkg/hostman/hostdeployer/deployserver.(*SDeployService).RunService(0xc0002fb2a0?)\n\t/root/go/src/yunion.io/x/onecloud/pkg/hostman/hostdeployer/deployserver/deployserver.go:266 +0x5b fp=0xc000687ed0 sp=0xc000687de8 pc=0xe83cfb\nyunion.io/x/onecloud/pkg/cloudcommon/service.(*SServiceBase).StartService(0xc00000e6f0)\n\t/root/go/src/yunion.io/x/onecloud/pkg/cloudcommon/service/services.go:58 +0xfa fp=0xc000687f50 sp=0xc000687ed0 pc=0xb70e5a\nmain.main()\n\t/root/go/src/yunion.io/x/onecloud/cmd/host-deployer/main.go:28 +0xe5 fp=0xc000687f80 sp=0xc000687f50 pc=0xe87625\nruntime.main()\n\t/opt/go/src/runtime/proc.go:250 +0x212 fp=0xc000687fe0 sp=0xc000687f80 pc=0x43a0f2\nruntime.goexit()\n\t/opt/go/src/runtime/asm_amd64.s:1571 +0x1 fp=0xc000687fe8 sp=0xc000687fe0 pc=0x46aa61\n\ngoroutine 9 [syscall]:\nos/signal.signal_recv()\n\t/opt/go/src/runtime/sigqueue.go:151 +0x2f\nos/signal.loop()\n\t/opt/go/src/os/signal/signal_unix.go:23 +0x19\ncreated by os/signal.Notify.func1.1\n\t/opt/go/src/os/signal/signal.go:151 +0x2a\n\ngoroutine 10 [chan receive]:\nyunion.io/x/pkg/util/signalutils.StartTrap.func1()\n\t/root/go/src/yunion.io/x/onecloud/vendor/yunion.io/x/pkg/util/signalutils/signalutils.go:72 +0xa7\ncreated by yunion.io/x/pkg/util/signalutils.StartTrap\n\t/root/go/src/yunion.io/x/onecloud/vendor/yunion.io/x/pkg/util/signalutils/signalutils.go:62 +0xd4\n\ngoroutine 11 [chan send]:\nyunion.io/x/onecloud/pkg/hostman/guestfs/kvmpart.(*SKVMGuestDiskPartition).mount.func2.1()\n\t/root/go/src/yunion.io/x/onecloud/pkg/hostman/guestfs/kvmpart/kvmpart.go:189 +0x24b\ncreated by yunion.io/x/onecloud/pkg/hostman/guestfs/kvmpart.(*SKVMGuestDiskPartition).mount.func2\n\t/root/go/src/yunion.io/x/onecloud/pkg/hostman/guestfs/kvmpart/kvmpart.go:186 +0xe5\n\ngoroutine 28 [chan 
send]:\nyunion.io/x/onecloud/pkg/hostman/guestfs/kvmpart.(*SKVMGuestDiskPartition).mount.func2.1()\n\t/root/go/src/yunion.io/x/onecloud/pkg/hostman/guestfs/kvmpart/kvmpart.go:193 +0x238\ncreated by yunion.io/x/onecloud/pkg/hostman/guestfs/kvmpart.(*SKVMGuestDiskPartition).mount.func2\n\t/root/go/src/yunion.io/x/onecloud/pkg/hostman/guestfs/kvmpart/kvmpart.go:186 +0xe5\n\ngoroutine 30 [chan send]:\nyunion.io/x/onecloud/pkg/hostman/guestfs/kvmpart.(*SKVMGuestDiskPartition).mount.func2.1()\n\t/root/go/src/yunion.io/x/onecloud/pkg/hostman/guestfs/kvmpart/kvmpart.go:193 +0x238\ncreated by yunion.io/x/onecloud/pkg/hostman/guestfs/kvmpart.(*SKVMGuestDiskPartition).mount.func2\n\t/root/go/src/yunion.io/x/onecloud/pkg/hostman/guestfs/kvmpart/kvmpart.go:186 +0xe5\n", "__stage__": "OnDeployGuestComplete", "__status__": "error" }
@chenjacken We have indeed found a problem with deploying XFS images. We will post here once it is fixed.
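For readers following the trace: it ends in `fatal error: sync: unlock of unlocked mutex`, raised from `xfsutils.UnlockXfsPartition` while `Umount` was running. The Go runtime treats this condition as a fatal error rather than a panic, so `recover` cannot catch it and the process exits, which matches the `Process exited with status 2` in the payload. Below is a minimal sketch of a per-key lock whose release returns an error instead of crashing when there was no matching acquire. The names `keyedLocks`, `ErrNotLocked`, and the placeholder UUID are hypothetical and chosen for illustration only; this is not the onecloud code, nor necessarily how the fix will be implemented.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ErrNotLocked is returned when a key is released without a matching acquire.
var ErrNotLocked = errors.New("partition lock not held")

// keyedLocks is a hypothetical registry holding one lock per key
// (for example, a partition UUID). It is NOT the onecloud xfsutils code.
type keyedLocks struct {
	mu    sync.Mutex // guards the slots map
	slots map[string]chan struct{}
}

func newKeyedLocks() *keyedLocks {
	return &keyedLocks{slots: make(map[string]chan struct{})}
}

// slot returns the one-element channel for key, creating it on first use.
// A token sitting in the channel means the key is currently held.
func (k *keyedLocks) slot(key string) chan struct{} {
	k.mu.Lock()
	defer k.mu.Unlock()
	s, ok := k.slots[key]
	if !ok {
		s = make(chan struct{}, 1)
		k.slots[key] = s
	}
	return s
}

// Lock blocks until the key is free, then marks it as held.
func (k *keyedLocks) Lock(key string) {
	k.slot(key) <- struct{}{}
}

// Unlock releases the key. If the key was not held, it reports
// ErrNotLocked instead of triggering an unrecoverable runtime error.
func (k *keyedLocks) Unlock(key string) error {
	select {
	case <-k.slot(key):
		return nil
	default:
		return ErrNotLocked
	}
}

func main() {
	locks := newKeyedLocks()
	const uuid = "example-partition-uuid" // placeholder, not taken from the logs

	locks.Lock(uuid)
	fmt.Println("paired release:", locks.Unlock(uuid))
	fmt.Println("unpaired release:", locks.Unlock(uuid))
}
```

With a plain `sync.Mutex`, the second release would abort the whole process. Here the caller gets an error it can log and continue past, which is usually preferable in cleanup paths such as an unmount.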
OK, thanks. Does v3.10.12 have this problem too?
@chenjacken Yes, this problem was only discovered recently.
I found that the corresponding host pod reports the following errors:
[warning 2024-02-02 06:41:49 appsrv.do_worker_watchdog(workers_watchdog.go:64)] WorkerManager RequestWorker has been busy for 249 cycles...
[warning 2024-02-02 06:42:19 appsrv.do_worker_watchdog(workers_watchdog.go:64)] WorkerManager RequestWorker has been busy for 250 cycles...
[warning 2024-02-02 06:42:49 appsrv.do_worker_watchdog(workers_watchdog.go:64)] WorkerManager RequestWorker has been busy for 251 cycles...
[warning 2024-02-02 06:43:19 appsrv.do_worker_watchdog(workers_watchdog.go:64)] WorkerManager RequestWorker has been busy for 252 cycles...
[warning 2024-02-02 06:43:49 appsrv.do_worker_watchdog(workers_watchdog.go:64)] WorkerManager RequestWorker has been busy for 253 cycles...
[error 2024-02-02 06:44:06 storageman.(*SRbdStorage).SaveToGlance(storage_rbd.go:485)] Save to glance failed: {"error":{"class":"TimeoutError","code":504,"details":"request process timeout","request":{"headers":{"Content-Length":"18977849344","Content-Type":"application/octet-stream","User-Agent":"yunioncloud-go/201708","X-Auth-Token":"*","X-Image-Meta-Image_id":"66d45f68-eb72-4e4e-89a2-938b2904a203","X-Yunion-Parent-Id":"","X-Yunion-Peer-Service-Name":"host","X-Yunion-Remote-Addr":"default-glance:30292","X-Yunion-Span-Id":"0","X-Yunion-Span-Name":"","X-Yunion-Strace-Debug":"true","X-Yunion-Strace-Id":"e368273d"},"method":"PUT","url":"https://default-glance:30292/v1/images/66d45f68-eb72-4e4e-89a2-938b2904a203"}}}
[info 2024-02-02 06:44:07 hostdhcp.(*SGuestDHCPServer).serveDHCPInternal(dhcpserver.go:278)] Make DHCP Reply 172.16.1.195 TO 00:22:f0:d0:47:c8
[error 2024-02-02 06:44:08 storageman.(*SRbdStorage).SaveToGlance(storage_rbd.go:492)] Fail to remote cache image: {"error":{"class":"UnclassifiedError","code":500,"details":"sql: no rows in result set","request":{"body":"{\"storagecachedimage\":{\"path\":\"rbd:nvmepool/image_cache_66d45f68-eb72-4e4e-89a2-938b2904a203:mon_hos...=120\",\"status\":\"active\"}}","headers":{"Content-Length":"288","Content-Type":"application/json","User-Agent":"yunioncloud-go/201708","X-Auth-Token":"*","X-Yunion-Parent-Id":"","X-Yunion-Peer-Service-Name":"host","X-Yunion-Remote-Addr":"default-region:30888","X-Yunion-Span-Id":"0","X-Yunion-Span-Name":"","X-Yunion-Strace-Debug":"true","X-Yunion-Strace-Id":"721570cd"},"method":"PUT","url":"https://default-region:30888/storagecaches/ef9df16d-2111-44c2-8989-72a35b8fa0d6/cachedimages/66d45f68-eb72-4e4e-89a2-938b2904a203?auto_create=true"}}}
[info 2024-02-02 06:44:08 workmanager.(*workerTask).Run(manager.go:95)] DelayTask complete: <nil>
[info 2024-02-02 06:44:08 modules.TaskComplete(task.go:34)] Sync task 6c52864a-8bb7-4b75-88dc-086bef358237 complete succ
[error 2024-02-02 06:46:40 storageman.GatherHostStorageStats(core.go:454)] sync storage Ceph-SSD size failed: GetCapacity: output: stderr "": signal: killed
One more question: when installing Cloudbase-Init, should the metadata address be configured as follows?
metadata_services=cloudbaseinit.metadata.services.configdrive.ConfigDriveService,cloudbaseinit.metadata.services.ec2service.EC2Service
metadata_base_url=http://169.254.169.254/
ec2_metadata_base_url=http://169.254.169.254/
Cloudbase-Init log:
2024-02-02 11:00:50.959 4664 ERROR cloudbaseinit.metadata.services.base File "C:\Program Files\Cloudbase Solutions\Cloudbase-Init\Python\lib\site-packages\requests\sessions.py", line 701, in send
2024-02-02 11:00:50.959 4664 ERROR cloudbaseinit.metadata.services.base r = adapter.send(request, **kwargs)
2024-02-02 11:00:50.959 4664 ERROR cloudbaseinit.metadata.services.base File "C:\Program Files\Cloudbase Solutions\Cloudbase-Init\Python\lib\site-packages\requests\adapters.py", line 553, in send
2024-02-02 11:00:50.959 4664 ERROR cloudbaseinit.metadata.services.base raise ConnectTimeout(e, request=request)
2024-02-02 11:00:50.959 4664 ERROR cloudbaseinit.metadata.services.base requests.exceptions.ConnectTimeout: HTTPConnectionPool(host='169.254.169.254', port=80): Max retries exceeded with url: /2009-04-04/meta-data/local-hostname (Caused by ConnectTimeoutError(<urllib3.connection.HTTPConnection object at 0x0000018A857E06D0>, 'Connection to 169.254.169.254 timed out. (connect timeout=None)'))
2024-02-02 11:00:50.959 4664 ERROR cloudbaseinit.metadata.services.base
2024-02-02 11:00:50.963 4664 ERROR cloudbaseinit.metadata.services.ec2service [-] HTTPConnectionPool(host='169.254.169.254', port=80): Max retries exceeded with url: /2009-04-04/meta-data/local-hostname (Caused by ConnectTimeoutError(<urllib3.connection.HTTPConnection object at 0x0000018A857E06D0>, 'Connection to 169.254.169.254 timed out. (connect timeout=None)')): requests.exceptions.ConnectTimeout: HTTPConnectionPool(host='169.254.169.254', port=80): Max retries exceeded with url: /2009-04-04/meta-data/local-hostname (Caused by ConnectTimeoutError(<urllib3.connection.HTTPConnection object at 0x0000018A857E06D0>, 'Connection to 169.254.169.254 timed out. (connect timeout=None)'))
2024-02-02 11:00:50.963 4664 ERROR cloudbaseinit.metadata.services.ec2service Traceback (most recent call last):
2024-02-02 11:00:50.963 4664 ERROR cloudbaseinit.metadata.services.ec2service File "C:\Program Files\Cloudbase Solutions\Cloudbase-Init\Python\lib\site-packages\urllib3\connection.py", line 174, in _new_conn
2024-02-02 11:00:50.963 4664 ERROR cloudbaseinit.metadata.services.ec2service conn = connection.create_connection(
2024-02-02 11:00:50.963 4664 ERROR cloudbaseinit.metadata.services.ec2service File "C:\Program Files\Cloudbase Solutions\Cloudbase-Init\Python\lib\site-packages\urllib3\util\connection.py", line 95, in create_connection
2024-02-02 11:00:50.963 4664 ERROR cloudbaseinit.metadata.services.ec2service raise err
2024-02-02 11:00:50.963 4664 ERROR cloudbaseinit.metadata.services.ec2service File "C:\Program Files\Cloudbase Solutions\Cloudbase-Init\Python\lib\site-packages\urllib3\util\connection.py", line 85, in create_connection
2024-02-02 11:00:50.963 4664 ERROR cloudbaseinit.metadata.services.ec2service sock.connect(sa)
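2024-02-02 11:00:50.963 4664 ERROR cloudbaseinit.metadata.services.ec2service TimeoutError: [WinError 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.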
2024-02-02 11:00:50.963 4664 ERROR cloudbaseinit.metadata.services.ec2service
2024-02-02 11:00:50.963 4664 ERROR cloudbaseinit.metadata.services.ec2service During handling of the above exception, another exception occurred:
...
2024-02-02 11:00:50.963 4664 ERROR cloudbaseinit.metadata.services.ec2service requests.exceptions.ConnectTimeout: HTTPConnectionPool(host='169.254.169.254', port=80): Max retries exceeded with url: /2009-04-04/meta-data/local-hostname (Caused by ConnectTimeoutError(<urllib3.connection.HTTPConnection object at 0x0000018A857E06D0>, 'Connection to 169.254.169.254 timed out. (connect timeout=None)'))
2024-02-02 11:00:50.963 4664 ERROR cloudbaseinit.metadata.services.ec2service
2024-02-02 11:00:50.966 4664 DEBUG cloudbaseinit.metadata.services.ec2service [-] Metadata not found at URL 'http://169.254.169.254/' load C:\Program Files\Cloudbase Solutions\Cloudbase-Init\Python\lib\site-packages\cloudbaseinit\metadata\services\ec2service.py:46
2024-02-02 11:00:50.967 4664 ERROR cloudbaseinit.init [-] No metadata service found: cloudbaseinit.exception.MetadataNotFoundException: No available service found
@chenjacken Yes, that is the correct address. Is the VM's network reachable? Is it on a VPC network or a classic network?
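One way to check reachability from inside the guest is to probe the metadata URL with a short timeout. The following is a minimal sketch using only the Python standard library. The helper name `metadata_reachable` and the 5-second default timeout are illustrative assumptions and are not part of Cloudbase-Init; the path is the one that appears in the traceback above.

```python
import urllib.error
import urllib.request


def metadata_reachable(base_url, path="2009-04-04/meta-data/local-hostname", timeout=5.0):
    """Return True if an HTTP server answers at base_url/path within `timeout` seconds.

    Any HTTP response (even an error status) counts as reachable, because the
    failure in the log above is a connect timeout, not an HTTP error.
    """
    url = base_url.rstrip("/") + "/" + path
    # Bypass any configured proxy: a link-local metadata address must be reached directly.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered with an error status, so the network path works.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, timed out, or no route to the host.
        return False


# Example call from inside the VM:
#   metadata_reachable("http://169.254.169.254/")
```

If this returns `False` inside the VM, the problem is more likely in the network path to 169.254.169.254 than in the Cloudbase-Init configuration.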
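The VM status is unknown, and the status of its disk is unknown as well.
I also noticed that my VM runs on node7, but the error message for the disk's unknown status points to node9 (IP 172.16.1.233):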
{
"__reason__": "{\"error\":{\"class\":\"ClientError\",\"code\":499,\"details\":\"Get \\\"https://172.16.1.233:8885/disks/1b298235-a82f-4579-8b7a-e6dd2d9916d3/6f4b9da3-977c-45a1-8a78-e605d87b8adf/status\\\": context deadline exceeded (Client.Timeout exceeded while awaiting headers)\",\"request\":{\"headers\":{\"User-Agent\":\"yunioncloud-go/201708\",\"X-Auth-Token\":\"*\",\"X-Region-Version\":\"v2\",\"X-Request-Id\":\"180688-8c5614\",\"X-Task-Id\":\"38f00d85-6aa1-494e-8d4d-bc5b3c41f451\",\"X-Task-Notify-Url\":\"https://default-region:30888/tasks/38f00d85-6aa1-494e-8d4d-bc5b3c41f451\",\"X-Yunion-Parent-Id\":\"0.0\",\"X-Yunion-Peer-Service-Name\":\"compute_v2\",\"X-Yunion-Remote-Addr\":\"172.16.1.233:8885\",\"X-Yunion-Span-Id\":\"0.0.0\",\"X-Yunion-Span-Name\":\"\",\"X-Yunion-Strace-Debug\":\"true\",\"X-Yunion-Strace-Id\":\"d5e74afa\"},\"method\":\"GET\",\"url\":\"https://172.16.1.233:8885/disks/1b298235-a82f-4579-8b7a-e6dd2d9916d3/6f4b9da3-977c-45a1-8a78-e605d87b8adf/status\"}}}",
"__stage__": "OnDiskSyncStatusComplete",
"__status__": "ERROR"
}
Below are the pod logs from node9.
The default-host log looks like this:
[error 2024-02-04 01:43:39 storageman.GatherHostStorageStats(core.go:454)] sync storage Ceph-SSD size failed: GetCapacity: output: stderr "": signal: killed
[info 2024-02-04 01:43:53 appsrv.(*Application).ServeHTTP(appsrv.go:288)] 1Dy_ML5U1CU3_77NQE_4CAIeFiA= 200 bab40c-68a47f-d9bc6b GET /disks/1b298235-a82f-4579-8b7a-e6dd2d9916d3/6f4b9da3-977c-45a1-8a78-e605d87b8adf/status (172.16.1.213:54833:compute_v2) 15081.15ms
[error 2024-02-04 01:43:54 storageman.GatherHostStorageStats(core.go:454)] sync storage Ceph-HDD size failed: GetCapacity: output: stderr "": signal: killed
[error 2024-02-04 01:45:09 storageman.GatherHostStorageStats(core.go:454)] sync storage Ceph-SSD size failed: GetCapacity: output: stderr "": signal: killed
[error 2024-02-04 01:45:24 storageman.GatherHostStorageStats(core.go:454)] sync storage Ceph-HDD size failed: GetCapacity: output: stderr "": signal: killed
[error 2024-02-04 01:46:39 storageman.GatherHostStorageStats(core.go:454)] sync storage Ceph-SSD size failed: GetCapacity: output: stderr "": signal: killed
[info 2024-02-04 01:46:43 appsrv.(*Application).ServeHTTP(appsrv.go:288)] 1Dy_ML5U1CU3_77NQE_4CAIeFiA= 200 70ba9b-5394a3-abe9bb GET /disks/1b298235-a82f-4579-8b7a-e6dd2d9916d3/6f4b9da3-977c-45a1-8a78-e605d87b8adf/status (172.16.1.213:5294:compute_v2) 15006.18ms
[error 2024-02-04 01:46:54 storageman.GatherHostStorageStats(core.go:454)] sync storage Ceph-HDD size failed: GetCapacity: output: stderr "": signal: killed
[error 2024-02-04 01:48:09 storageman.GatherHostStorageStats(core.go:454)] sync storage Ceph-SSD size failed: GetCapacity: output: stderr "": signal: killed
[error 2024-02-04 01:48:24 storageman.GatherHostStorageStats(core.go:454)] sync storage Ceph-HDD size failed: GetCapacity: output: stderr "": signal: killed
[error 2024-02-04 01:49:39 storageman.GatherHostStorageStats(core.go:454)] sync storage Ceph-SSD size failed: GetCapacity: output: stderr "": signal: killed
[error 2024-02-04 01:49:54 storageman.GatherHostStorageStats(core.go:454)] sync storage Ceph-HDD size failed: GetCapacity: output: stderr "": signal: killed
[info 2024-02-04 01:50:33 appsrv.(*Application).ServeHTTP(appsrv.go:288)] 1Dy_ML5U1CU3_77NQE_4CAIeFiA= 200 f2f6ec-ad3d15-1fdca6 GET /disks/1b298235-a82f-4579-8b7a-e6dd2d9916d3/37cb2e83-eba2-4f52-86cb-722849296034/status (172.16.1.213:15363:compute_v2) 15005.53ms
[info 2024-02-04 01:50:48 appsrv.(*Application).ServeHTTP(appsrv.go:288)] 1Dy_ML5U1CU3_77NQE_4CAIeFiA= 200 401181-fe7f38-3f1291 GET /disks/1b298235-a82f-4579-8b7a-e6dd2d9916d3/7a4be1cf-3a81-47b4-80e2-02378388d2ba/status (172.16.1.213:8115:compute_v2) 27147.93ms
[error 2024-02-04 01:51:09 storageman.GatherHostStorageStats(core.go:454)] sync storage Ceph-SSD size failed: GetCapacity: output: stderr "": signal: killed
[error 2024-02-04 01:51:24 storageman.GatherHostStorageStats(core.go:454)] sync storage Ceph-HDD size failed: GetCapacity: output: stderr "": signal: killed
[error 2024-02-04 01:52:39 storageman.GatherHostStorageStats(core.go:454)] sync storage Ceph-SSD size failed: GetCapacity: output: stderr "": signal: killed
[error 2024-02-04 01:52:54 storageman.GatherHostStorageStats(core.go:454)] sync storage Ceph-HDD size failed: GetCapacity: output: stderr "": signal: killed
[info 2024-02-04 01:53:29 appsrv.(*Application).ServeHTTP(appsrv.go:288)] 1Dy_ML5U1CU3_77NQE_4CAIeFiA= 200 f70c7f-fe8d60-3594ca GET /disks/1b298235-a82f-4579-8b7a-e6dd2d9916d3/37cb2e83-eba2-4f52-86cb-722849296034/status (172.16.1.213:21241:compute_v2) 15005.69ms
[info 2024-02-04 01:53:44 appsrv.(*Application).ServeHTTP(appsrv.go:288)] 1Dy_ML5U1CU3_77NQE_4CAIeFiA= 200 3d7dd8-ddf644-a1639f GET /disks/1b298235-a82f-4579-8b7a-e6dd2d9916d3/7a4be1cf-3a81-47b4-80e2-02378388d2ba/status (172.16.1.213:59919:compute_v2) 25330.46ms
[error 2024-02-04 01:54:10 storageman.GatherHostStorageStats(core.go:454)] sync storage Ceph-SSD size failed: GetCapacity: output: stderr "": signal: killed
[error 2024-02-04 01:54:25 storageman.GatherHostStorageStats(core.go:454)] sync storage Ceph-HDD size failed: GetCapacity: output: stderr "": signal: killed
[info 2024-02-04 01:54:59 appsrv.(*Application).ServeHTTP(appsrv.go:288)] 1Dy_ML5U1CU3_77NQE_4CAIeFiA= 200 5fc5c5-8f01cf-2db3c9 GET /disks/1b298235-a82f-4579-8b7a-e6dd2d9916d3/37cb2e83-eba2-4f52-86cb-722849296034/status (172.16.1.213:23987:compute_v2) 15005.94ms
[error 2024-02-04 01:55:40 storageman.GatherHostStorageStats(core.go:454)] sync storage Ceph-SSD size failed: GetCapacity: output: stderr "": signal: killed
[error 2024-02-04 01:55:55 storageman.GatherHostStorageStats(core.go:454)] sync storage Ceph-HDD size failed: GetCapacity: output: stderr "": signal: killed
[error 2024-02-04 01:57:10 storageman.GatherHostStorageStats(core.go:454)] sync storage Ceph-SSD size failed: GetCapacity: output: stderr "": signal: killed
[info 2024-02-04 01:57:20 appsrv.(*Application).ServeHTTP(appsrv.go:288)] 1Dy_ML5U1CU3_77NQE_4CAIeFiA= 200 180688-8c5614-b45cf9 GET /disks/1b298235-a82f-4579-8b7a-e6dd2d9916d3/6f4b9da3-977c-45a1-8a78-e605d87b8adf/status (172.16.1.213:37811:compute_v2) 15006.26ms
[error 2024-02-04 01:57:25 storageman.GatherHostStorageStats(core.go:454)] sync storage Ceph-HDD size failed: GetCapacity: output: stderr "": signal: killed
[error 2024-02-04 01:58:40 storageman.GatherHostStorageStats(core.go:454)] sync storage Ceph-SSD size failed: GetCapacity: output: stderr "": signal: killed
[error 2024-02-04 01:58:55 storageman.GatherHostStorageStats(core.go:454)] sync storage Ceph-HDD size failed: GetCapacity: output: stderr "": signal: killed
[error 2024-02-04 02:00:10 storageman.GatherHostStorageStats(core.go:454)] sync storage Ceph-SSD size failed: GetCapacity: output: stderr "": signal: killed
[error 2024-02-04 02:00:25 storageman.GatherHostStorageStats(core.go:454)] sync storage Ceph-HDD size failed: GetCapacity: output: stderr "": signal: killed
default-host-deployer log:
[info 2024-02-03 09:02:30 fsutils.MountRootfs(fsutils.go:437)] detect partition /dev/sda
[error 2024-02-03 09:02:30 kvmpart.(*SKVMGuestDiskPartition).Mount(kvmpart.go:115)] Mount fs failed: unsupport fs on /dev/sda
[info 2024-02-03 09:02:30 fsutils.MountRootfs(fsutils.go:437)] detect partition /dev/sda1
[info 2024-02-03 09:02:30 xfsutils.LockXfsPartition(lock.go:24)] xfs lock f956f023-a2a0-4fd5-8e89-0b44b0848ab3
[info 2024-02-03 09:02:30 guestfs.IsPartitionReadonly(core.go:219)] File system /tmp/_dev_sda1 is not readonly
[info 2024-02-03 09:02:30 kvmpart.(*SKVMGuestDiskPartition).Mount(kvmpart.go:139)] mount fs xfs on /dev/sda1 successfully
[info 2024-02-03 09:02:30 fsutils.MountRootfs(fsutils.go:445)] Use rootfs CentosRootFs, partition /dev/sda1
[info 2024-02-03 09:02:30 kvmpart.(*SKVMGuestDiskPartition).Umount(kvmpart.go:296)] umount /dev/sda1: /tmp/_dev_sda1
[info 2024-02-03 09:02:30 kvmpart.(*SKVMGuestDiskPartition).Umount(kvmpart.go:302)] umount /dev/sda1 successfully
[info 2024-02-03 09:02:30 xfsutils.UnlockXfsPartition(lock.go:43)] xfs unlock f956f023-a2a0-4fd5-8e89-0b44b0848ab3
[info 2024-02-03 09:02:30 fsutils.ResizeDiskFs(fsutils.go:142)] Parts: [[1 true 2048 16777215 16775168 primary xfs /dev/sda1]] label: msdos
[info 2024-02-03 09:02:30 fsutils.ResizeDiskFs(fsutils.go:207)] resize disk partition: [parted -a none -s /dev/sda -- resizepart 1 104857599s]
[error 2024-02-03 09:02:31 fsutils.FsckXfsFs(fsutils.go:327)] xfs_check failed: exec: "xfs_check": executable file not found in $PATH, , try xfs_repair -n <dev> instead
[info 2024-02-03 09:02:32 xfsutils.LockXfsPartition(lock.go:24)] xfs lock f956f023-a2a0-4fd5-8e89-0b44b0848ab3
[info 2024-02-03 09:02:38 xfsutils.UnlockXfsPartition(lock.go:43)] xfs unlock f956f023-a2a0-4fd5-8e89-0b44b0848ab3
[info 2024-02-03 09:02:38 monitor.(*HmpMonitor).write(hmp.go:125)] HMP Write : quit
[info 2024-02-03 09:02:38 qemu_kvm.(*QemuBaseDriver).CleanGuest(driver.go:557)] kill process kill: cannot find process "12552
"
exit status 1
[info 2024-02-03 09:02:38 qemu_kvm.(*QemuDeployManager).Release(driver.go:121)] release QemuDeployManager
[info 2024-02-03 09:02:38 monitor.(*HmpMonitor).read(hmp.go:79)] HMP Read : quit
[info 2024-02-03 09:02:38 monitor.(*HmpMonitor).read(hmp.go:91)] Scan over ...
[error 2024-02-03 09:02:38 qemu_kvm.(*QemuX86Driver).StartGuest.func2(driver.go:655)] monitor disconnect %!s(<nil>)
default-host-health log:
[error 2024-02-02 09:03:27 host_health.(*SHostHealthManager).Reconnect(health_manager.go:203)] restart session failed context deadline exceeded
{"level":"warn","ts":"2024-02-02T09:03:37.447Z","logger":"etcd-client","caller":"v3/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc0003dfdc0/#initially=[http://default-etcd-client.onecloud.svc:2379]","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
[error 2024-02-02 09:03:37 host_health.(*SHostHealthManager).Reconnect(health_manager.go:203)] restart session failed context deadline exceeded
{"level":"warn","ts":"2024-02-02T09:03:47.447Z","logger":"etcd-client","caller":"v3/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc0003dfdc0/#initially=[http://default-etcd-client.onecloud.svc:2379]","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
[error 2024-02-02 09:03:47 host_health.(*SHostHealthManager).Reconnect(health_manager.go:203)] restart session failed context deadline exceeded
{"level":"warn","ts":"2024-02-02T09:03:57.448Z","logger":"etcd-client","caller":"v3/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc0003dfdc0/#initially=[http://default-etcd-client.onecloud.svc:2379]","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
[error 2024-02-02 09:03:57 host_health.(*SHostHealthManager).Reconnect(health_manager.go:203)] restart session failed context deadline exceeded
{"level":"warn","ts":"2024-02-02T09:04:07.449Z","logger":"etcd-client","caller":"v3/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc0003dfdc0/#initially=[http://default-etcd-client.onecloud.svc:2379]","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
[error 2024-02-02 09:04:07 host_health.(*SHostHealthManager).Reconnect(health_manager.go:203)] restart session failed context deadline exceeded
{"level":"warn","ts":"2024-02-02T09:04:17.450Z","logger":"etcd-client","caller":"v3/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc0003dfdc0/#initially=[http://default-etcd-client.onecloud.svc:2379]","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
[error 2024-02-02 09:04:17 host_health.(*SHostHealthManager).Reconnect(health_manager.go:203)] restart session failed context deadline exceeded
{"level":"warn","ts":"2024-02-02T09:04:27.450Z","logger":"etcd-client","caller":"v3/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc0003dfdc0/#initially=[http://default-etcd-client.onecloud.svc:2379]","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
[error 2024-02-02 09:04:27 host_health.(*SHostHealthManager).Reconnect(health_manager.go:203)] restart session failed context deadline exceeded
{"level":"warn","ts":"2024-02-02T09:04:37.451Z","logger":"etcd-client","caller":"v3/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc0003dfdc0/#initially=[http://default-etcd-client.onecloud.svc:2379]","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
[error 2024-02-02 09:04:37 host_health.(*SHostHealthManager).Reconnect(health_manager.go:203)] restart session failed context deadline exceeded
{"level":"warn","ts":"2024-02-02T09:04:47.452Z","logger":"etcd-client","caller":"v3/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc0003dfdc0/#initially=[http://default-etcd-client.onecloud.svc:2379]","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
[error 2024-02-02 09:04:47 host_health.(*SHostHealthManager).Reconnect(health_manager.go:203)] restart session failed context deadline exceeded
{"level":"warn","ts":"2024-02-02T09:04:57.453Z","logger":"etcd-client","caller":"v3/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc0003dfdc0/#initially=[http://default-etcd-client.onecloud.svc:2379]","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
[error 2024-02-02 09:04:57 host_health.(*SHostHealthManager).Reconnect(health_manager.go:203)] restart session failed context deadline exceeded
{"level":"warn","ts":"2024-02-02T09:05:07.454Z","logger":"etcd-client","caller":"v3/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc0003dfdc0/#initially=[http://default-etcd-client.onecloud.svc:2379]","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
[error 2024-02-02 09:05:07 host_health.(*SHostHealthManager).Reconnect(health_manager.go:203)] restart session failed context deadline exceeded
{"level":"warn","ts":"2024-02-02T09:05:17.454Z","logger":"etcd-client","caller":"v3/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc0003dfdc0/#initially=[http://default-etcd-client.onecloud.svc:2379]","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
[error 2024-02-02 09:05:17 host_health.(*SHostHealthManager).Reconnect(health_manager.go:203)] restart session failed context deadline exceeded
{"level":"warn","ts":"2024-02-02T09:05:27.454Z","logger":"etcd-client","caller":"v3/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc0003dfdc0/#initially=[http://default-etcd-client.onecloud.svc:2379]","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
[error 2024-02-02 09:05:27 host_health.(*SHostHealthManager).Reconnect(health_manager.go:203)] restart session failed context deadline exceeded
{"level":"warn","ts":"2024-02-02T09:05:37.455Z","logger":"etcd-client","caller":"v3/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc0003dfdc0/#initially=[http://default-etcd-client.onecloud.svc:2379]","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
[error 2024-02-02 09:05:37 host_health.(*SHostHealthManager).Reconnect(health_manager.go:203)] restart session failed context deadline exceeded
{"level":"warn","ts":"2024-02-02T09:05:47.455Z","logger":"etcd-client","caller":"v3/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc0003dfdc0/#initially=[http://default-etcd-client.onecloud.svc:2379]","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
[error 2024-02-02 09:05:47 host_health.(*SHostHealthManager).Reconnect(health_manager.go:203)] restart session failed context deadline exceeded
{"level":"warn","ts":"2024-02-02T09:05:57.456Z","logger":"etcd-client","caller":"v3/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc0003dfdc0/#initially=[http://default-etcd-client.onecloud.svc:2379]","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
[error 2024-02-02 09:05:57 host_health.(*SHostHealthManager).Reconnect(health_manager.go:203)] restart session failed context deadline exceeded
{"level":"warn","ts":"2024-02-02T09:06:07.456Z","logger":"etcd-client","caller":"v3/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc0003dfdc0/#initially=[http://default-etcd-client.onecloud.svc:2379]","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
[error 2024-02-02 09:06:07 host_health.(*SHostHealthManager).Reconnect(health_manager.go:203)] restart session failed context deadline exceeded
{"level":"warn","ts":"2024-02-02T09:06:17.457Z","logger":"etcd-client","caller":"v3/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc0003dfdc0/#initially=[http://default-etcd-client.onecloud.svc:2379]","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
[error 2024-02-02 09:06:17 host_health.(*SHostHealthManager).Reconnect(health_manager.go:203)] restart session failed context deadline exceeded
{"level":"warn","ts":"2024-02-02T09:06:27.458Z","logger":"etcd-client","caller":"v3/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc0003dfdc0/#initially=[http://default-etcd-client.onecloud.svc:2379]","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
[error 2024-02-02 09:06:27 host_health.(*SHostHealthManager).Reconnect(health_manager.go:203)] restart session failed context deadline exceeded
{"level":"warn","ts":"2024-02-02T09:06:37.460Z","logger":"etcd-client","caller":"v3/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc0003dfdc0/#initially=[http://default-etcd-client.onecloud.svc:2379]","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
[error 2024-02-02 09:06:37 host_health.(*SHostHealthManager).Reconnect(health_manager.go:203)] restart session failed context deadline exceeded
{"level":"warn","ts":"2024-02-02T09:06:47.460Z","logger":"etcd-client","caller":"v3/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc0003dfdc0/#initially=[http://default-etcd-client.onecloud.svc:2379]","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
[error 2024-02-02 09:06:47 host_health.(*SHostHealthManager).Reconnect(health_manager.go:203)] restart session failed context deadline exceeded
{"level":"warn","ts":"2024-02-02T09:06:57.462Z","logger":"etcd-client","caller":"v3/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc0003dfdc0/#initially=[http://default-etcd-client.onecloud.svc:2379]","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
[error 2024-02-02 09:06:57 host_health.(*SHostHealthManager).Reconnect(health_manager.go:203)] restart session failed context deadline exceeded
{"level":"warn","ts":"2024-02-02T09:07:07.464Z","logger":"etcd-client","caller":"v3/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc0003dfdc0/#initially=[http://default-etcd-client.onecloud.svc:2379]","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
[error 2024-02-02 09:07:07 host_health.(*SHostHealthManager).Reconnect(health_manager.go:203)] restart session failed context deadline exceeded
{"level":"warn","ts":"2024-02-02T09:07:17.465Z","logger":"etcd-client","caller":"v3/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc0003dfdc0/#initially=[http://default-etcd-client.onecloud.svc:2379]","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
[error 2024-02-02 09:07:17 host_health.(*SHostHealthManager).Reconnect(health_manager.go:203)] restart session failed context deadline exceeded
{"level":"warn","ts":"2024-02-02T09:07:27.465Z","logger":"etcd-client","caller":"v3/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc0003dfdc0/#initially=[http://default-etcd-client.onecloud.svc:2379]","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
[error 2024-02-02 09:07:27 host_health.(*SHostHealthManager).Reconnect(health_manager.go:203)] restart session failed context deadline exceeded
{"level":"warn","ts":"2024-02-02T09:07:37.466Z","logger":"etcd-client","caller":"v3/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc0003dfdc0/#initially=[http://default-etcd-client.onecloud.svc:2379]","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
[error 2024-02-02 09:07:37 host_health.(*SHostHealthManager).Reconnect(health_manager.go:203)] restart session failed context deadline exceeded
{"level":"warn","ts":"2024-02-02T09:07:47.467Z","logger":"etcd-client","caller":"v3/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc0003dfdc0/#initially=[http://default-etcd-client.onecloud.svc:2379]","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
[error 2024-02-02 09:07:47 host_health.(*SHostHealthManager).Reconnect(health_manager.go:203)] restart session failed context deadline exceeded
{"level":"warn","ts":"2024-02-02T09:07:57.468Z","logger":"etcd-client","caller":"v3/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc0003dfdc0/#initially=[http://default-etcd-client.onecloud.svc:2379]","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
[error 2024-02-02 09:07:57 host_health.(*SHostHealthManager).Reconnect(health_manager.go:203)] restart session failed context deadline exceeded
{"level":"warn","ts":"2024-02-02T09:08:07.469Z","logger":"etcd-client","caller":"v3/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc0003dfdc0/#initially=[http://default-etcd-client.onecloud.svc:2379]","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
[error 2024-02-02 09:08:07 host_health.(*SHostHealthManager).Reconnect(health_manager.go:203)] restart session failed context deadline exceeded
{"level":"warn","ts":"2024-02-02T09:08:17.469Z","logger":"etcd-client","caller":"v3/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc0003dfdc0/#initially=[http://default-etcd-client.onecloud.svc:2379]","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
[error 2024-02-02 09:08:17 host_health.(*SHostHealthManager).Reconnect(health_manager.go:203)] restart session failed context deadline exceeded
{"level":"warn","ts":"2024-02-02T09:08:27.470Z","logger":"etcd-client","caller":"v3/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc0003dfdc0/#initially=[http://default-etcd-client.onecloud.svc:2379]","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
[error 2024-02-02 09:08:27 host_health.(*SHostHealthManager).Reconnect(health_manager.go:203)] restart session failed context deadline exceeded
{"level":"warn","ts":"2024-02-02T09:08:37.470Z","logger":"etcd-client","caller":"v3/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc0003dfdc0/#initially=[http://default-etcd-client.onecloud.svc:2379]","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
[error 2024-02-02 09:08:37 host_health.(*SHostHealthManager).Reconnect(health_manager.go:203)] restart session failed context deadline exceeded
{"level":"warn","ts":"2024-02-02T09:08:47.471Z","logger":"etcd-client","caller":"v3/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc0003dfdc0/#initially=[http://default-etcd-client.onecloud.svc:2379]","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
[error 2024-02-02 09:08:47 host_health.(*SHostHealthManager).Reconnect(health_manager.go:203)] restart session failed context deadline exceeded
{"level":"warn","ts":"2024-02-02T09:08:57.472Z","logger":"etcd-client","caller":"v3/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc0003dfdc0/#initially=[http://default-etcd-client.onecloud.svc:2379]","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
[error 2024-02-02 09:08:57 host_health.(*SHostHealthManager).Reconnect(health_manager.go:203)] restart session failed context deadline exceeded
{"level":"warn","ts":"2024-02-02T09:08:59.729Z","logger":"etcd-client","caller":"v3/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc0003dfdc0/#initially=[http://default-etcd-client.onecloud.svc:2379]","attempt":0,"error":"rpc error: code = Unavailable desc = error reading from server: read tcp 10.106.26.5:37338->10.106.26.5:2379: read: connection timed out"}
@chenjacken It looks like the host-agent failed to access ceph. Please check whether the ceph cluster is healthy.
[root@master1 ~]# ceph -s
  cluster:
    id:     e4a15469-543d-4dd4-8367-569d27b1b58f
    health: HEALTH_WARN
            15 daemons have recently crashed

  services:
    mon: 3 daemons, quorum i,j,l (age 4h)
    mgr: b(active, since 2d), standbys: a
    osd: 17 osds: 14 up (since 4h), 14 in (since 16h)

  data:
    pools:   3 pools, 1153 pgs
    objects: 231.53k objects, 894 GiB
    usage:   2.6 TiB used, 47 TiB / 50 TiB avail
    pgs:     1153 active+clean

  io:
    client:   2.3 KiB/s rd, 2.6 MiB/s wr, 0 op/s rd, 346 op/s wr
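The `osd:` line in the status output deserves a second look: 17 OSDs are defined, but only 14 are up and in. A minimal Python sketch for pulling those counts out of pasted `ceph -s` text is shown here. The function name `osd_counts` and the regular expression are illustrative assumptions derived only from the line format in this thread; they are not part of Ceph.

```python
import re

def osd_counts(status_text):
    """Return total/up/in/down OSD counts parsed from pasted `ceph -s` text."""
    match = re.search(r"osd:\s*(\d+) osds:\s*(\d+) up.*?(\d+) in", status_text)
    if match is None:
        raise ValueError("no 'osd:' summary line found")
    total, up, in_cluster = (int(group) for group in match.groups())
    # "down" here simply means "defined but not reported up".
    return {"total": total, "up": up, "in": in_cluster, "down": total - up}

status = "osd: 17 osds: 14 up (since 4h), 14 in (since 16h)"
print(osd_counts(status))  # {'total': 17, 'up': 14, 'in': 14, 'down': 3}
```

A non-zero `down` value is a hint to look at the affected OSD pods before looking at host-agent itself.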
After restarting the pod and syncing the disk status again, the host log shows:
[error 2024-02-04 02:20:39 httperrors.HTTPError(httperrors.go:110)] Send error Storage 1b298235-a82f-4579-8b7a-e6dd2d9916d3 not found
[info 2024-02-04 02:20:39 appsrv.(*Application).ServeHTTP(appsrv.go:288)] 1Dy_ML5U1CU3_77NQE_4CAIeFiA= 404 0b47ee-d64597-11c67a GET /disks/1b298235-a82f-4579-8b7a-e6dd2d9916d3/6f4b9da3-977c-45a1-8a78-e605d87b8adf/status (172.16.1.213:37705:compute_v2) 363.94ms
[error 2024-02-04 02:21:01 httperrors.HTTPError(httperrors.go:110)] Send error Storage 1b298235-a82f-4579-8b7a-e6dd2d9916d3 not found
[info 2024-02-04 02:21:01 appsrv.(*Application).ServeHTTP(appsrv.go:288)] 1Dy_ML5U1CU3_77NQE_4CAIeFiA= 404 b39dff-d2b699-09bdd8 GET /disks/1b298235-a82f-4579-8b7a-e6dd2d9916d3/37cb2e83-eba2-4f52-86cb-722849296034/status (172.16.1.213:22744:compute_v2) 0.38ms
[error 2024-02-04 02:21:07 httperrors.HTTPError(httperrors.go:110)] Send error Storage 1b298235-a82f-4579-8b7a-e6dd2d9916d3 not found
[info 2024-02-04 02:21:07 appsrv.(*Application).ServeHTTP(appsrv.go:288)] 1Dy_ML5U1CU3_77NQE_4CAIeFiA= 404 0a4523-7b95ef-0b042a GET /disks/1b298235-a82f-4579-8b7a-e6dd2d9916d3/7a4be1cf-3a81-47b4-80e2-02378388d2ba/status (172.16.1.213:40057:compute_v2) 0.19ms
@chenjacken You need to fix this node's access to Ceph first; otherwise the host cannot register the corresponding Ceph storage. rook-ceph looks unstable, with pods crashing frequently, so you need to check the logs to find the cause. For example, could it be insufficient reserved memory, or a network problem?
How do I troubleshoot rook-ceph, and how do I view the relevant logs?
The resource reservation configuration for the rook-ceph pods is:
[root@master1 ~]# kubectl -n rook-ceph get ConfigMap rook-config-override -o yaml
apiVersion: v1
data:
  config: |
    [global]
    public network = 172.16.1.0/24
    cluster network = 10.0.1.0/24
    public addr = ""
    cluster addr = ""
    osd pool default size = 3
    mon_allow_pool_delete = true
    osd_pool_default_pg_num = 32
    mon_max_pg_per_osd = 250
    mon_osd_full_ratio = 0.95
    mon_osd_nearfull_ratio = 0.85
    [osd]
    osd_recovery_op_priority = 1
    osd_recovery_max_active = 1
    osd_max_backfills = 1
    osd_recovery_max_chunk = 1048576
    osd_scrub_begin_hour = 1
    osd_scrub_end_hour = 6
kind: ConfigMap
metadata:
  annotations:
[root@master1 ~]# ceph crash ls-new
ID ENTITY NEW
2024-02-01T10:43:35.187292Z_33951c20-fb62-4bc5-a510-18931342c728 mon.e *
2024-02-01T10:43:35.271997Z_e8a468b2-9007-4ff0-ad1e-ad7fd300b04f mon.e *
2024-02-02T01:07:13.150854Z_0b1a92c0-e340-4897-982a-10d5e2f12d53 osd.4 *
2024-02-02T01:07:47.914936Z_58d47c59-0e7e-49e2-bac4-b20a56941dec osd.4 *
2024-02-02T14:07:30.602672Z_2461b980-4d81-4a7d-adc8-a9e283c51a8c osd.5 *
2024-02-02T14:08:42.740582Z_c38b4ea3-3de1-4d07-bb67-fffceadec19a osd.5 *
2024-02-03T02:07:15.512683Z_6421a25a-307a-42c0-aa53-0f5502ed38a8 osd.10 *
2024-02-03T02:07:19.785384Z_96a3dd36-b3ed-465c-8ea0-b66bd5c605aa osd.0 *
2024-02-03T02:08:35.567413Z_b01851dd-a776-437f-8340-e4538d3d0a10 osd.0 *
2024-02-03T02:09:30.284489Z_165b520e-ff86-4a2a-bdc0-2245ab861a37 osd.10 *
2024-02-03T07:29:09.737937Z_2e98f19d-783b-4441-9ba8-df4356f65e26 osd.4 *
2024-02-03T07:29:55.762694Z_1cce7885-7564-4460-94a6-a3b9b649ad43 osd.4 *
2024-02-03T07:29:55.762704Z_d6754c56-eb17-42b6-9d9b-e493f8c790a8 osd.9 *
2024-02-03T15:22:16.450396Z_f812bed4-b9c4-4cca-8622-668269b8e155 osd.4 *
2024-02-03T15:22:51.012045Z_e2656828-8c38-4661-99f8-a80e51b2ac22 osd.4 *
[root@master1 ~]#
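To see which daemons crash most often, the `ceph crash ls-new` listing can be tallied per entity. The sketch below is a hypothetical helper (`crashes_per_entity` is not a Ceph tool) and assumes only the three-column layout shown in this thread: crash ID, entity, and the `*` marker.

```python
from collections import Counter

def crashes_per_entity(ls_output):
    """Count crash entries per daemon in pasted `ceph crash ls-new` output."""
    counts = Counter()
    for line in ls_output.splitlines():
        fields = line.split()
        # Data rows start with a timestamp_uuid crash ID; the header row does not.
        if len(fields) >= 2 and fields[0][:4].isdigit() and "_" in fields[0]:
            counts[fields[1]] += 1
    return counts

sample = """ID ENTITY NEW
2024-02-03T07:29:55.762704Z_d6754c56-eb17-42b6-9d9b-e493f8c790a8 osd.9 *
2024-02-03T15:22:16.450396Z_f812bed4-b9c4-4cca-8622-668269b8e155 osd.4 *
2024-02-03T15:22:51.012045Z_e2656828-8c38-4661-99f8-a80e51b2ac22 osd.4 *
"""
print(crashes_per_entity(sample).most_common())  # [('osd.4', 2), ('osd.9', 1)]
```

Applied to the full listing in this comment, osd.4 accounts for 6 of the 15 entries, which makes node4 a reasonable place to start looking.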
[root@master1 ~]# ceph crash info 2024-02-03T15:22:51.012045Z_e2656828-8c38-4661-99f8-a80e51b2ac22
{
"assert_condition": "abort",
"assert_file": "/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/16.2.14/rpm/el8/BUILD/ceph-16.2.14/src/common/HeartbeatMap.cc",
"assert_func": "bool ceph::HeartbeatMap::_check(const ceph::heartbeat_handle_d*, const char*, ceph::coarse_mono_time)",
"assert_line": 85,
"assert_msg": "/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/16.2.14/rpm/el8/BUILD/ceph-16.2.14/src/common/HeartbeatMap.cc: In function 'bool ceph::HeartbeatMap::_check(const ceph::heartbeat_handle_d*, const char*, ceph::coarse_mono_time)' thread 7f04a53ab700 time 2024-02-03T15:21:25.817843+0000\n/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/16.2.14/rpm/el8/BUILD/ceph-16.2.14/src/common/HeartbeatMap.cc: 85: ceph_abort_msg(\"hit suicide timeout\")\n",
"assert_thread_name": "tp_osd_tp",
"backtrace": [
"/lib64/libpthread.so.0(+0x12cf0) [0x7f04c5ec0cf0]",
"abort()",
"(ceph::__ceph_abort(char const*, int, char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0x1b6) [0x55ac60cc64cb]",
"(ceph::HeartbeatMap::_check(ceph::heartbeat_handle_d const*, char const*, std::chrono::time_point<ceph::coarse_mono_clock, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> > >)+0x4c9) [0x55ac6144dfe9]",
"(ceph::HeartbeatMap::reset_timeout(ceph::heartbeat_handle_d*, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> >, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> >)+0x23e) [0x55ac6144e39e]",
"(ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x5b0) [0x55ac61472ee0]",
"(ShardedThreadPool::WorkThreadSharded::entry()+0x14) [0x55ac61475dd4]",
"/lib64/libpthread.so.0(+0x81ca) [0x7f04c5eb61ca]",
"clone()"
],
"ceph_version": "16.2.14",
"crash_id": "2024-02-03T15:22:51.012045Z_e2656828-8c38-4661-99f8-a80e51b2ac22",
"entity_name": "osd.4",
"os_id": "centos",
"os_name": "CentOS Stream",
"os_version": "8",
"os_version_id": "8",
"process_name": "ceph-osd",
"stack_sig": "f5c83f1671dffd9e88e869ef7e5a5ba16e742b0333f64f711e0254759c3df114",
"timestamp": "2024-02-03T15:22:51.012045Z",
"utsname_hostname": "node4",
"utsname_machine": "x86_64",
"utsname_release": "5.4.130-1.yn20230805.el7.x86_64",
"utsname_sysname": "Linux",
"utsname_version": "#1 SMP Wed Oct 11 03:26:01 UTC 2023"
}
[root@master1 ~]# ceph crash info 2024-02-01T10:43:35.187292Z_33951c20-fb62-4bc5-a510-18931342c728
{
"backtrace": [
"/lib64/libpthread.so.0(+0x12cf0) [0x7f7cc23cdcf0]",
"gsignal()",
"abort()",
"ceph-mon(+0x775316) [0x55a68f376316]",
"ceph-mon(+0x775432) [0x55a68f376432]",
"(rocksdb::InstrumentedMutex::Lock()+0x9c) [0x55a68f2c8c0c]",
"ceph-mon(+0x59e420) [0x55a68f19f420]",
"(rocksdb::Cleanable::~Cleanable()+0x1c) [0x55a68f32bebc]",
"(rocksdb::DBIter::~DBIter()+0x4da) [0x55a68f20f7ba]",
"(rocksdb::ArenaWrappedDBIter::~ArenaWrappedDBIter()+0x23) [0x55a68f3893c3]",
"(std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release()+0x47) [0x55a68eea1eb7]",
"(std::_Sp_counted_ptr<MonitorDBStore::WholeStoreIteratorImpl*, (__gnu_cxx::_Lock_policy)2>::_M_dispose()+0x5e) [0x55a68eefc77e]",
"(std::_Rb_tree<unsigned long, std::pair<unsigned long const, Monitor::SyncProvider>, std::_Select1st<std::pair<unsigned long const, Monitor::SyncProvider> >, std::less<unsigned long>, std::allocator<std::pair<unsigned long const, Monitor::SyncProvider> > >::_M_erase(std::_Rb_tree_node<std::pair<unsigned long const, Monitor::SyncProvider> >*)+0xc8) [0x55a68ef02b68]",
"(Monitor::~Monitor()+0x39c) [0x55a68eee814c]",
"(Monitor::~Monitor()+0xd) [0x55a68eee8c4d]",
"main()",
"__libc_start_main()",
"_start()"
],
"ceph_version": "16.2.14",
"crash_id": "2024-02-01T10:43:35.187292Z_33951c20-fb62-4bc5-a510-18931342c728",
"entity_name": "mon.e",
"os_id": "centos",
"os_name": "CentOS Stream",
"os_version": "8",
"os_version_id": "8",
"process_name": "ceph-mon",
"stack_sig": "98fe9d7efe083ca88907d5eb1ab86eee77f3abf650476738f28138c3a0b97a5c",
"timestamp": "2024-02-01T10:43:35.187292Z",
"utsname_hostname": "node3",
"utsname_machine": "x86_64",
"utsname_release": "5.4.130-1.yn20230805.el7.x86_64",
"utsname_sysname": "Linux",
"utsname_version": "#1 SMP Wed Oct 11 03:26:01 UTC 2023"
}
[root@master1 ~]#
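@chenjacken Generally you just check the pod status and logs in the rook-ceph namespace; Ceph problems need to be investigated case by case on your side. One thing to confirm: are huge pages enabled on the host? Huge pages preallocate memory, so check whether the memory reserved for huge pages is sufficient.
OK, thanks. Huge pages are enabled, but I did not configure any reservation specifically; everything is at the defaults. How much should typically be reserved? Is there a way to calculate it?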
[root@node4 ~]# free -h
              total        used        free      shared  buff/cache   available
Mem:            62G         57G        887M        3.2G        4.7G        1.2G
Swap:            0B          0B          0B
[root@node4 ~]# cat /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages
50
[root@node4 ~]# cat /sys/kernel/mm/hugepages/hugepages-1048576kB/free_hugepages
22
[root@node4 ~]#
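On the question of how to calculate the reservation, the two sysfs values above multiply out directly. The following Python sketch is a rough back-of-the-envelope calculation, not an official sizing rule. It assumes the rounded `62G` total from `free -h`, the 1 GiB page size implied by the `hugepages-1048576kB` directory, and that the whole huge page pool is unavailable to ordinary processes; the function name `hugepage_budget` is made up for illustration.

```python
def hugepage_budget(total_gib, nr_hugepages, free_hugepages, page_size_kib=1048576):
    """Estimate how much memory the huge page pool takes from a host."""
    page_gib = page_size_kib / (1024 * 1024)
    reserved = nr_hugepages * page_gib
    return {
        "reserved_gib": reserved,
        "in_use_gib": (nr_hugepages - free_hugepages) * page_gib,
        # Memory outside the pool: OS, host services, and any Ceph daemons.
        "left_for_other_processes_gib": total_gib - reserved,
    }

print(hugepage_budget(62, 50, 22))
# {'reserved_gib': 50.0, 'in_use_gib': 28.0, 'left_for_other_processes_gib': 12.0}
```

With these numbers, about 12 GiB remains outside the huge page pool on node4, which may explain the low `available` figure reported by `free -h`.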
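@chenjacken Control nodes normally do not have huge pages enabled by default. Is Ceph running on the control nodes? Compute nodes reserve 20% of memory by default; if no other special services run on them, no special configuration should be needed.
OK, thanks. The rook-ceph mons are not specifically pinned to the control nodes, so the mon pods move around.
@chenjacken Yes, that is the address. Is the VM's network reachable? Is it a VPC network or a classic network?
The VM's network is reachable; it is a classic network.
@chenjacken You can try accessing this address from inside the VM. In theory, if the VM's network works, this address should work too. Or perhaps host-agent was not running at the time? The metadata server is started inside the host-agent service.
Is host-agent a pod?
node7 is the host where the VM runs.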
[root@master1 ~]# kubectl get pods -n onecloud|grep agent
default-esxi-agent-57f79cb476-stsv6 1/1 Running 0 13h
default-lbagent-t847s 2/2 Running 5 8h
default-vpcagent-8557ff4466-cbnmv 1/1 Running 0 13h
[root@master1 ~]# kubectl get pods -n onecloud -owide |grep node7
default-host-deployer-lhrpz 1/1 Running 0 13h 172.16.1.231 node7 <none> <none>
default-host-health-9nz6b 1/1 Running 0 13h 172.16.1.231 node7 <none> <none>
default-host-image-4dbhr 1/1 Running 0 12h 172.16.1.231 node7 <none> <none>
default-host-ngvsl 3/3 Running 0 13h 172.16.1.231 node7 <none> <none>
default-telegraf-5ctkn 1/1 Running 0 14h 172.16.1.231 node7 <none> <none>
[root@master1 ~]#
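@chenjacken It is the default-host pod. If HTTP does not get through, something does look wrong (ping does not work against this address).
You can check whether the host log contains the corresponding request entries.
You can also run ovs-ofctl dump-flows br0 | grep 169.254 to see whether the corresponding flow entries exist. If they do not, check the sdnagent log to investigate.
Where do you think it went wrong?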
[root@node9 ~]# ovs-ofctl dump-flows br0 | grep 169.254
cookie=0x0, duration=104795.600s, table=0, n_packets=0, n_bytes=0, idle_age=65534, hard_age=65534, priority=29310,tcp,in_port=LOCAL,nw_dst=169.254.169.254,tp_dst=80 actions=NORMAL
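The flow entry above exists, but its `n_packets=0` counter suggests that no traffic has matched it yet. A small Python sketch for splitting such a line into fields is shown here; `flow_fields` is a hypothetical helper, and the regular expression assumes only the `key=value` layout visible in this one line of `ovs-ofctl dump-flows` output.

```python
import re

def flow_fields(flow_line):
    """Split one `ovs-ofctl dump-flows` line into a dict of key=value fields."""
    # Bare match keywords without '=' (such as 'tcp') are skipped.
    return dict(re.findall(r"(\w+)=([^\s,]+)", flow_line))

line = (
    "cookie=0x0, duration=104795.600s, table=0, n_packets=0, n_bytes=0, "
    "idle_age=65534, hard_age=65534, "
    "priority=29310,tcp,in_port=LOCAL,nw_dst=169.254.169.254,tp_dst=80 actions=NORMAL"
)
fields = flow_fields(line)
print(fields["nw_dst"], fields["tp_dst"], fields["n_packets"])  # 169.254.169.254 80 0
```

Re-running the dump after a request from the VM and comparing `n_packets` is one way to tell whether the request reaches this flow at all.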
The sdnagent log in default-host:
[info 2024-02-05 15:31:00 server.(*FlowMan).doCheck(flowman.go:92)] flowman brmapped: check done
[info 2024-02-05 15:31:00 server.(*FlowMan).doCheck(flowman.go:130)] flowman br0: check done
[info 2024-02-05 15:31:00 server.(*FlowMan).Start(flowman.go:232)] flowman br1: do idle check
[info 2024-02-05 15:31:00 server.(*FlowMan).doCheck(flowman.go:130)] flowman br1: check done
[info 2024-02-05 15:31:02 server.(*ovnMdMan).cleanup(ovn-md.go:1006)] ovnMd: clean done
[info 2024-02-05 15:31:05 server.(*FlowMan).Start(flowman.go:232)] flowman brtap: do idle check
[info 2024-02-05 15:31:05 server.(*FlowMan).doCheck(flowman.go:130)] flowman brtap: check done
[info 2024-02-05 15:31:11 options.parseOptions(options.go:331)] Use configuration file: /etc/yunion/host.conf
[info 2024-02-05 15:31:11 options.parseOptions(options.go:354)] Set log level to "info"
[info 2024-02-05 15:31:11 options.parseOptions(options.go:331)] Use configuration file: /etc/yunion/common/common.conf
[info 2024-02-05 15:31:11 options.parseOptions(options.go:354)] Set log level to "info"
[info 2024-02-05 15:31:13 server.(*TcMan).doIdleCheck(tcman.go:155)] tcman: doing idle check
[info 2024-02-05 15:31:13 server.(*TcMan).doCheckPage(tcman.go:177)] skip dhcp1-196: it uses mq
[info 2024-02-05 15:31:13 server.(*TcMan).doCheckPage(tcman.go:177)] skip dhcp1-190: it uses mq
[info 2024-02-05 15:31:13 server.(*TcMan).doIdleCheck(tcman.go:160)] tcman: done idle check
[info 2024-02-05 15:31:13 server.(*FlowMan).Start(flowman.go:232)] flowman brmapped: do idle check
[error 2024-02-05 15:31:13 server.(*FlowMan).doDumpFlows(flowman.go:72)] flowman brmapped: dump-flows failed: exit status 1: ovs-ofctl: brmapped is not a bridge or a socket
[error 2024-02-05 15:31:13 server.(*FlowMan).doCheck(flowman.go:91)] FlowMan doCheck doDumpFlows fail DumpFlows: exit status 1: ovs-ofctl: brmapped is not a bridge or a socket
[info 2024-02-05 15:31:13 server.(*FlowMan).doCheck(flowman.go:92)] flowman brmapped: check done
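Also, image handling seems prone to failures: an image can get stuck in "waiting to save" indefinitely. Apart from the fs problem above (waiting for the fix 🌹), is there a problem with the MinIO in my environment?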
API: BackgroundHeal()
Time: 14:54:51 UTC 02/05/2024
DeploymentID: dfa47b2d-e946-4087-8f19-f4cdd587965e
Error: Heal attempt failed for .minio.sys/buckets/.usage.json: Storage resources are insufficient for the read operation .minio.sys/buckets/.usage.json (*fmt.wrapError)
2: cmd/admin-heal-ops.go:807:cmd.(*healSequence).healItemsFromSourceCh()
1: cmd/admin-heal-ops.go:818:cmd.(*healSequence).healFromSourceCh()
Detailed logs:
[root@master1 ~]# kubectl get pods -n onecloud-minio -owide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
minio-0 1/1 Running 0 41m 10.40.136.61 master3 <none> <none>
minio-1 1/1 Running 0 12h 10.40.137.100 master1 <none> <none>
minio-2 1/1 Running 0 27h 10.40.180.26 master2 <none> <none>
minio-3 1/1 Running 0 12h 10.40.137.69 master1 <none> <none>
[root@master1 ~]# kubectl -n onecloud-minio logs minio-0
API: SYSTEM()
Time: 14:52:38 UTC 02/05/2024
Error: lookup minio-1.minio-svc.onecloud-minio.svc.cluster.local on 10.96.0.10:53: read udp 10.40.136.61:40916->10.96.0.10:53: i/o timeout (*net.DNSError)
host=minio-1.minio-svc.onecloud-minio.svc.cluster.local, elapsedTime=50 seconds elapsed
5: cmd/endpoint.go:483:cmd.Endpoints.UpdateIsLocal()
4: cmd/endpoint.go:667:cmd.CreateEndpoints()
3: cmd/endpoint-ellipses.go:373:cmd.createServerEndpoints()
2: cmd/server-main.go:145:cmd.serverHandleCmdArgs()
1: cmd/server-main.go:440:cmd.serverMain()
API: SYSTEM()
Time: 14:53:29 UTC 02/05/2024
Error: lookup minio-2.minio-svc.onecloud-minio.svc.cluster.local on 10.96.0.10:53: read udp 10.40.136.61:52327->10.96.0.10:53: i/o timeout (*net.DNSError)
elapsedTime=1 minute elapsed, host=minio-2.minio-svc.onecloud-minio.svc.cluster.local
5: cmd/endpoint.go:483:cmd.Endpoints.UpdateIsLocal()
4: cmd/endpoint.go:667:cmd.CreateEndpoints()
3: cmd/endpoint-ellipses.go:373:cmd.createServerEndpoints()
2: cmd/server-main.go:145:cmd.serverHandleCmdArgs()
1: cmd/server-main.go:440:cmd.serverMain()
API: SYSTEM()
Time: 14:54:19 UTC 02/05/2024
Error: lookup minio-3.minio-svc.onecloud-minio.svc.cluster.local on 10.96.0.10:53: read udp 10.40.136.61:33450->10.96.0.10:53: i/o timeout (*net.DNSError)
host=minio-3.minio-svc.onecloud-minio.svc.cluster.local, elapsedTime=2 minutes elapsed
5: cmd/endpoint.go:483:cmd.Endpoints.UpdateIsLocal()
4: cmd/endpoint.go:667:cmd.CreateEndpoints()
3: cmd/endpoint-ellipses.go:373:cmd.createServerEndpoints()
2: cmd/server-main.go:145:cmd.serverHandleCmdArgs()
1: cmd/server-main.go:440:cmd.serverMain()
WARNING: MINIO_ACCESS_KEY and MINIO_SECRET_KEY are deprecated.
Please use MINIO_ROOT_USER and MINIO_ROOT_PASSWORD
You are running an older version of MinIO released 2 years ago
Update: Run `mc admin update`
Waiting for all MinIO sub-systems to be initialized.. lock acquired
Verifying if 1 bucket is consistent across drives...
All MinIO sub-systems initialized successfully
Waiting for all MinIO IAM sub-system to be initialized.. lock acquired
IAM initialization complete
Status: 4 Online, 0 Offline.
Endpoint: http://10.40.136.61:9000 http://127.0.0.1:9000
Browser Access:
http://10.40.136.61:9000 http://127.0.0.1:9000
Object API (Amazon S3 compatible):
Go: https://docs.min.io/docs/golang-client-quickstart-guide
Java: https://docs.min.io/docs/java-client-quickstart-guide
Python: https://docs.min.io/docs/python-client-quickstart-guide
JavaScript: https://docs.min.io/docs/javascript-client-quickstart-guide
.NET: https://docs.min.io/docs/dotnet-client-quickstart-guide
API: BackgroundHeal()
Time: 14:54:51 UTC 02/05/2024
DeploymentID: dfa47b2d-e946-4087-8f19-f4cdd587965e
Error: Heal attempt failed for .minio.sys/buckets/.usage.json: Storage resources are insufficient for the read operation .minio.sys/buckets/.usage.json (*fmt.wrapError)
2: cmd/admin-heal-ops.go:807:cmd.(*healSequence).healItemsFromSourceCh()
1: cmd/admin-heal-ops.go:818:cmd.(*healSequence).healFromSourceCh()
[root@master1 ~]#
[root@master1 ~]# kubectl -n onecloud-minio logs minio-1
API: SYSTEM()
Time: 03:18:23 UTC 02/05/2024
Error: lookup minio-0.minio-svc.onecloud-minio.svc.cluster.local on 10.96.0.10:53: read udp 10.40.137.100:59775->10.96.0.10:53: i/o timeout (*net.DNSError)
host=minio-0.minio-svc.onecloud-minio.svc.cluster.local, elapsedTime=50 seconds elapsed
5: cmd/endpoint.go:483:cmd.Endpoints.UpdateIsLocal()
4: cmd/endpoint.go:667:cmd.CreateEndpoints()
3: cmd/endpoint-ellipses.go:373:cmd.createServerEndpoints()
2: cmd/server-main.go:145:cmd.serverHandleCmdArgs()
1: cmd/server-main.go:440:cmd.serverMain()
API: SYSTEM()
Time: 03:19:13 UTC 02/05/2024
Error: lookup minio-2.minio-svc.onecloud-minio.svc.cluster.local on 10.96.0.10:53: read udp 10.40.137.100:54355->10.96.0.10:53: i/o timeout (*net.DNSError)
host=minio-2.minio-svc.onecloud-minio.svc.cluster.local, elapsedTime=1 minute elapsed
5: cmd/endpoint.go:483:cmd.Endpoints.UpdateIsLocal()
4: cmd/endpoint.go:667:cmd.CreateEndpoints()
3: cmd/endpoint-ellipses.go:373:cmd.createServerEndpoints()
2: cmd/server-main.go:145:cmd.serverHandleCmdArgs()
1: cmd/server-main.go:440:cmd.serverMain()
API: SYSTEM()
Time: 03:20:03 UTC 02/05/2024
Error: lookup minio-3.minio-svc.onecloud-minio.svc.cluster.local on 10.96.0.10:53: read udp 10.40.137.100:40491->10.96.0.10:53: i/o timeout (*net.DNSError)
host=minio-3.minio-svc.onecloud-minio.svc.cluster.local, elapsedTime=2 minutes elapsed
5: cmd/endpoint.go:483:cmd.Endpoints.UpdateIsLocal()
4: cmd/endpoint.go:667:cmd.CreateEndpoints()
3: cmd/endpoint-ellipses.go:373:cmd.createServerEndpoints()
2: cmd/server-main.go:145:cmd.serverHandleCmdArgs()
1: cmd/server-main.go:440:cmd.serverMain()
WARNING: MINIO_ACCESS_KEY and MINIO_SECRET_KEY are deprecated.
Please use MINIO_ROOT_USER and MINIO_ROOT_PASSWORD
You are running an older version of MinIO released 2 years ago
Update: Run `mc admin update`
Waiting for all MinIO sub-systems to be initialized.. lock acquired
Verifying if 1 bucket is consistent across drives...
All MinIO sub-systems initialized successfully
Waiting for all MinIO IAM sub-system to be initialized.. lock acquired
IAM initialization complete
Status: 4 Online, 0 Offline.
Endpoint: http://10.40.137.100:9000 http://127.0.0.1:9000
Browser Access:
http://10.40.137.100:9000 http://127.0.0.1:9000
Object API (Amazon S3 compatible):
Go: https://docs.min.io/docs/golang-client-quickstart-guide
Java: https://docs.min.io/docs/java-client-quickstart-guide
Python: https://docs.min.io/docs/python-client-quickstart-guide
JavaScript: https://docs.min.io/docs/javascript-client-quickstart-guide
.NET: https://docs.min.io/docs/dotnet-client-quickstart-guide
API: SYSTEM()
Time: 14:45:59 UTC 02/05/2024
DeploymentID: dfa47b2d-e946-4087-8f19-f4cdd587965e
Error: Marking http://minio-0.minio-svc.onecloud-minio.svc.cluster.local:9000/minio/lock/v6 temporary offline; caused by Post "http://minio-0.minio-svc.onecloud-minio.svc.cluster.local:9000/minio/lock/v6/lock?owner=minio-1.minio-svc.onecloud-minio.svc.cluster.local%3A9000&quorum=3&source=%5Bdata-scanner.go%3A103%3ArunDataScanner%28%29%5D&uid=e3efc0db-4493-46b1-b867-6bfb566336d0": dial tcp 10.40.136.11:9000: connect: connection refused (*fmt.wrapError)
5: internal/rest/client.go:147:rest.(*Client).Call()
4: cmd/lock-rest-client.go:66:cmd.(*lockRESTClient).callWithContext()
3: cmd/lock-rest-client.go:102:cmd.(*lockRESTClient).restCall()
2: cmd/lock-rest-client.go:121:cmd.(*lockRESTClient).Lock()
1: internal/dsync/drwmutex.go:393:dsync.lock.func1()
API: SYSTEM()
Time: 14:46:12 UTC 02/05/2024
DeploymentID: dfa47b2d-e946-4087-8f19-f4cdd587965e
Error: Marking http://minio-0.minio-svc.onecloud-minio.svc.cluster.local:9000/minio/storage/export/v37 temporary offline; caused by Post "http://minio-0.minio-svc.onecloud-minio.svc.cluster.local:9000/minio/storage/export/v37/listvols?disk-id=b742c870-5f02-40ce-9ea4-6dd270baa0da": dial tcp 10.40.136.11:9000: i/o timeout (*fmt.wrapError)
5: internal/rest/client.go:147:rest.(*Client).Call()
4: cmd/storage-rest-client.go:151:cmd.(*storageRESTClient).call()
3: cmd/storage-rest-client.go:325:cmd.(*storageRESTClient).ListVols()
2: cmd/erasure-healing.go:182:cmd.listAllBuckets.func1()
1: internal/sync/errgroup/errgroup.go:123:errgroup.(*Group).Go.func1()
API: SYSTEM()
Time: 14:46:22 UTC 02/05/2024
DeploymentID: dfa47b2d-e946-4087-8f19-f4cdd587965e
Error: lookup minio-0.minio-svc.onecloud-minio.svc.cluster.local on 10.96.0.10:53: dial udp 10.96.0.10:53: i/o timeout (*net.DNSError)
4: internal/logger/logonce.go:54:logger.(*logOnceType).logOnceIf()
3: internal/logger/logonce.go:94:logger.LogOnceIf()
2: internal/http/dial_dnscache.go:188:http.(*DNSCache).Refresh()
1: internal/http/dial_dnscache.go:139:http.NewDNSCache.func1()
API: SYSTEM()
Time: 14:46:26 UTC 02/05/2024
DeploymentID: dfa47b2d-e946-4087-8f19-f4cdd587965e
Error: Disk: http://minio-0.minio-svc.onecloud-minio.svc.cluster.local:9000/export returned disk not found (*fmt.wrapError)
endpoint=http://minio-0.minio-svc.onecloud-minio.svc.cluster.local:9000/export
2: cmd/prepare-storage.go:50:cmd.glob..func7.1()
1: cmd/erasure-sets.go:228:cmd.(*erasureSets).connectDisks.func2()
API: SYSTEM()
Time: 14:46:27 UTC 02/05/2024
DeploymentID: dfa47b2d-e946-4087-8f19-f4cdd587965e
Error: lookup minio-2.minio-svc.onecloud-minio.svc.cluster.local on 10.96.0.10:53: dial udp 10.96.0.10:53: i/o timeout (*net.DNSError)
4: internal/logger/logonce.go:54:logger.(*logOnceType).logOnceIf()
3: internal/logger/logonce.go:94:logger.LogOnceIf()
2: internal/http/dial_dnscache.go:188:http.(*DNSCache).Refresh()
1: internal/http/dial_dnscache.go:139:http.NewDNSCache.func1()
API: SYSTEM()
Time: 14:51:17 UTC 02/05/2024
DeploymentID: dfa47b2d-e946-4087-8f19-f4cdd587965e
Error: lookup minio-0.minio-svc.onecloud-minio.svc.cluster.local on 10.96.0.10:53: no such host (*net.DNSError)
4: internal/logger/logonce.go:54:logger.(*logOnceType).logOnceIf()
3: internal/logger/logonce.go:94:logger.LogOnceIf()
2: internal/http/dial_dnscache.go:188:http.(*DNSCache).Refresh()
1: internal/http/dial_dnscache.go:139:http.NewDNSCache.func1()
Client http://minio-0.minio-svc.onecloud-minio.svc.cluster.local:9000/minio/storage/export/v37 online
Client http://minio-0.minio-svc.onecloud-minio.svc.cluster.local:9000/minio/lock/v6 online
[root@master1 ~]#
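Many of the MinIO errors above are DNS lookups against 10.96.0.10:53 that time out or fail. To summarize them from a pasted log, a sketch along these lines could be used. The name `dns_failures` and the pattern are assumptions based only on the `Error: lookup ... (*net.DNSError)` lines in this thread, not on any MinIO API.

```python
import re
from collections import Counter

DNS_ERROR = re.compile(r"Error: lookup (\S+) on (\S+): (.*?) \(\*net\.DNSError\)")

def dns_failures(log_text):
    """Count failed lookups per (hostname, resolver) pair in a MinIO log."""
    return Counter((m.group(1), m.group(2)) for m in DNS_ERROR.finditer(log_text))

log = """Error: lookup minio-0.minio-svc.onecloud-minio.svc.cluster.local on 10.96.0.10:53: dial udp 10.96.0.10:53: i/o timeout (*net.DNSError)
Error: lookup minio-0.minio-svc.onecloud-minio.svc.cluster.local on 10.96.0.10:53: no such host (*net.DNSError)
"""
for (host, resolver), count in dns_failures(log).items():
    print(host, resolver, count)
```

If every failure points at the same resolver address, the cluster DNS service is a more likely suspect than MinIO itself.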
The xfs problem has been fixed (https://github.com/yunionio/cloudpods/pull/19456); you can update the host-deployer image to registry.cn-hangzhou.aliyuncs.com/d3lx/host-deployer:20240205-210119
To find out whether saving the image failed, check the glance service log first.
OK, thanks, much appreciated 🌹
Judging by its log, minio-0 seems to have a problem. Could you help me work out how to troubleshoot it?
[root@master1 ~]# kubectl -n onecloud-minio logs minio-0
Browser Access:
http://10.40.136.61:9000 http://127.0.0.1:9000
Object API (Amazon S3 compatible):
Go: https://docs.min.io/docs/golang-client-quickstart-guide
Java: https://docs.min.io/docs/java-client-quickstart-guide
Python: https://docs.min.io/docs/python-client-quickstart-guide
JavaScript: https://docs.min.io/docs/javascript-client-quickstart-guide
.NET: https://docs.min.io/docs/dotnet-client-quickstart-guide
API: BackgroundHeal()
Time: 14:54:51 UTC 02/05/2024
DeploymentID: dfa47b2d-e946-4087-8f19-f4cdd587965e
Error: Heal attempt failed for .minio.sys/buckets/.usage.json: Storage resources are insufficient for the read operation .minio.sys/buckets/.usage.json (*fmt.wrapError)
2: cmd/admin-heal-ops.go:807:cmd.(*healSequence).healItemsFromSourceCh()
1: cmd/admin-heal-ops.go:818:cmd.(*healSequence).healFromSourceCh()
[root@master1 ~]#
Error: Heal attempt failed for .minio.sys/buckets/.usage.json: Storage resources are insufficient for the read operation .minio.sys/buckets/.usage.json (*fmt.wrapError)
@chenjacken Judging by this error, the disk I/O used by MinIO is probably too slow, but it should not affect usage. Check the glance log first to see whether it contains any errors.
OK, thanks. I have already deleted the problematic image (the one stuck showing "waiting"); I will keep an eye on it going forward. Many thanks! 🌹
I upgraded. When creating a new VM (the image is Windows-2016), I get "disk allocation failed". The web log shows:
{
"__reason__": {
"reason": {
"__reason__": {
"reason": "{\"__reason__\":{\"reason\":\"{\\\"__reason__\\\":{\\\"reason\\\":{\\\"image_id\\\":\\\"809470c0-0e38-49b0-8e3f-fb522f2b6598\\\",\\\"reason\\\":{\\\"__reason__\\\":\\\"AcquireImage: convert loca image 809470c0-0e38-49b0-8e3f-fb522f2b6598 to rbd pool hddpool at host : exit status 1\\\",\\\"__stage__\\\":\\\"OnImageCacheComplete\\\",\\\"__status__\\\":\\\"error\\\"}},\\\"stage\\\":\\\"OnImageCacheComplete\\\"},\\\"__stage__\\\":\\\"OnStorageCacheImageComplete\\\",\\\"__status__\\\":\\\"error\\\",\\\"__task_name__\\\":\\\"StorageCacheImageTask\\\"}\",\"stage\":\"OnStorageCacheImageComplete\"},\"__stage__\":\"on_kvm_disk_prepared\",\"__status__\":\"error\",\"__task_name__\":\"DiskCreateTask\"}",
"stage": "on_kvm_disk_prepared"
},
"__stage__": "on_disk_prepared",
"__status__": "error",
"__task_name__": "KVMGuestCreateDiskTask"
},
"stage": "on_disk_prepared"
},
"__stage__": "OnDiskPrepared",
"__status__": "error",
"__task_name__": "GuestCreateDiskTask"
}
The error message from another VM:
{
"__reason__": {
"reason": {
"__reason__": {
"reason": "{\"__reason__\":{\"reason\":{\"__reason__\":\"cloneImage(image_cache_809470c0-0e38-49b0-8e3f-fb522f2b6598): findOrCreateSnap: CreateSnapshot: snap create: rbd 2024-02-06T04:58:33.279+0000 7f98537fe700 -1 librbd::SnapshotCreateRequest: failed to allocate snapshot id: (110) Connection timed outrbd: failed to create snapshot: \\n(110) Connection timed out\\n: exit status 110\",\"__stage__\":\"OnDiskReady\",\"__status__\":\"error\"},\"stage\":\"OnDiskReady\"},\"__stage__\":\"on_kvm_disk_prepared\",\"__status__\":\"error\",\"__task_name__\":\"DiskCreateTask\"}",
"stage": "on_kvm_disk_prepared"
},
"__stage__": "on_disk_prepared",
"__status__": "error",
"__task_name__": "KVMGuestCreateDiskTask"
},
"stage": "on_disk_prepared"
},
"__stage__": "OnDiskPrepared",
"__status__": "error",
"__task_name__": "GuestCreateDiskTask"
}
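Both error payloads bury the actual cause several levels deep, with some levels stored as JSON-encoded strings. A minimal Python sketch for digging out the innermost message is shown below. It assumes only the `__reason__` / `reason` nesting visible in these two payloads; `innermost_reason` is a hypothetical helper, not a Cloudpods function.

```python
import json

def innermost_reason(node):
    """Follow '__reason__'/'reason' keys, decoding JSON-encoded strings on the way."""
    while True:
        if isinstance(node, str):
            try:
                decoded = json.loads(node)
            except ValueError:
                return node  # plain text: this is the root-cause message
            if not isinstance(decoded, dict):
                return node
            node = decoded
        elif isinstance(node, dict):
            if "__reason__" in node:
                node = node["__reason__"]
            elif "reason" in node:
                node = node["reason"]
            else:
                return node
        else:
            return node

# Small stand-in for the payloads above: one level is a JSON string.
inner = {"__reason__": "snap create: (110) Connection timed out", "__stage__": "OnDiskReady"}
wrapped = {"__reason__": {"reason": json.dumps({"__reason__": {"reason": inner}}), "stage": "x"}}
print(innermost_reason(wrapped))  # snap create: (110) Connection timed out
```

For the two payloads in this comment, the innermost messages are the `AcquireImage: convert loca image ... exit status 1` line and the `failed to allocate snapshot id: (110) Connection timed out` line, both of which point back at RBD access rather than at the VM creation task itself.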
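1. Version: high-availability deployment, v3.10.11
The network configuration in host.conf is:
The network section of Ceph's configmap rook-config-override is:
2. Problems when uploading an image and creating a VM
1) Uploading an image (about 30 GB) feels very slow. 2) While the image is uploading, web access is noticeably slow and data views show "loading". 3) Creating the corresponding VM shows "deploy failed"; after syncing the status, the VM shows "running".
The web UI's "update status failed" log is:
The web UI's "deploy failed" log is: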