Open JiezengDev opened 6 months ago
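DeepFlow Component
Agent
What you expected to happen
The agents in cluster 2 should connect normally to the server deployed in cluster 1.
How to reproduce
No response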
DeepFlow version
agent version
Defaulted container "deepflow-agent" out of: deepflow-agent, configure-sysctl (init)
9760-b3f2758a3ad8a8b06ff221251a3ef1c80abbc6ff
Name: deepflow-agent community edition
Branch: v6.4
CommitId: b3f2758a3ad8a8b06ff221251a3ef1c80abbc6ff
RevCount: 9760
Compiler: rustc 1.75.0 (82e1608df 2023-12-21)
CompileTime: 2024-03-28 03:09:51
server version
2024/04/12 14:11:38 ENV K8S_NODE_NAME_FOR_DEEPFLOW=ucpcompute34-test-py-cloudvsp; K8S_NODE_IP_FOR_DEEPFLOW=10.218.2.43; K8S_POD_NAME_FOR_DEEPFLOW=deepflow-server-57f45b5df7-zdr7c; K8S_POD_IP_FOR_DEEPFLOW=10.218.44.115; K8S_NAMESPACE_FOR_DEEPFLOW=deepflow
Name: deepflow-server community edition
Branch: v6.4
CommitID: 2f352404119929699874cc3fce8d4b180222db09
RevCount: 9755
Compiler: go version go1.20.14 linux/amd64
CompileTime: 2024-03-27 07:31:49
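DeepFlow agent list
All of the failing agents are in cluster 2. They connect to the server's port 20035 through cluster 1's NodePort (32284).
Kubernetes CNI
cni:v3.25.1
Operation-System/Kernel version
CentOS Stream release 8 4.18.0-500.el8.x86_64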
Anything else
Service and NodePort (32284) details for the server in cluster 1:
deepflow-server NodePort 192.168.23.110 10.218.48.22 20416:32743/TCP,20419:32758/TCP,20417:30417/TCP,20035:32284/TCP,30035:30036/TCP,20135:30764/TCP,20033:31161/TCP,30033:30033/TCP 176d
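To make the port mapping in the service line easier to read, here is a minimal sketch that parses the `PORT(S)` column into a service-port to NodePort table. The function name `parse_port_mappings` is hypothetical and only for illustration; the input string is copied from the service line above. It shows that service port 20035 is exposed as NodePort 32284, and that the port 30036 appearing in the agent log is the NodePort of service port 30035.

```python
def parse_port_mappings(ports_column: str) -> dict[int, int]:
    """Parse a `kubectl get svc` PORT(S) column into {service_port: node_port}.

    Hypothetical helper for illustration; it assumes every entry has the
    form "<servicePort>:<nodePort>/<protocol>".
    """
    mappings: dict[int, int] = {}
    for entry in ports_column.split(","):
        # "20035:32284/TCP" -> "20035:32284" (the protocol is not needed here)
        port_pair, _, _protocol = entry.strip().partition("/")
        service_port, _, node_port = port_pair.partition(":")
        mappings[int(service_port)] = int(node_port)
    return mappings


# PORT(S) column copied from the deepflow-server service line in this issue.
PORTS = (
    "20416:32743/TCP,20419:32758/TCP,20417:30417/TCP,20035:32284/TCP,"
    "30035:30036/TCP,20135:30764/TCP,20033:31161/TCP,30033:30033/TCP"
)

mappings = parse_port_mappings(PORTS)
print(mappings[20035])  # 32284
print(mappings[30035])  # 30036
```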
Agent configuration in cluster 2
NodePort (32284) connectivity test
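For readers who want to repeat a similar check, a plain TCP reachability test can be sketched as follows. This is not the reporter's test; `can_connect` is a hypothetical helper, and the host and port you pass are whatever node IP and NodePort apply in your environment.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to (host, port) can be opened.

    Hypothetical helper for illustration. It only completes the TCP
    handshake and sends no application data.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

Using the values from this issue, the call would be `can_connect("10.218.2.43", 32284)`. A successful result only shows that the port accepts TCP connections. It does not show which gRPC services are registered behind it, so an `Unimplemented` error such as the one in the agent log can still occur on a reachable port.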
Agent log
[2024-04-12 14:31:11.512233 +08:00] INFO [src/config/handler.rs:2680] trident_type change from TtUnknown to TtVmPod
[2024-04-12 14:31:11.512260 +08:00] INFO [src/trident.rs:1543] platform monitoring no extra netns
[2024-04-12 14:31:11.512397 +08:00] INFO [src/sender/uniform_sender.rs:242] stats uniform sender id: 1 started
[2024-04-12 14:31:11.512416 +08:00] INFO [src/trident.rs:1559] Start check process...
[2024-04-12 14:31:11.518510 +08:00] INFO [src/platform/platform_synchronizer/linux.rs:562] Platform information changed to version 1712903472
[2024-04-12 14:31:11.518598 +08:00] INFO [src/platform/platform_synchronizer/linux.rs:826] local version changed to 1712903472
[2024-04-12 14:31:11.518679 +08:00] INFO [src/rpc/session.rs:622] rpc IP changed to proxy 10.218.2.43 30035 from controller 10.218.2.43 30036
[2024-04-12 14:31:11.520143 +08:00] ERROR [src/platform/platform_synchronizer/linux.rs:847] send platform information with genesis_sync grpc call failed: status: Unimplemented, message: "unknown service trident.Synchronizer", details: [], metadata: MetadataMap { headers: {"content-type": "application/grpc"} }
[2024-04-12 14:31:11.542116 +08:00] INFO [src/trident.rs:1563] Start check core file...
[2024-04-12 14:31:11.542227 +08:00] WARN [src/utils/environment.rs:282] The core file is configured with pipeline operation, failed to check.
[2024-04-12 14:31:11.542249 +08:00] INFO [src/trident.rs:1566] Start check controller ip...
[2024-04-12 14:31:11.542262 +08:00] INFO [src/trident.rs:1568] Start check free space...
[2024-04-12 14:31:11.542756 +08:00] INFO [src/trident.rs:1592] Agent run with feature-flags: NONE.
[2024-04-12 14:31:11.546995 +08:00] INFO [src/rpc/synchronizer.rs:507] Reset version of acls, groups and platform_data.
[2024-04-12 14:31:11.547063 +08:00] INFO [src/platform/kubernetes/mod.rs:80] kubernetes poller privileges: set_ns=true read_link_ns=true [2024-04-12 14:31:11.547078 +08:00] INFO [src/platform/kubernetes/mod.rs:90] platform monitoring no extra netns [2024-04-12 14:31:11.547489 +08:00] INFO [src/trident.rs:1699] static analyzer ip: '' actual analyzer ip '10.218.2.43' [2024-04-12 14:31:11.550291 +08:00] INFO [src/dispatcher/mod.rs:1224] Afpacket init with Options { frame_size: 65536, block_size: 1048576, num_blocks: 128, add_vlan_header: false, block_timeout: 64000000, poll_timeout: 100000000, version: TpacketVersionHighestavailablet, socket_type: SocketTypeRaw, iface: "" } [2024-04-12 14:31:11.593293 +08:00] INFO [src/dispatcher/base_dispatcher.rs:655] Decap tunnel type change to VXLAN IPIP [2024-04-12 14:31:11.593469 +08:00] INFO [src/dispatcher/base_dispatcher.rs:723] Npb dedup change to true [2024-04-12 14:31:11.593576 +08:00] INFO [src/dispatcher/base_dispatcher.rs:788] Dispatcher(0) Adding VMs: [00:00:00:00:00:00, 6c:92:bf:ca:26:0a, 6c:92:bf:ca:26:0b, 6c:92:bf:cc:40:2d, 6c:92:bf:cc:40:2e, ee:ee:ee:ee:ee:ee, ee:ee:ee:ee:ee:ee, ee:ee:ee:ee:ee:ee, ee:ee:ee:ee:ee:ee, ee:ee:ee:ee:ee:ee, ee:ee:ee:ee:ee:ee, ee:ee:ee:ee:ee:ee, ee:ee:ee:ee:ee:ee, ee:ee:ee:ee:ee:ee, ee:ee:ee:ee:ee:ee, ee:ee:ee:ee:ee:ee, ee:ee:ee:ee:ee:ee, ee:ee:ee:ee:ee:ee, ee:ee:ee:ee:ee:ee, ee:ee:ee:ee:ee:ee, ee:ee:ee:ee:ee:ee, ee:ee:ee:ee:ee:ee, ee:ee:ee:ee:ee:ee, ee:ee:ee:ee:ee:ee, ee:ee:ee:ee:ee:ee, ee:ee:ee:ee:ee:ee, ee:ee:ee:ee:ee:ee, ee:ee:ee:ee:ee:ee, ee:ee:ee:ee:ee:ee, ee:ee:ee:ee:ee:ee, ee:ee:ee:ee:ee:ee, ee:ee:ee:ee:ee:ee, ee:ee:ee:ee:ee:ee, ee:ee:ee:ee:ee:ee, ee:ee:ee:ee:ee:ee, ee:ee:ee:ee:ee:ee, ee:ee:ee:ee:ee:ee, ee:ee:ee:ee:ee:ee, ee:ee:ee:ee:ee:ee, ee:ee:ee:ee:ee:ee, ee:ee:ee:ee:ee:ee, ee:ee:ee:ee:ee:ee, ee:ee:ee:ee:ee:ee, ee:ee:ee:ee:ee:ee, ee:ee:ee:ee:ee:ee, ee:ee:ee:ee:ee:ee, ee:ee:ee:ee:ee:ee, ee:ee:ee:ee:ee:ee, ee:ee:ee:ee:ee:ee, ee:ee:ee:ee:ee:ee, ee:ee:ee:ee:ee:ee, 
ee:ee:ee:ee:ee:ee, ee:ee:ee:ee:ee:ee, ee:ee:ee:ee:ee:ee, ee:ee:ee:ee:ee:ee, ee:ee:ee:ee:ee:ee, ee:ee:ee:ee:ee:ee, ee:ee:ee:ee:ee:ee] [2024-04-12 14:31:11.593705 +08:00] INFO [src/rpc/synchronizer.rs:507] Reset version of acls, groups and platform_data.
Agent Pod IPs: 10.218.52.233|10.218.52.89|10.218.52.60|10.218.52.147
None of the agent Pod IPs appear in the server log.
Node IPs of the nodes running the agents: 10.218.2.49|10.218.2.39|10.218.2.48|10.218.2.38
The cluster 2 node IPs do appear in the server log: server.log
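The log lookup described above can be sketched as a small script. This is a minimal illustration, not the command the reporter ran: `ips_found_in_log` is a hypothetical helper, and the sample log lines are invented for the demo. It escapes the dots and guards the match boundaries so that, for example, 10.218.2.49 does not match inside 10.218.2.490.

```python
import re


def ips_found_in_log(log_text: str, ip_list: str) -> dict[str, bool]:
    """Report which IPs from a "|"-separated list occur in log_text.

    Hypothetical helper for illustration. Each IP is matched literally
    and must not be directly preceded by a digit or dot, or directly
    followed by a digit.
    """
    results: dict[str, bool] = {}
    for ip in ip_list.split("|"):
        pattern = re.compile(r"(?<![\d.])" + re.escape(ip) + r"(?!\d)")
        results[ip] = pattern.search(log_text) is not None
    return results


# Invented sample lines, NOT taken from the real server.log.
sample_log = (
    "sync request from 10.218.2.49\n"
    "sync request from 10.218.2.490\n"
)

node_ips = "10.218.2.49|10.218.2.39|10.218.2.48|10.218.2.38"
pod_ips = "10.218.52.233|10.218.52.89|10.218.52.60|10.218.52.147"

print(ips_found_in_log(sample_log, node_ips))
print(ips_found_in_log(sample_log, pod_ips))
```

With the real server.log read into `log_text`, the situation described in this issue would show up as `True` for the node IPs and `False` for every Pod IP.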