kubeovn / kube-ovn

A Bridge between SDN and Cloud Native (Project under CNCF)
https://kubeovn.github.io/docs/stable/en/
Apache License 2.0
1.89k stars 433 forks

[BUG] Elastic external IP used by the VPC gateway pod is not released after the pod is deleted #4200

Open author970 opened 1 month ago

author970 commented 1 month ago

Kube-OVN Version

v1.12.11

Kubernetes Version

v1.27.6

Operation-system/Kernel Version

Linux node152 5.10.0-136.12.0.86.4.hl202.x86_64 #1 SMP Fri Mar 10 14:42:11 CST 2023 x86_64 x86_64 x86_64 GNU/Linux

Description

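1. Create a vpc-nat-gw resource; the StatefulSet and Pod are created automatically. Running kubectl get ip -A shows that one VPC subnet IP (the LanIP) and one IP from the macvlan-type (elastic external network) subnet are in use. The elastic external IP has no MAC address.
2. Restart the Pod that belongs to the vpc-nat-gw resource. Running kubectl get ip -A again shows that a new IP from the macvlan-type (elastic external network) subnet is in use. Checking the available IPs in the status of the elastic external network subnet resource shows that the elastic external IP from step 1 has not become available again.

[Three screenshots attached in the original issue.]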

Steps To Reproduce

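1. Create a vpc-nat-gw resource; the StatefulSet and Pod are created automatically. Running kubectl get ip -A shows that one VPC subnet IP (the LanIP) and one IP from the macvlan-type (elastic external network) subnet are in use. The elastic external IP has no MAC address.
2. Restart the Pod that belongs to the vpc-nat-gw resource. Running kubectl get ip -A again shows that a new IP from the macvlan-type (elastic external network) subnet is in use. Checking the available IPs in the status of the elastic external network subnet resource shows that the elastic external IP from step 1 has not become available again.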

Current Behavior

After the VPC gateway restarts, the elastic external IP is not released.

Expected Behavior

After the VPC gateway restarts, the elastic external IP is released normally.

jcshare commented 1 month ago

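The same problem occurs on v1.12.17. When the vpc-gw is created, several IPs are allocated in quick succession but are never reclaimed afterwards: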

  I0619 12:07:04.058569       6 ipam.go:60] allocate v4 192.168.1.10, v6 , mac  for kube-system/vpc-nat-gw-gw1-vpc-1-0 from subnet ovn-vpc-external-network
  I0619 12:07:04.071551       6 ipam.go:72] allocating static ip 10.0.1.254 from subnet net1-vpc-1
  E0619 12:07:04.072121       6 pod.go:1762] failed to get static ip 10.0.1.254, mac <nil>, subnet net1-vpc-1, err NoAvailableAddress
  I0619 12:07:04.072830       6 ipam.go:72] allocating static ip 10.0.1.254 from subnet ovn-default
  E0619 12:07:04.073320       6 ipam.go:89] failed to allocate static ip 10.0.1.254 for kube-system/vpc-nat-gw-gw1-vpc-1-0
  E0619 12:07:04.073525       6 pod.go:1762] failed to get static ip 10.0.1.254, mac <nil>, subnet ovn-default, err AddressOutOfRange
  E0619 12:07:04.073788       6 pod.go:620] AddressOutOfRange
  E0619 12:07:04.074250       6 pod.go:405] error syncing 'kube-system/vpc-nat-gw-gw1-vpc-1-0': AddressOutOfRange, requeuing
  I0619 12:07:04.074177       6 event.go:298] Event(v1.ObjectReference{Kind:"Pod", Namespace:"kube-system", Name:"vpc-nat-gw-gw1-vpc-1-0", UID:"d61e58b5-c8f8-4f79-86cc-4e6d8724f475", APIVersion:"v1", ResourceVersion:"2632", FieldPath:""}): type: 'Warning' reason: 'AcquireAddressFailed' AddressOutOfRange
  I0619 12:07:04.080417       6 pod.go:550] handle add/update pod kube-system/vpc-nat-gw-gw1-vpc-1-0
  I0619 12:07:04.083914       6 pod.go:346] enqueue update pod kube-system/vpc-nat-gw-gw1-vpc-1-0
  I0619 12:07:04.086506       6 pod.go:607] sync pod kube-system/vpc-nat-gw-gw1-vpc-1-0 allocated
  I0619 12:07:04.087707       6 ipam.go:60] allocate v4 192.168.1.11, v6 , mac  for kube-system/vpc-nat-gw-gw1-vpc-1-0 from subnet ovn-vpc-external-network
  I0619 12:07:04.097194       6 ipam.go:72] allocating static ip 10.0.1.254 from subnet net1-vpc-1
  E0619 12:07:04.097443       6 pod.go:1762] failed to get static ip 10.0.1.254, mac <nil>, subnet net1-vpc-1, err NoAvailableAddress
  I0619 12:07:04.097556       6 ipam.go:72] allocating static ip 10.0.1.254 from subnet ovn-default
  E0619 12:07:04.097627       6 ipam.go:89] failed to allocate static ip 10.0.1.254 for kube-system/vpc-nat-gw-gw1-vpc-1-0
  E0619 12:07:04.097700       6 pod.go:1762] failed to get static ip 10.0.1.254, mac <nil>, subnet ovn-default, err AddressOutOfRange
  E0619 12:07:04.097851       6 pod.go:620] AddressOutOfRange
  E0619 12:07:04.097949       6 pod.go:405] error syncing 'kube-system/vpc-nat-gw-gw1-vpc-1-0': AddressOutOfRange, requeuing
  I0619 12:07:04.098040       6 pod.go:550] handle add/update pod kube-system/vpc-nat-gw-gw1-vpc-1-0
  I0619 12:07:04.098105       6 event.go:298] Event(v1.ObjectReference{Kind:"Pod", Namespace:"kube-system", Name:"vpc-nat-gw-gw1-vpc-1-0", UID:"d61e58b5-c8f8-4f79-86cc-4e6d8724f475", APIVersion:"v1", ResourceVersion:"2633", FieldPath:""}): type: 'Warning' reason: 'AcquireAddressFailed' AddressOutOfRange
  I0619 12:07:04.101424       6 pod.go:607] sync pod kube-system/vpc-nat-gw-gw1-vpc-1-0 allocated
  I0619 12:07:04.101533       6 ipam.go:60] allocate v4 192.168.1.12, v6 , mac  for kube-system/vpc-nat-gw-gw1-vpc-1-0 from subnet ovn-vpc-external-network
  I0619 12:07:04.107169       6 pod.go:346] enqueue update pod kube-system/vpc-nat-gw-gw1-vpc-1-0
  I0619 12:07:04.107456       6 ipam.go:72] allocating static ip 10.0.1.254 from subnet net1-vpc-1
  E0619 12:07:04.107559       6 pod.go:1762] failed to get static ip 10.0.1.254, mac <nil>, subnet net1-vpc-1, err NoAvailableAddress
  I0619 12:07:04.107574       6 ipam.go:72] allocating static ip 10.0.1.254 from subnet ovn-default
  E0619 12:07:04.107581       6 ipam.go:89] failed to allocate static ip 10.0.1.254 for kube-system/vpc-nat-gw-gw1-vpc-1-0
  E0619 12:07:04.107585       6 pod.go:1762] failed to get static ip 10.0.1.254, mac <nil>, subnet ovn-default, err AddressOutOfRange
  E0619 12:07:04.107645       6 pod.go:620] AddressOutOfRange
  E0619 12:07:04.108065       6 pod.go:405] error syncing 'kube-system/vpc-nat-gw-gw1-vpc-1-0': AddressOutOfRange, requeuing
  I0619 12:07:04.108083       6 pod.go:550] handle add/update pod kube-system/vpc-nat-gw-gw1-vpc-1-0
  I0619 12:07:04.107846       6 event.go:298] Event(v1.ObjectReference{Kind:"Pod", Namespace:"kube-system", Name:"vpc-nat-gw-gw1-vpc-1-0", UID:"d61e58b5-c8f8-4f79-86cc-4e6d8724f475", APIVersion:"v1", ResourceVersion:"2642", FieldPath:""}): type: 'Warning' reason: 'AcquireAddressFailed' AddressOutOfRange
  I0619 12:07:04.110791       6 pod.go:607] sync pod kube-system/vpc-nat-gw-gw1-vpc-1-0 allocated
  I0619 12:07:04.110947       6 ipam.go:60] allocate v4 192.168.1.13, v6 , mac  for kube-system/vpc-nat-gw-gw1-vpc-1-0 from subnet ovn-vpc-external-network
  I0619 12:07:04.116781       6 ipam.go:72] allocating static ip 10.0.1.254 from subnet net1-vpc-1
  E0619 12:07:04.116801       6 pod.go:1762] failed to get static ip 10.0.1.254, mac <nil>, subnet net1-vpc-1, err NoAvailableAddress
  I0619 12:07:04.116809       6 ipam.go:72] allocating static ip 10.0.1.254 from subnet ovn-default
  E0619 12:07:04.116901       6 ipam.go:89] failed to allocate static ip 10.0.1.254 for kube-system/vpc-nat-gw-gw1-vpc-1-0
  E0619 12:07:04.116916       6 pod.go:1762] failed to get static ip 10.0.1.254, mac <nil>, subnet ovn-default, err AddressOutOfRange
  E0619 12:07:04.117040       6 pod.go:620] AddressOutOfRange
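
In the log above, each retry of the same pod takes a fresh address from ovn-vpc-external-network (192.168.1.10 through 192.168.1.13). A minimal sketch for pulling those allocations out of a saved kube-ovn-controller log is shown here. The function name list_allocations is hypothetical, and the parsing assumes the "allocate v4 ... for <namespace>/<pod> from subnet <subnet>" line format seen in this excerpt; other versions may log differently.

```shell
#!/bin/sh
# Hypothetical helper (not part of kube-ovn): read controller log lines on
# stdin and print "<namespace>/<pod> <subnet> <v4 address>" for every
# dynamic allocation line of the form
#   ... ipam.go:60] allocate v4 <ip>, v6 , mac  for <ns>/<pod> from subnet <subnet>
list_allocations() {
  awk '/\] allocate v4 / {
    for (i = 1; i <= NF; i++) {
      if ($i == "v4")     { ip = $(i + 1); sub(/,$/, "", ip) }  # drop trailing comma
      if ($i == "for")    { pod = $(i + 1) }
      if ($i == "subnet") { subnet = $(i + 1) }
    }
    print pod, subnet, ip
  }'
}

# Demo on two lines copied from the log in this issue; only the first is an
# allocation line, so only one row is printed.
list_allocations <<'EOF'
I0619 12:07:04.058569       6 ipam.go:60] allocate v4 192.168.1.10, v6 , mac  for kube-system/vpc-nat-gw-gw1-vpc-1-0 from subnet ovn-vpc-external-network
I0619 12:07:04.071551       6 ipam.go:72] allocating static ip 10.0.1.254 from subnet net1-vpc-1
EOF
```

Piping the output through sort | uniq -c would show how many addresses a single pod received from each subnet; more than one address per pod and subnet is the symptom reported here.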

author970 commented 3 weeks ago

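Workaround: for a VPC gateway pod stuck in the Init state, find the IP resources that were not released (kubectl get ip -A | grep <vpc-nat-gw name>) and manually delete the IPs held by earlier VPC gateway pods (kubectl delete ip xxxx). Then scale the VPC gateway StatefulSet to 0 replicas and, once the pod has been deleted, scale it back to 1.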
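
The workaround above can be sketched as a small dry-run script. This is only an illustration under stated assumptions: the function names, the GW_STS default, and the pod name derived as "<StatefulSet name>-0" are placeholders, not values confirmed by the maintainers. By default it prints the commands instead of running them, and the names of the stale IP resources still have to be picked out by hand from the kubectl get ip -A output.

```shell
#!/bin/sh
# Dry-run sketch of the manual recovery steps. Nothing is executed unless
# RUN=1 is set. NS and GW_STS are placeholders (assumptions); replace them
# with the namespace and StatefulSet name of the affected gateway.
NS="${NS:-kube-system}"
GW_STS="${GW_STS:-vpc-nat-gw-example}"

# Print the command, or execute it when RUN=1.
run() {
  if [ "${RUN:-0}" = "1" ]; then "$@"; else echo "+ $*"; fi
}

# Arguments: names of the stale IP resources chosen by the operator.
recover_gateway() {
  # 1. List IP resources so the stale gateway entries can be reviewed.
  run kubectl get ip -A
  # 2. Delete each stale IP resource passed on the command line.
  for ip_name in "$@"; do
    run kubectl delete ip "$ip_name"
  done
  # 3. Scale the gateway down, wait for its pod to disappear, scale back up.
  run kubectl scale statefulset "$GW_STS" -n "$NS" --replicas=0
  run kubectl wait --for=delete "pod/${GW_STS}-0" -n "$NS" --timeout=120s
  run kubectl scale statefulset "$GW_STS" -n "$NS" --replicas=1
}
```

Calling recover_gateway with one or more IP resource names prints the planned commands for review; setting RUN=1 first executes them against the current kubectl context.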

zhangzujian commented 3 weeks ago

@author970 Could you please try the latest v1.12 version? The image tags are v1.12.19-x86 and v1.12.19-arm.

author970 commented 3 weeks ago

> @author970 Could you please try the latest v1.12 version? The image tags are v1.12.19-x86 and v1.12.19-arm.

OK. Once we upgrade kube-ovn to v1.12.19 or a later version, we will verify this and report back with the results.