smartxworks / cluster-api-provider-elf-static-ip


SKS-1243: Save MachineStaticIPFinalizer first and then allocate IP #26

Closed haijianyang closed 1 year ago

haijianyang commented 1 year ago

SKS-1243

When a cluster is deleted, its IPs are occasionally not released.

As the screenshot shows, the last two log lines indicate that once CAPE removed MachineFinalizer, the ElfMachine immediately returned 404, so the IP release logic was never executed.

image

Possible causes:

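  1. Devices was modified, so the ElfMachine was judged not to need its IP released.

    func (r *ElfMachineReconciler) reconcileDelete(ctx *context.MachineContext) (reconcile.Result, error) {
        // No device uses a static IP: drop the finalizer without releasing anything.
        if !ipamutil.HasStaticIPDevice(ctx.ElfMachine.Spec.Network.Devices) {
            ctrlutil.RemoveFinalizer(ctx.ElfMachine, MachineStaticIPFinalizer)
            return ctrl.Result{}, nil
        }
        // ... (the rest of the function is not shown in this excerpt)
    }
  2. MachineStaticIPFinalizer had already been removed, so the ElfMachine was deleted as soon as MachineFinalizer was removed.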
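The second cause follows from how finalizers gate deletion: an object marked for deletion disappears as soon as its finalizer list becomes empty. The sketch below is a toy model of that behavior, not the provider's code. The `apiServer` and `object` types, the helper `survivesMachineFinalizerRemoval`, and both finalizer string values are hypothetical placeholders.

```go
package main

import "fmt"

// Placeholder finalizer names; the real constant values are not shown in this PR.
const (
	machineFinalizer  = "example.com/machine"
	staticIPFinalizer = "example.com/static-ip"
)

// object is a hypothetical stand-in for an ElfMachine stored in the API server.
type object struct {
	finalizers []string
	deleting   bool // stands in for a non-nil deletionTimestamp
}

// apiServer is a toy store that mimics finalizer semantics: an object that is
// marked for deletion is removed once its finalizer list is empty.
type apiServer struct {
	objects map[string]*object
}

// gc removes the named object if it is being deleted and has no finalizers left.
func (s *apiServer) gc(name string) {
	if o, ok := s.objects[name]; ok && o.deleting && len(o.finalizers) == 0 {
		delete(s.objects, name)
	}
}

// requestDelete marks the object for deletion.
func (s *apiServer) requestDelete(name string) {
	if o, ok := s.objects[name]; ok {
		o.deleting = true
		s.gc(name)
	}
}

// removeFinalizer drops one finalizer and lets gc decide whether the object goes away.
func (s *apiServer) removeFinalizer(name, f string) {
	o, ok := s.objects[name]
	if !ok {
		return
	}
	kept := make([]string, 0, len(o.finalizers))
	for _, x := range o.finalizers {
		if x != f {
			kept = append(kept, x)
		}
	}
	o.finalizers = kept
	s.gc(name)
}

// survivesMachineFinalizerRemoval reports whether the object still exists after
// deletion is requested and the machine finalizer is removed.
func survivesMachineFinalizerRemoval(finalizers []string) bool {
	s := &apiServer{objects: map[string]*object{
		"vm-1": &object{finalizers: append([]string(nil), finalizers...)},
	}}
	s.requestDelete("vm-1")
	s.removeFinalizer("vm-1", machineFinalizer)
	_, ok := s.objects["vm-1"]
	return ok
}

func main() {
	// Both finalizers present: the object survives, so the IP can still be released.
	fmt.Println(survivesMachineFinalizerRemoval([]string{machineFinalizer, staticIPFinalizer})) // true
	// Static IP finalizer missing: the object is gone (404) before any release can run.
	fmt.Println(survivesMachineFinalizerRemoval([]string{machineFinalizer})) // false
}
```

In the second call the object vanishes immediately, which would match the 404 seen in the log if this cause is the real one.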

The exact cause cannot be fully confirmed yet. Based on the analysis above, more log messages have been added to the IP release logic.

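In addition, MachineStaticIPFinalizer is now set and saved before an IP is allocated. This closes the window in which the Machine is deleted after an IP has been allocated but before MachineStaticIPFinalizer has been saved; in that window, the missing MachineStaticIPFinalizer may cause the IP release logic to be skipped.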
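The ordering described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code merged in this PR: the `machine` and `pool` types, the `save` callback (a stand-in for persisting the object to the API server), and the finalizer string are all hypothetical. Whether the real controller continues in the same reconcile pass after saving, as this sketch does, is also an assumption.

```go
package main

import (
	"errors"
	"fmt"
)

// Placeholder value; the real MachineStaticIPFinalizer value is not shown in this PR.
const staticIPFinalizer = "example.com/static-ip"

var (
	errSaveFailed = errors.New("save failed")
	errNoFreeIP   = errors.New("no free IP in pool")
)

// machine is a hypothetical stand-in for ElfMachine.
type machine struct {
	finalizers []string
	ip         string
}

// pool is a toy IP pool that tracks which addresses are handed out.
type pool struct {
	free      []string
	allocated map[string]bool
}

func hasFinalizer(m *machine, f string) bool {
	for _, x := range m.finalizers {
		if x == f {
			return true
		}
	}
	return false
}

// reconcileNormal persists the finalizer first and allocates an IP only after
// the save succeeded, so an allocated IP is always covered by the finalizer.
func reconcileNormal(m *machine, p *pool, save func(*machine) error) error {
	if !hasFinalizer(m, staticIPFinalizer) {
		m.finalizers = append(m.finalizers, staticIPFinalizer)
		if err := save(m); err != nil {
			// Undo the in-memory change and stop before touching the pool.
			m.finalizers = m.finalizers[:len(m.finalizers)-1]
			return err
		}
	}
	if m.ip == "" {
		if len(p.free) == 0 {
			return errNoFreeIP
		}
		m.ip, p.free = p.free[0], p.free[1:]
		p.allocated[m.ip] = true
	}
	return nil
}

// reconcileDelete releases the IP and then removes the finalizer.
func reconcileDelete(m *machine, p *pool) {
	if !hasFinalizer(m, staticIPFinalizer) {
		return
	}
	if p.allocated[m.ip] {
		delete(p.allocated, m.ip)
		p.free = append(p.free, m.ip)
	}
	m.ip = ""
	kept := make([]string, 0, len(m.finalizers))
	for _, x := range m.finalizers {
		if x != staticIPFinalizer {
			kept = append(kept, x)
		}
	}
	m.finalizers = kept
}

func main() {
	p := &pool{free: []string{"10.0.0.10"}, allocated: map[string]bool{}}
	m := &machine{}

	// A failed save must leave the pool untouched.
	err := reconcileNormal(m, p, func(*machine) error { return errSaveFailed })
	fmt.Println("save failed:", err, "| allocated IPs:", len(p.allocated))

	// A successful save is followed by allocation.
	if err := reconcileNormal(m, p, func(*machine) error { return nil }); err != nil {
		fmt.Println("unexpected error:", err)
		return
	}
	fmt.Println("machine IP:", m.ip)

	reconcileDelete(m, p)
	fmt.Println("free IPs after delete:", p.free)
}
```

The point of the ordering is that no state exists in which an IP is held but the finalizer is not yet persisted; the reverse state (finalizer saved, no IP yet) is harmless because releasing an unallocated IP is a no-op in this sketch.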

Testing

Created clusters with an IP Pool several times and deleted each one at an arbitrary point in time; the IPs were released in every case.

codecov[bot] commented 1 year ago

Codecov Report

Merging #26 (91c3cfd) into main (339642c) will increase coverage by 1.81%. The diff coverage is 100.00%.

@@            Coverage Diff             @@
##             main      #26      +/-   ##
==========================================
+ Coverage   64.28%   66.10%   +1.81%     
==========================================
  Files           2        2              
  Lines         280      295      +15     
==========================================
+ Hits          180      195      +15     
  Misses         85       85              
  Partials       15       15              
| Impacted Files | Coverage Δ |
|---|---|
| controllers/elfmachine_controller.go | 78.44% <100.00%> (+1.48%) :arrow_up: |
