alibaba / open-local

cloud-native local storage management system for stateful workload, low-latency with simplicity
Apache License 2.0

[bug] NodeUnpublishVolume returns an error when the target path exists #118

Open TheBeatles1994 opened 2 years ago

TheBeatles1994 commented 2 years ago

This function has a significant problem here. For reference, see the Kubernetes source code:

func (c *csiMountMgr) TearDownAt(dir string) error {
    klog.V(4).Infof(log("Unmounter.TearDown(%s)", dir))

    volID := c.volumeID
    csi, err := c.csiClientGetter.Get()
    if err != nil {
        return errors.New(log("mounter.SetUpAt failed to get CSI client: %v", err))
    }

    ctx, cancel := context.WithTimeout(context.Background(), csiTimeout)
    defer cancel()

    if err := csi.NodeUnpublishVolume(ctx, volID, dir); err != nil {
        return errors.New(log("mounter.TearDownAt failed: %v", err))
    }

    // clean mount point dir
    if err := removeMountDir(c.plugin, dir); err != nil {
        return errors.New(log("mounter.TearDownAt failed to clean mount dir [%s]: %v", dir, err))
    }
    klog.V(4).Infof(log("mounter.TearDownAt successfully unmounted dir [%s]", dir))

    return nil
}

If NodeUnpublishVolume fails, removeMountDir is never executed.

TheBeatles1994 commented 2 years ago

Looking at other CSI implementations in the community, operations on the same volumeID need to be serialized with a lock. See pmem-csi for reference.

TheBeatles1994 commented 2 years ago

Could the two mounters in nodeServer be merged into one?

TheBeatles1994 commented 2 years ago

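NodeUnpublishVolume is not rigorous when deleting the LV of an ephemeral volume. If the umount succeeds but the ephemeral volume's LV is not deleted, the LV can never be deleted afterward. We need a mechanism that guarantees the deletion eventually succeeds.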
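One possible shape for such a mechanism, sketched under assumptions: the LV removal step runs on every call regardless of mount state, returns an error on failure so that the caller retries, and treats "LV not found" as success. The `lvManager` interface, `errLVNotFound`, and `cleanupEphemeral` are hypothetical names, not open-local's API.

```go
package main

import (
	"errors"
	"fmt"
)

// errLVNotFound is a hypothetical sentinel for "the LV is already gone".
var errLVNotFound = errors.New("logical volume not found")

// lvManager is a hypothetical abstraction over the LVM commands.
type lvManager interface {
	Remove(vg, name string) error
}

// cleanupEphemeral unmounts and then removes the LV. The unmount func is
// assumed to be idempotent (nil when nothing is mounted), so a retry after
// a failed LV removal reaches the removal step again.
func cleanupEphemeral(unmount func() error, lvm lvManager, vg, name string) error {
	if err := unmount(); err != nil {
		return fmt.Errorf("unmount: %w", err)
	}
	if err := lvm.Remove(vg, name); err != nil {
		if errors.Is(err, errLVNotFound) {
			return nil // already deleted by an earlier attempt
		}
		// Surface the error so the caller retries the whole operation.
		return fmt.Errorf("remove lv %s/%s: %w", vg, name, err)
	}
	return nil
}

// fakeLVM fails a configurable number of times before removing the LV.
type fakeLVM struct {
	exists   bool
	failures int
}

func (f *fakeLVM) Remove(vg, name string) error {
	if !f.exists {
		return errLVNotFound
	}
	if f.failures > 0 {
		f.failures--
		return errors.New("device busy")
	}
	f.exists = false
	return nil
}

func main() {
	lvm := &fakeLVM{exists: true, failures: 1}
	noopUnmount := func() error { return nil }

	fmt.Println(cleanupEphemeral(noopUnmount, lvm, "vg0", "lv-tmp")) // remove lv vg0/lv-tmp: device busy
	fmt.Println(cleanupEphemeral(noopUnmount, lvm, "vg0", "lv-tmp")) // <nil>
	fmt.Println(cleanupEphemeral(noopUnmount, lvm, "vg0", "lv-tmp")) // <nil>
}
```

This only converges if something keeps calling NodeUnpublishVolume until it succeeds. If the path can be considered unpublished before the LV is gone, a separate periodic cleanup of orphaned ephemeral LVs would still be needed.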