kubevirt / device-plugin-manager

Incubating: A framework for writing Kubernetes Device Plugins
MIT License

error from fsnotify.NewWatcher() not checked, causing panic #29

blumamir commented 4 months ago

What happened: I got a panic in my application when using this code:

    manager := dpm.NewManager(lister)
    manager.Run()

StackTrace:

    panic: runtime error: invalid memory address or nil pointer dereference
        panic: runtime error: invalid memory address or nil pointer dereference
    [signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x1108918]

    goroutine 49 [running]:
    github.com/fsnotify/fsnotify.(*Watcher).Close(0x4000d039d8?)
        /go/pkg/mod/github.com/fsnotify/fsnotify@v1.7.0/backend_inotify.go:305 +0x18
    panic({0x148e3c0?, 0x281dee0?})
        /usr/local/go/src/runtime/panic.go:914 +0x218
    github.com/fsnotify/fsnotify.(*Watcher).isClosed(...)
        /go/pkg/mod/github.com/fsnotify/fsnotify@v1.7.0/backend_inotify.go:296
    github.com/fsnotify/fsnotify.(*Watcher).AddWith(0x0, {0x1739526, 0x20}, {0x0, 0x0, 0x14d2840?})
        /go/pkg/mod/github.com/fsnotify/fsnotify@v1.7.0/backend_inotify.go:372 +0x2c
    github.com/fsnotify/fsnotify.(*Watcher).Add(...)
        /go/pkg/mod/github.com/fsnotify/fsnotify@v1.7.0/backend_inotify.go:362
    github.com/kubevirt/device-plugin-manager/pkg/dpm.(*Manager).Run(0x4000b9e1f0)
        /go/pkg/mod/github.com/kubevirt/device-plugin-manager@v1.19.5/pkg/dpm/manager.go:55 +0x200
    main.startDeviceManager(0x0?)
        /go/src/github.com/keyval-dev/odigos/odiglet/cmd/main.go:110 +0x48c
    created by main.main in goroutine 1
        /go/src/github.com/keyval-dev/odigos/odiglet/cmd/main.go:50 +0x2bc

After investigating, I found that the Manager's Run() function attempts to create a new watcher but ignores the error returned by this call:

    fsWatcher, _ := fsnotify.NewWatcher()     // error is silently discarded
    defer fsWatcher.Close()                   // panics during unwind if fsWatcher is nil
    fsWatcher.Add(pluginapi.DevicePluginPath) // panics here if fsWatcher is nil; Add's error is also ignored

In my case the call fails, so fsWatcher is nil, which causes the panic shown above.
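
As a temporary workaround on the caller side, something like this sketch can fail fast instead of panicking (it assumes the same lister as in the snippet above, plus imports for log, dpm, and fsnotify):

    // Probe fsnotify before handing control to dpm, so an exhausted
    // inotify limit surfaces as a clean error rather than a nil-pointer
    // panic inside Run().
    probe, err := fsnotify.NewWatcher()
    if err != nil {
        log.Fatalf("cannot create fsnotify watcher (inotify limit reached?): %v", err)
    }
    probe.Close()

    manager := dpm.NewManager(lister)
    manager.Run()

Note this only narrows the window: Run() creates its own watcher afterwards, so the probe cannot fully guarantee success.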

How to reproduce it (as minimally and precisely as possible):

To reproduce, open a terminal on the k8s node and lower the per-user inotify instance limit:

    sudo sysctl -w fs.inotify.max_user_instances=1

Then run a manager as in the snippet above; it should panic with the stack trace shown.
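
For completeness, here is a small standalone program (my own sketch, not code from this repository) that demonstrates the underlying failure: once the per-user inotify limit is exhausted, fsnotify.NewWatcher() returns a nil watcher and a non-nil error, which is exactly the value Run() goes on to dereference:

    package main

    import (
        "fmt"

        "github.com/fsnotify/fsnotify"
    )

    func main() {
        // Create watchers until the per-user inotify instance limit is
        // hit; with fs.inotify.max_user_instances=1 this fails almost
        // immediately.
        var watchers []*fsnotify.Watcher
        for i := 0; ; i++ {
            w, err := fsnotify.NewWatcher()
            if err != nil {
                fmt.Printf("NewWatcher failed after %d instances: %v\n", i, err)
                break
            }
            watchers = append(watchers, w)
        }
        for _, w := range watchers {
            w.Close()
        }
    }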

Additional context: I think the Run() function should return an error when fsnotify.NewWatcher() fails.
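
A minimal sketch of what the checked version could look like inside Run(), assuming the signature changes to func (m *Manager) Run() error (the fmt-based error wrapping is my suggestion, not the project's current style):

    fsWatcher, err := fsnotify.NewWatcher()
    if err != nil {
        return fmt.Errorf("failed to create fsnotify watcher: %w", err)
    }
    defer fsWatcher.Close()
    // Checking Add's error as well covers the case where the kubelet
    // device-plugin directory does not exist.
    if err := fsWatcher.Add(pluginapi.DevicePluginPath); err != nil {
        return fmt.Errorf("failed to watch %s: %w", pluginapi.DevicePluginPath, err)
    }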

Environment:

kubevirt-bot commented 1 month ago

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

/lifecycle stale

kubevirt-bot commented 1 week ago

Stale issues rot after 30d of inactivity. Mark the issue as fresh with /remove-lifecycle rotten. Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

/lifecycle rotten