flux-framework / flux-k8s

Project to manage Flux tasks needed to standardize Kubernetes HPC scheduling interfaces
Apache License 2.0

Segmentation Fault in ReapiCliInit #7

Closed: cmisale closed this issue 2 years ago

cmisale commented 3 years ago

A segmentation fault occurs intermittently during initialization, inside the cgo call to reapi_cli_initialize made by ReapiCliInit:

[signal SIGSEGV: segmentation violation code=0x2 addr=0x499da8 pc=0x19d1768]

runtime stack:
runtime.throw(0x1ee0b88, 0x2a)
        /usr/local/go/src/runtime/panic.go:1116 +0x72
runtime.sigpanic()
        /usr/local/go/src/runtime/signal_unix.go:704 +0x4ac

goroutine 1 [syscall]:
runtime.cgocall(0x19cb4f3, 0xc000712da0, 0x1896f)
        /usr/local/go/src/runtime/cgocall.go:133 +0x5b fp=0xc000712d70 sp=0xc000712d38 pc=0x460e3b
fluxcli._Cfunc_reapi_cli_initialize(0x7f7a64000cf0, 0x7f7a2c000b60, 0x0)
        _cgo_gotypes.go:161 +0x4d fp=0xc000712da0 sp=0xc000712d70 pc=0x19b636d
fluxcli.ReapiCliInit.func1(0x7f7a64000cf0, 0xc000d34000, 0x1896f, 0xc000d34000)
        /go/src/sigs.k8s.io/scheduler-plugins/vendor/fluxcli/reapi_cli.go:35 +0x7d fp=0xc000712dd8 sp=0xc000712da0 pc=0x19b6abd
fluxcli.ReapiCliInit(0x7f7a64000cf0, 0xc000d34000, 0x1896f, 0xc000d34000)
        /go/src/sigs.k8s.io/scheduler-plugins/vendor/fluxcli/reapi_cli.go:35 +0x3f fp=0xc000712e08 sp=0xc000712dd8 pc=0x19b66ff
sigs.k8s.io/scheduler-plugins/pkg/kubeflux.New(0x0, 0x0, 0x2149000, 0xc0001e9520, 0x0, 0x0, 0x0, 0x0)
        /go/src/sigs.k8s.io/scheduler-plugins/pkg/kubeflux/kubeflux.go:138 +0x18a fp=0xc000712ee8 sp=0xc000712e08 pc=0x19bd5ca
k8s.io/kubernetes/pkg/scheduler/framework/runtime.NewFramework(0xc000db8150, 0xc0004b26c0, 0x0, 0x0, 0x0, 0xc000dfed80, 0x8, 0xc, 0x215a100, 0x817ab4b8b4242ccd, ...)
        /go/src/sigs.k8s.io/scheduler-plugins/vendor/k8s.io/kubernetes/pkg/scheduler/framework/runtime/framework.go:297 +0x866 fp=0xc000713368 sp=0xc000712ee8 pc=0x18a9f06
k8s.io/kubernetes/pkg/scheduler/profile.newProfile(0xc0003216e0, 0x11, 0xc0004b26c0, 0x0, 0x0, 0x0, 0xc000db8150, 0xc00031a330, 0xc0007137c0, 0x6, ...)
        /go/src/sigs.k8s.io/scheduler-plugins/vendor/k8s.io/kubernetes/pkg/scheduler/profile/profile.go:41 +0x12c fp=0xc0007133e8 sp=0xc000713368 pc=0x192c26c
k8s.io/kubernetes/pkg/scheduler/profile.NewMap(0xc000435620, 0x2, 0x2, 0xc000db8150, 0xc00031a330, 0xc0007137c0, 0x6, 0x6, 0xb2b7905169e345a7, 0xc000600c00, ...)
        /go/src/sigs.k8s.io/scheduler-plugins/vendor/k8s.io/kubernetes/pkg/scheduler/profile/profile.go:61 +0x1b5 fp=0xc000713550 sp=0xc0007133e8 pc=0x192c515
k8s.io/kubernetes/pkg/scheduler.(*Configurator).create(0xc000713aa0, 0xc000301920, 0x1eb1481, 0xf)
        /go/src/sigs.k8s.io/scheduler-plugins/vendor/k8s.io/kubernetes/pkg/scheduler/factory.go:135 +0xc0c fp=0xc000713800 sp=0xc000713550 pc=0x199f36c
k8s.io/kubernetes/pkg/scheduler.(*Configurator).createFromProvider(0xc000fa7aa0, 0x1eb1481, 0xf, 0xc0003018c0, 0x2, 0x2)
        /go/src/sigs.k8s.io/scheduler-plugins/vendor/k8s.io/kubernetes/pkg/scheduler/factory.go:201 +0x23d fp=0xc0007138d8 sp=0xc000713800 pc=0x199febd
k8s.io/kubernetes/pkg/scheduler.New(0x215df80, 0xc000158f20, 0x2159100, 0xc0006d61e0, 0xc00031a330, 0xc000150b40, 0xc000fa7c30, 0x9, 0x9, 0x1c9fce0, ...)
        /go/src/sigs.k8s.io/scheduler-plugins/vendor/k8s.io/kubernetes/pkg/scheduler/scheduler.go:237 +0x785 fp=0xc000713b50 sp=0xc0007138d8 pc=0x19a2405
k8s.io/kubernetes/cmd/kube-scheduler/app.Setup(0x2135400, 0xc000a877c0, 0xc000137ba0, 0xc000114e88, 0x1, 0x1, 0x47, 0x47, 0x47, 0x46)
        /go/src/sigs.k8s.io/scheduler-plugins/vendor/k8s.io/kubernetes/cmd/kube-scheduler/app/server.go:322 +0x50f fp=0xc000713c88 sp=0xc000713b50 pc=0x19b536f
k8s.io/kubernetes/cmd/kube-scheduler/app.runCommand(0xc0004238c0, 0xc000137ba0, 0xc000114e88, 0x1, 0x1, 0x0, 0x0)
[...]
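
For anyone triaging a similar crash, the first line of the trace above already carries useful information: the signal name, the si_code, the faulting address, and the program counter. The following is a minimal sketch in Go of how that line could be decoded. The names ParseSignalLine and DescribeSegvCode are hypothetical helpers written for this illustration and are not part of flux-k8s or fluxcli. The si_code mapping assumes Linux, where 1 is SEGV_MAPERR and 2 is SEGV_ACCERR. The sketch only decodes the header line and does not identify the root cause.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// SignalInfo holds the fields the Go runtime prints in the
// "[signal ...]" header line of a crash trace.
type SignalInfo struct {
	Name string // e.g. "SIGSEGV"
	Code uint64 // si_code reported by the kernel
	Addr uint64 // faulting address
	PC   uint64 // program counter at the time of the fault
}

// signalLine matches lines such as:
// [signal SIGSEGV: segmentation violation code=0x2 addr=0x499da8 pc=0x19d1768]
var signalLine = regexp.MustCompile(
	`\[signal (\w+): .*? code=(0x[0-9a-f]+) addr=(0x[0-9a-f]+) pc=(0x[0-9a-f]+)\]`)

// ParseSignalLine (hypothetical helper) extracts the signal name,
// si_code, faulting address, and pc from a trace header line.
func ParseSignalLine(line string) (SignalInfo, error) {
	m := signalLine.FindStringSubmatch(line)
	if m == nil {
		return SignalInfo{}, fmt.Errorf("not a signal header line: %q", line)
	}
	var vals [3]uint64
	for i, s := range m[2:] {
		// Base 0 lets ParseUint honor the "0x" prefix.
		v, err := strconv.ParseUint(s, 0, 64)
		if err != nil {
			return SignalInfo{}, err
		}
		vals[i] = v
	}
	return SignalInfo{Name: m[1], Code: vals[0], Addr: vals[1], PC: vals[2]}, nil
}

// DescribeSegvCode (hypothetical helper) names the SIGSEGV si_code.
// Assumption: Linux values, where 1 is SEGV_MAPERR and 2 is SEGV_ACCERR.
func DescribeSegvCode(code uint64) string {
	switch code {
	case 1:
		return "SEGV_MAPERR (address not mapped)"
	case 2:
		return "SEGV_ACCERR (invalid permissions for mapped address)"
	default:
		return "unknown si_code"
	}
}

func main() {
	line := "[signal SIGSEGV: segmentation violation code=0x2 addr=0x499da8 pc=0x19d1768]"
	info, err := ParseSignalLine(line)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Printf("%s: %s, addr=%#x, pc=%#x\n",
		info.Name, DescribeSegvCode(info.Code), info.Addr, info.PC)
}
```

Under the Linux assumption, code=0x2 indicates that the faulting address was mapped but accessed with the wrong permissions, rather than a dereference of an unmapped or nil pointer. That reading narrows the search but does not by itself explain why the fault is intermittent.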

References issue here

milroy commented 2 years ago

Fixed in https://github.com/flux-framework/flux-sched/pull/888