networkservicemesh / nsm-operator

A kubernetes operator for deploying and managing Network Service Meshes
Apache License 2.0

nsm-operator pod restart caused by segmentation violation #51

Closed szvincze closed 2 years ago

szvincze commented 2 years ago

Describe the bug

Sometimes the nsm-operator pod restarts when deploying the config/samples/nsm_v1alpha1_nsm-registry-k8s.yaml example. The restart is caused by a panic from a segmentation violation (see the attached log for more details):

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x18 pc=0x12c6dae]

To Reproduce

I observed the fault while developing an integration test case, where it occurred only rarely. However, it can be reproduced by deploying and deleting the above-mentioned example several times in a row with the following commands:

kubectl apply -f config/samples/nsm_v1alpha1_nsm-registry-k8s.yaml
kubectl delete -n nsm nsm nsm-sample-registry-k8s
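Because the fault only shows up after several rounds, the cycle is easier to run from a small driver. The following is a minimal sketch in Go, not part of the operator: `reproCommands` is a hypothetical helper, and the count of 10 rounds is an assumption, since the report only says "several times". It runs the two commands above verbatim and falls back to printing them when `kubectl` is not on the PATH.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"strings"
)

// reproCommands (hypothetical helper) returns the apply/delete pair from the
// report, repeated the given number of rounds.
func reproCommands(rounds int) [][]string {
	var cmds [][]string
	for i := 0; i < rounds; i++ {
		cmds = append(cmds,
			[]string{"kubectl", "apply", "-f", "config/samples/nsm_v1alpha1_nsm-registry-k8s.yaml"},
			[]string{"kubectl", "delete", "-n", "nsm", "nsm", "nsm-sample-registry-k8s"},
		)
	}
	return cmds
}

func main() {
	// 10 rounds is an assumption; the report does not give a number.
	cmds := reproCommands(10)

	// Without kubectl available, only print what would be executed.
	if _, err := exec.LookPath("kubectl"); err != nil {
		for _, args := range cmds {
			fmt.Println(strings.Join(args, " "))
		}
		return
	}

	for _, args := range cmds {
		cmd := exec.Command(args[0], args[1:]...)
		cmd.Stdout, cmd.Stderr = os.Stdout, os.Stderr
		if err := cmd.Run(); err != nil {
			fmt.Fprintln(os.Stderr, "command failed:", err)
			os.Exit(1)
		}
	}
}
```

While the loop runs, the operator pod's restart count can be watched in a second terminal to see whether the panic was hit.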

Expected behavior

No pod restart.

Screenshots, CLI output and logs

nsm-operator-pod-restart.log

szvincze commented 2 years ago

Here is the proposed fix for the issue.

I reproduced the situation with the method described in the comment above. With the fix, the log now contains the proper information and the pod no longer restarts:

INFO    controller.nsm  Failed to update status {"reconciler group": "nsm.networkservicemesh.io", "reconciler kind": "NSM", "name": "nsm-sample-registry-k8s", "namespace": "nsm", "nsm": "nsm/nsm-sample-registry-k8s", "Error": "Operation cannot be fulfilled on nsms.nsm.networkservicemesh.io \"nsm-sample-registry-k8s\": the object has been modified; please apply your changes to the latest version and try again"}

szvincze commented 2 years ago

The fix was merged via PR #53.