devtron-labs / devtron

The only Kubernetes dashboard you need
https://devtron.ai
Apache License 2.0
4.49k stars, 485 forks

Bug: Unhandled panic in orchestrator #5524

Closed: Neha130 closed this issue 4 months ago

Neha130 commented 4 months ago

### 📜 Description

The orchestrator is restarting because of an unhandled panic. The relevant logs are attached below:


9829-{"date":"2024-07-11T03:13:17.547653Z","time":"2024-07-11T03:13:17.547653259Z","stream":"stderr","_p":"F","log":"{\"level\":\"error\",\"ts\":1720667597.5474544,\"caller\":\"pubsub-lib/PubSubClientService.go:165\",\"msg\":\"error while subscribing to nats \",\"stream\":\"GIT-SENSOR\",\"topic\":\"NEW-CI-MATERIAL\",\"error\":\"nats: consumer is offline\",\"stacktrace\":\"github.com/devtron-labs/common-lib/pubsub-lib.PubSubClientServiceImpl.Subscribe\\n\\t/go/src/github.com/devtron-labs/devtron/vendor/github.com/devtron-labs/common-lib/pubsub-lib/PubSubClientService.go:165\\ngithub.com/devtron-labs/devtron/pkg/eventProcessor/in.(*CIPipelineEventProcessorImpl).SubscribeNewCIMaterialEvent\\n\\t/go/src/github.com/devtron-labs/devtron/pkg/eventProcessor/in/CIPipelineEventProcessorService.go:59\\ngithub.com/devtron-labs/devtron/pkg/eventProcessor.(*CentralEventProcessor).SubscribeAll\\n\\t/go/src/github.com/devtron-labs/devtron/pkg/eventProcessor/CentralEventProcessorService.go:50\\ngithub.com/devtron-labs/devtron/pkg/eventProcessor.NewCentralEventProcessor\\n\\t/go/src/github.com/devtron-labs/devtron/pkg/eventProcessor/CentralEventProcessorService.go:39\\nmain.InitializeApp\\n\\t/go/src/github.com/devtron-labs/devtron/wire_gen.go:1158\\nmain.main\\n\\t/go/src/github.com/devtron-labs/devtron/main.go:28\\nruntime.main\\n\\t/usr/local/go/src/runtime/proc.go:267\"}"}
11206-{"date":"2024-07-11T03:13:17.547681Z","time":"2024-07-11T03:13:17.547681589Z","stream":"stderr","_p":"F","log":"{\"level\":\"error\",\"ts\":1720667597.5475883,\"caller\":\"in/CIPipelineEventProcessorService.go:61\",\"msg\":\"errnats: consumer is offline\",\"stacktrace\":\"github.com/devtron-labs/devtron/pkg/eventProcessor/in.(*CIPipelineEventProcessorImpl).SubscribeNewCIMaterialEvent\\n\\t/go/src/github.com/devtron-labs/devtron/pkg/eventProcessor/in/CIPipelineEventProcessorService.go:61\\ngithub.com/devtron-labs/devtron/pkg/eventProcessor.(*CentralEventProcessor).SubscribeAll\\n\\t/go/src/github.com/devtron-labs/devtron/pkg/eventProcessor/CentralEventProcessorService.go:50\\ngithub.com/devtron-labs/devtron/pkg/eventProcessor.NewCentralEventProcessor\\n\\t/go/src/github.com/devtron-labs/devtron/pkg/eventProcessor/CentralEventProcessorService.go:39\\nmain.InitializeApp\\n\\t/go/src/github.com/devtron-labs/devtron/wire_gen.go:1158\\nmain.main\\n\\t/go/src/github.com/devtron-labs/devtron/main.go:28\\nruntime.main\\n\\t/usr/local/go/src/runtime/proc.go:267\"}"}
12279-{"date":"2024-07-11T03:13:17.547774Z","time":"2024-07-11T03:13:17.547774301Z","stream":"stderr","_p":"F","log":"{\"level\":\"error\",\"ts\":1720667597.5476637,\"caller\":\"eventProcessor/CentralEventProcessorService.go:52\",\"msg\":\"error, SubscribeNewCIMaterialEvent\",\"err\":\"nats: consumer is offline\",\"stacktrace\":\"github.com/devtron-labs/devtron/pkg/eventProcessor.(*CentralEventProcessor).SubscribeAll\\n\\t/go/src/github.com/devtron-labs/devtron/pkg/eventProcessor/CentralEventProcessorService.go:52\\ngithub.com/devtron-labs/devtron/pkg/eventProcessor.NewCentralEventProcessor\\n\\t/go/src/github.com/devtron-labs/devtron/pkg/eventProcessor/CentralEventProcessorService.go:39\\nmain.InitializeApp\\n\\t/go/src/github.com/devtron-labs/devtron/wire_gen.go:1158\\nmain.main\\n\\t/go/src/github.com/devtron-labs/devtron/main.go:28\\nruntime.main\\n\\t/usr/local/go/src/runtime/proc.go:267\"}"}
13184-{"date":"2024-07-11T03:13:17.547779Z","time":"2024-07-11T03:13:17.547779552Z","stream":"stderr","_p":"F","log":"2024/07/11 03:13:17 nats: consumer is offline"}
13344:{"date":"2024-07-11T03:13:17.549921Z","time":"2024-07-11T03:13:17.549921917Z","stream":"stderr","_p":"F","log":"panic: nats: consumer is offline"}
13491-{"date":"2024-07-11T03:13:17.549925Z","time":"2024-07-11T03:13:17.549925828Z","stream":"stderr","_p":"F","log":""}
13606-{"date":"2024-07-11T03:13:17.549928Z","time":"2024-07-11T03:13:17.549928178Z","stream":"stderr","_p":"F","log":"goroutine 1 [running]:"}
13743:{"date":"2024-07-11T03:13:17.549930Z","time":"2024-07-11T03:13:17.549930028Z","stream":"stderr","_p":"F","log":"log.Panic({0xc00121dee0?, 0x1ebfc9c?, 0xc0000061a0?})"}
13911-{"date":"2024-07-11T03:13:17.549936Z","time":"2024-07-11T03:13:17.549936409Z","stream":"stderr","_p":"F","log":"\t/usr/local/go/src/log/log.go:432 +0x5a"}
14066-{"date":"2024-07-11T03:13:17.549939Z","time":"2024-07-11T03:13:17.549939428Z","stream":"stderr","_p":"F","log":"main.main()"}
14192-{"date":"2024-07-11T03:13:17.549941Z","time":"2024-07-11T03:13:17.549941638Z","stream":"stderr","_p":"F","log":"\t/go/src/github.com/devtron-labs/devtron/main.go:30 +0xff"}
14365-{"date":"2024-07-11T03:13:37.653040Z","time":"2024-07-11T03:13:37.653040451Z","stream":"stderr","_p":"F","log":"{\"level\":\"info\",\"ts\":1720667617.6528556,\"caller\":\"sql/connection.go:58\",\"msg\":\"connected with db\",\"db\":{\"Addr\":\"khelgroup-postgresql.devtroncd\",\"Port\":\"5432\",\"User\":\"postgres\",\"Password\":\"********\",\"Database\":\"orchestrator\",\"CasbinDatabase\":\"casbin\",\"ApplicationName\":\"orchestrator\",\"LogQuery\":false,\"LogAllQuery\":false,\"ExportPromMetrics\":false,\"QueryDurationThreshold\":5000,\"ReadTimeout\":30,\"WriteTimeout\":30}}"}
14948-{"date":"2024-07-11T03:13:37.684809Z","time":"2024-07-11T03:13:37.684809274Z","stream":"stderr","_p":"F","log":"2024/07/11 03:13:37 v2 casbin Policies Loaded Successfully"}
15121-{"date":"2024-07-11T03:13:37.684983Z","time":"2024-07-11T03:13:37.684983278Z","stream":"stderr","_p":"F","log":"{\"level\":\"info\",\"ts\":1720667617.684898,\"caller\":\"casbin/rbac.go:96\",\"msg\":\"enforce cache enabled\",\"expiry\":432000}"}
15366-{"date":"2024-07-11T03:13:37.685206Z","time":"2024-07-11T03:13:37.685206053Z","stream":"stderr","_p":"F","log":"{\"level\":\"info\",\"ts\":1720667617.6851134,\"caller\":\"casbin/RbacEnterprise.go:61\",\"msg\":\"enforcer initialized\",\"Config\":{\"EnterpriseEnforcerEnabled\":true,\"UseCustomEnforcer\":true,\"UseCasbinV2\":true,\"CustomRoleCacheAllowed\":true}}"}
15731-{"date":"2024-07-11T03:13:37.736502Z","time":"2024-07-11T03:13:37.736502506Z","stream":"stderr","_p":"F","log":"{\"level\":\"info\",\"ts\":1720667617.7363505,\"caller\":\"cluster/EnvironmentRestHandler.go:76\",\"msg\":\"evironment rest handler initialized\",\"ignoreAuthCheckValue\":false}"}
16023-{"date":"2024-07-11T03:13:37.737473Z","time":"2024-07-11T03:13:37.737473137Z","stream":"stderr","_p":"F","log":"{\"level\":\"info\",\"ts\":1720667617.7373261,\"caller\":\"git/GitFactory.go:39\",\"msg\":\"reloading gitops details\"}"}
16257-{"date":"2024-07-11T03:13:37.738032Z","time":"2024-07-11T03:13:37.738032729Z","stream":"stderr","_p":"F","log":"{\"level\":\"info\",\"ts\":1720667617.737932,\"caller\":\"variables/ScopedVariableService.go:126\",\"msg\":\"variable cache loaded successfully\"}"}
16518-{"date":"2024-07-11T03:13:37.738038Z","time":"2024-07-11T03:13:37.738038509Z","stream":"stderr","_p":"F","log":"{\"level\":\"info\",\"ts\":1720667617.7379572,\"caller\":\"git/GitFactory.go:73\",\"msg\":\" gitops details reload success\"}"}
16758-{"date":"2024-07-11T03:13:37.740568Z","time":"2024-07-11T03:13:37.740568603Z","stream":"stderr","_p":"F","log":"W0711 03:13:37.740443       1 reflector.go:424] k8s.io/client-go/informers/factory.go:150: failed to list *v1.Namespace: Get \"https://load-test-eks.rummyverse.link/api/v1/namespaces?limit=500&resourceVersion=0\": dial tcp: lookup load-test-eks.rummyverse.link on 10.100.0.10:53: no such host"}
17165-{"date":"2024-07-11T03:13:37.740587Z","time":"2024-07-11T03:13:37.740587904Z","stream":"stderr","_p":"F","log":"E0711 03:13:37.740508       1 reflector.go:140] k8s.io/client-go/informers/factory.go:150: Failed to watch *v1.Namespace: failed to list *v1.Namespace: Get \"https://load-test-eks.rummyverse.link/api/v1/namespaces?limit=500&resourceVersion=0\": dial tcp: lookup load-test-eks.rummyverse.link on 10.100.0.10:53: no such host"}
17603-{"date":"2024-07-11T03:13:37.741397Z","time":"2024-07-11T03:13:37.741397191Z","stream":"stderr","_p":"F","log":"{\"level\":\"info\",\"ts\":1720667617.7413034,\"caller\":\"gitSensor/GitSensorClient.go:54\",\"msg\":\"using gRPC api client for git sensor\"}"}
17860-{"date":"2024-07-11T03:13:37.741935Z","time":"2024-07-11T03:13:37.741935523Z","stream":"stderr","_p":"F","log":"{\"level\":\"info\",\"ts\":1720667617.741836,\"caller\":\"lockConfiguration/LockConfigurationService.go:52\",\"msg\":\"env var \",\"ARRAY_DIFF_MEMOIZATION\":false}"}
18138-{"date":"2024-07-11T03:13:37.742459Z","time":"2024-07-11T03:13:37.742459084Z","stream":"stderr","_p":"F","log":"{\"level\":\"info\",\"ts\":1720667617.7423282,\"caller\":\"authentication/UserAuthOidcHelper.go:48\",\"msg\":\"auth starting with dex conf\",\"conf\":{\"DexHost\":\"http://argocd-dex-server.devtroncd\",\"DexPort\":\"5556\",\"DexClientID\":\"argo-cd\",\"DexServerAddress\":\"http://argocd-dex-server.devtroncd:5556\",\"Url\":\"https://devtron.rummyverse.link/orchestrator\",\"DexClientSecret\":\"wdC3CPAeiA6h6FtpqcH2GW-IxfG-Qf20rSIyCp_O\",\"ServerSecret\":\"JC1KIbJa0qyIOl9uwwNc7n3DXiSjLuEXiuPqUYaywYA=\",\"UserSessionDurationSeconds\":86400,\"ADMIN_PASSWORD_MTIME\":\"0001-01-01T00:00:00Z\",\"DexConfigRaw\":\"\",\"DevtronSecretName\":\"orchestrator-secrets-3\"}}"}
18916-{"date":"2024-07-11T03:13:37.742474Z","time":"2024-07-11T03:13:37.742474365Z","stream":"stderr","_p":"F","log":"time=\"2024-07-11T03:13:37Z\" level=info msg=\"proxy server address:  http://argocd-dex-server.devtroncd:5556\""}
19142-{"date":"2024-07-11T03:13:37.742477Z","time":"2024-07-11T03:13:37.742477735Z","stream":"stderr","_p":"F","log":"time=\"2024-07-11T03:13:37Z\" level=info msg=\"Creating client app (argo-cd)\""}
19335-{"date":"2024-07-11T03:13:37.742548Z","time":"2024-07-11T03:13:37.742548516Z","stream":"stderr","_p":"F","log":"{\"level\":\"info\",\"ts\":1720667617.742485,\"caller\":\"protect/ResourceProtectionService.go:38\",\"msg\":\"registering listener *drafts.ConfigDraftServiceImpl\"}"}
19614-{"date":"2024-07-11T03:13:37.742695Z","time":"2024-07-11T03:13:37.742695029Z","stream":"stderr","_p":"F","log":"{\"level\":\"info\",\"ts\":1720667617.7426453,\"caller\":\"team/TeamRestHandler.go:64\",\"msg\":\"team rest handler initialized\",\"ignoreAuthCheckValue\":false}"}

### Affected areas

Other CRITICAL functionality

### Additional affected areas

Other CRITICAL functionality

### Prod/Non-prod environments?

Prod

### Is User unblocked?

No

### How was the user un-blocked?

None

### Impact on Enterprise

When the orchestrator restarts, the Devtron dashboard is disrupted.

### 👟 Steps to replicate the Issue

NA

### 👍 Expected behavior

i) The panic should be handled so that the orchestrator doesn't restart.
ii) The panic counter should be incremented.
iii) The root cause of the panic should be fixed.
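
Points i) and ii) could be sketched roughly as follows. This is only an illustration, not Devtron's code: `safely` and `panicCount` are hypothetical names, and a real service would more likely export the counter as a metric than keep it in a package variable.

```go
package main

import (
	"errors"
	"fmt"
	"sync/atomic"
)

// panicCount is a hypothetical in-process counter of recovered panics.
var panicCount atomic.Int64

// safely runs fn. If fn panics, the panic is recovered, the counter is
// incremented, and the panic value is returned as an error.
func safely(name string, fn func()) (err error) {
	defer func() {
		if r := recover(); r != nil {
			panicCount.Add(1)
			err = fmt.Errorf("%s: recovered from panic: %v", name, r)
		}
	}()
	fn()
	return nil
}

func main() {
	// Simulate the failure from the logs with a stand-in panic.
	err := safely("subscribe", func() {
		panic(errors.New("nats: consumer is offline"))
	})
	fmt.Println(err)
	fmt.Println("panics recovered:", panicCount.Load())
}
```

Recovering only keeps the process alive; the caller still has to decide what to do with the returned error, which is why iii) remains a separate item.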

### 👎 Actual Behavior

The orchestrator restarts whenever this panic occurs; in the logs above it is triggered by `nats: consumer is offline` during startup.
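
If the subscribe error turns out to be transient (the logs do not confirm this), one option is to retry with a growing delay instead of panicking on the first failure. The sketch below is a generic retry helper under that assumption; `retry`, `errConsumerOffline` and the `subscribe` closure are hypothetical stand-ins and do not come from the Devtron or NATS code.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errConsumerOffline mimics the error text from the logs, for illustration only.
var errConsumerOffline = errors.New("nats: consumer is offline")

// retry calls op up to attempts times and doubles the delay after each
// failure. It returns nil on the first success, or the last error wrapped.
func retry(attempts int, delay time.Duration, op func() error) error {
	var err error
	for i := 0; i < attempts; i++ {
		if err = op(); err == nil {
			return nil
		}
		if i < attempts-1 {
			time.Sleep(delay)
			delay *= 2
		}
	}
	return fmt.Errorf("giving up after %d attempts: %w", attempts, err)
}

func main() {
	calls := 0
	// Hypothetical subscribe call that succeeds on the third attempt.
	subscribe := func() error {
		calls++
		if calls < 3 {
			return errConsumerOffline
		}
		return nil
	}
	err := retry(5, 10*time.Millisecond, subscribe)
	fmt.Println("calls:", calls, "err:", err)
}
```

With a bounded number of attempts the process can still fail fast when the consumer stays offline, but a short outage no longer forces a restart.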

### ☸ Kubernetes version

any

### Cloud provider

any

### 🌍 Browser

Chrome

### ✅ Proposed Solution

_No response_

### 👀 Have you spent some time to check if this issue has been raised before?

- [X] I checked and didn't find any similar issue

### 🏢 Have you read the Code of Conduct?

- [X] I have read the [Code of Conduct](https://github.com/devtron-labs/devtron/blob/main/CODE_OF_CONDUCT.md)

AB#10190

github-actions[bot] commented 4 months ago

Final Score: 240

azure-boards[bot] commented 4 months ago

❌ There was a problem linking to Azure Boards work item(s):

Please check the IDs and try again using the AB# syntax.

Neha130 commented 4 months ago

There was cluster upgrade activity at the time.