puppetlabs / puppetserver

Server automation framework and application
https://tickets.puppetlabs.com/browse/SERVER
Apache License 2.0

puppetserver gets trapped in exception cycle and uses 100% CPU #2845

Closed rwaffen closed 3 months ago

rwaffen commented 5 months ago

Describe the Bug

Puppetserver encounters an exception (shown below) and gets stuck in a loop from which it cannot recover on its own. A restart is required to restore normal operation. While in the loop, it logs the exception continuously and uses 100% of the available CPU. The issue occurs sporadically, with no discernible pattern so far.

2024-03-28T12:00:21.123+01:00 ERROR [clojure-agent-send-off-pool-83168] [p.t.s.s.status-core] #error {
 :cause nil
 :via
 [{:type java.util.concurrent.CancellationException
   :message nil
   :at [java.util.concurrent.FutureTask report FutureTask.java 121]}]
 :trace
 [[java.util.concurrent.FutureTask report FutureTask.java 121]
  [java.util.concurrent.FutureTask get FutureTask.java 191]
  [clojure.core$deref_future invokeStatic core.clj 2317]
  [clojure.core$future_call$reify__8544 deref core.clj 7041]
  [clojure.core$deref invokeStatic core.clj 2337]
  [clojure.core$deref invoke core.clj 2323]
  [puppetlabs.trapperkeeper.services.status.status_core$fn__28557$guarded_status_fn_call__28562$fn__28563$fn__28573 invoke status_core.clj 377]
  [puppetlabs.trapperkeeper.services.status.status_core$fn__28557$guarded_status_fn_call__28562$fn__28563 invoke status_core.clj 377]
  [puppetlabs.trapperkeeper.services.status.status_core$fn__28557$guarded_status_fn_call__28562 invoke status_core.clj 359]
  [puppetlabs.trapperkeeper.services.status.status_core$fn__28659$call_status_fn_for_service__28668$fn__28671 invoke status_core.clj 439]
  [puppetlabs.trapperkeeper.services.status.status_core$fn__28659$call_status_fn_for_service__28668 invoke status_core.clj 421]
  [puppetlabs.trapperkeeper.services.status.status_core$fn__28659$call_status_fn_for_service__28668$fn__28669 invoke status_core.clj 432]
  [puppetlabs.trapperkeeper.services.status.status_core$fn__28659$call_status_fn_for_service__28668 invoke status_core.clj 421]
  [puppetlabs.trapperkeeper.services.status.status_core$fn__28697$call_status_fns__28702$fn__28703$fn__28705 invoke status_core.clj 459]
  [clojure.core$pmap$fn__8552$fn__8553 invoke core.clj 7089]
  [clojure.core$binding_conveyor_fn$fn__5823 invoke core.clj 2047]
  [clojure.lang.AFn call AFn.java 18]
  [java.util.concurrent.FutureTask run FutureTask.java 264]
  [java.util.concurrent.ThreadPoolExecutor runWorker ThreadPoolExecutor.java 1128]
  [java.util.concurrent.ThreadPoolExecutor$Worker run ThreadPoolExecutor.java 628]
  [java.lang.Thread run Thread.java 829]]}
2024-03-28T12:00:21.180+01:00 ERROR [clojure-agent-send-off-pool-82816] [p.t.s.s.status-core] Status callback for puppet-profiler timed out, shutting down background task
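
For readers unfamiliar with the top frames of this trace: `FutureTask.get` throws `CancellationException` when it is called on a future that has already been cancelled. The sketch below reproduces only that JDK behavior. It is not the trapperkeeper status service code, and the assumption that the callback was cancelled after timing out is inferred from the "timed out, shutting down background task" log line, not confirmed in this thread. The class and method names are hypothetical.

```java
import java.util.concurrent.CancellationException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class CancelledFutureDemo {

    // Hypothetical helper: starts a slow task, waits briefly, cancels it on
    // timeout, then reads the future again and reports what the second read did.
    static String readAfterCancel() throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Future<String> status = pool.submit(() -> {
                Thread.sleep(5_000); // stands in for a slow status callback
                return "ok";
            });
            try {
                status.get(50, TimeUnit.MILLISECONDS);
                return "completed";
            } catch (TimeoutException e) {
                // Assumed sequence: the caller gives up and cancels the task.
                status.cancel(true);
            }
            try {
                // Any later read of the cancelled future fails immediately.
                status.get();
                return "completed";
            } catch (CancellationException e) {
                return e.getClass().getName();
            }
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(readAfterCancel());
    }
}
```

A read of a cancelled future returns immediately instead of blocking, so any caller that retries such a read without a delay would spin. Whether that is what happens here is not established in this report.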

Environment

jonathannewman commented 4 months ago

@rwaffen I would encourage you to open a support ticket for this issue. The CPU usage may not be caused by that stack trace; the stack trace may instead be a symptom of resource exhaustion.

rwaffen commented 4 months ago

Hi @jonathannewman, together with the customer we created Case# 01144024.

bastelfreak commented 3 months ago

@justinstoller is that issue actually fixed?