puppetlabs / puppet

Server automation framework and application
https://puppet.com/open-source/#osp
Apache License 2.0
7.37k stars 2.19k forks

Update splaylimit during daemon run #9415

Open ImpBY opened 1 month ago

ImpBY commented 1 month ago

This issue was originally filed due to a regression after merging https://github.com/puppetlabs/puppet/pull/9345 and released in 8.7.0/7.31.0. The change was reverted in https://github.com/puppetlabs/puppet/issues/9415 and released in 8.8.1 and 7.32.1. Since this issue contains a possible fix for the regression, we're repurposing this ticket for the original issue described in PUP-11728.

Describe the Bug

The splay is recalculated even when the splay_limit has not changed. See https://puppetcommunity.slack.com/archives/C0W298S9G/p1721059192632569

Expected Behavior

As a result, the probability that the agent performs its first run increases over time: every recalculation is a fresh chance for the new splay to fall below the time that has already elapsed. After 1/3 of the splay_limit has passed, that probability is almost 100%, so the agent performs its first run within the first 1/3 of the splay_limit. By default that is 10 min (splay_limit 30 min).
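The effect can be illustrated with a small simulation. This is only a sketch of the reasoning above, not Puppet's scheduler code: `first_run_delay`, `LIMIT`, and `REDRAW_EVERY` are hypothetical names, the splay is assumed to be drawn uniformly from 0..limit, and the 15-second redraw interval is an assumption chosen to roughly match the log in the reproduction steps.

```ruby
# Hypothetical model of the first-run check, NOT Puppet's implementation.
LIMIT = 1800        # splay_limit in seconds (30 min)
REDRAW_EVERY = 15   # assumed seconds between splay recalculations

# Returns how many seconds pass before the first run fires.
# With redraw_every = nil the splay is drawn once and kept.
def first_run_delay(limit, redraw_every, rng)
  splay = rng.rand(0..limit)
  return splay if redraw_every.nil? # fixed splay: run fires at start + splay

  elapsed = 0
  loop do
    # Same condition as ready?: start_time + splay <= time
    return elapsed if splay <= elapsed

    elapsed += 1
    splay = rng.rand(0..limit) if (elapsed % redraw_every).zero?
  end
end

rng = Random.new(42)
trials = 2_000
stable  = Array.new(trials) { first_run_delay(LIMIT, nil, rng) }
redrawn = Array.new(trials) { first_run_delay(LIMIT, REDRAW_EVERY, rng) }

third = LIMIT / 3
share = ->(delays) { delays.count { |d| d <= third } / trials.to_f }

puts format("fixed splay:   %.1f%% of first runs within %d s", 100 * share.call(stable), third)
puts format("redrawn splay: %.1f%% of first runs within %d s", 100 * share.call(redrawn), third)
```

Under these assumptions, a fixed splay puts roughly a third of first runs in the first 10 minutes, while a repeatedly redrawn splay puts nearly all of them there.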

Steps to Reproduce

Steps to reproduce the behavior:

  1. Add logging code to lib/puppet/scheduler/splay_job.rb:

         def ready?(time)
    +      File.open("/tmp/splay", "a+") do |f|
    +        f.write("splay: #{@splay}\n")
    +      end
           if last_run
             super
           else
             start_time + splay <= time
           end
         end
  2. Watch the splay value change in real time:

    
    # systemctl restart puppet
    # tail -f /tmp/splay | awk -e '{ print strftime("%Y-%m-%d_%H:%M:%S",systime()) "\t" $0}'
    2024-07-16_10:48:40 splay: 532
    2024-07-16_10:48:44 splay: 532
    2024-07-16_10:48:44 splay: 532
    2024-07-16_10:48:49 splay: 532
    2024-07-16_10:48:49 splay: 532
    2024-07-16_10:48:53 splay: 1065
    2024-07-16_10:48:53 splay: 1065
    2024-07-16_10:48:54 splay: 1065
    2024-07-16_10:48:54 splay: 1065
    2024-07-16_10:48:59 splay: 1065
    2024-07-16_10:48:59 splay: 1065
    2024-07-16_10:49:04 splay: 1065
    2024-07-16_10:49:04 splay: 1065
    2024-07-16_10:49:08 splay: 847
    2024-07-16_10:49:08 splay: 847
    2024-07-16_10:49:09 splay: 847
    2024-07-16_10:49:09 splay: 847
    ^C
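
To read the redraw cadence off this log, the timestamps can be compared whenever the value changes. The snippet below is a minimal sketch that works on an excerpt of the output above; `SAMPLE`, `readings`, and `changes` are illustrative names. The first interval is only a lower bound, because `tail` started partway through it.

```ruby
require "time"

# Excerpt of the log above (first and last reading of each value).
SAMPLE = <<~LOG
  2024-07-16_10:48:40 splay: 532
  2024-07-16_10:48:49 splay: 532
  2024-07-16_10:48:53 splay: 1065
  2024-07-16_10:49:04 splay: 1065
  2024-07-16_10:49:08 splay: 847
LOG

# Parse each line into [timestamp, splay value].
readings = SAMPLE.each_line.map do |line|
  stamp, value = line.match(/\A(\S+)\s+splay:\s+(\d+)/).captures
  [Time.strptime(stamp, "%Y-%m-%d_%H:%M:%S"), Integer(value)]
end

# Keep only the first reading of each run of identical values.
changes = readings.chunk_while { |a, b| a[1] == b[1] }.map(&:first)

# Report how long each value was observed before it changed.
changes.each_cons(2) do |(t0, v0), (t1, v1)|
  puts "#{v0} -> #{v1} after #{(t1 - t0).to_i} s"
end
```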

Environment

  * Version: 7.31.0
  * Platform: Oracle Linux Server 9.4 (5.15.0-207.156.6.el9uek.x86_64)

Additional Context

Suggested patch:

    diff --git lib/puppet/scheduler/splay_job.rb lib/puppet/scheduler/splay_job.rb
    index b44e08bad6..d2a5643324 100644
    --- lib/puppet/scheduler/splay_job.rb
    +++ lib/puppet/scheduler/splay_job.rb
    @@ -1,6 +1,7 @@
     module Puppet::Scheduler
       class SplayJob < Job
         attr_reader :splay
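
The patch text is cut off here, so its actual contents are not shown. Independently of it, the idea from the bug description (redraw the splay only when the limit really changes) could be sketched as follows. `SplaySketch` and its `splay_limit=` setter are hypothetical stand-ins, not Puppet's `SplayJob` API, and the uniform draw from 0..limit is an assumption.

```ruby
# Hypothetical stand-in for a splay-aware job; NOT Puppet's SplayJob.
class SplaySketch
  attr_reader :splay, :splay_limit

  def initialize(splay_limit, rng: Random.new)
    @rng = rng
    @splay_limit = nil
    self.splay_limit = splay_limit
  end

  # Redraw the splay only when the limit actually changes, so that
  # repeated settings reloads with the same value keep the old splay.
  def splay_limit=(new_limit)
    return if new_limit == @splay_limit

    @splay_limit = new_limit
    @splay = @rng.rand(0..new_limit) # assumed: uniform in 0..limit
  end
end

job = SplaySketch.new(1800, rng: Random.new(7))
first = job.splay

job.splay_limit = 1800 # same limit: splay is kept
puts job.splay == first

job.splay_limit = 900  # changed limit: splay is redrawn within 0..900
puts job.splay.between?(0, 900)
```

Keeping the comparison inside the setter means callers can reapply the setting on every reload without having to track the previous value themselves.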

ImpBY commented 1 month ago

[Image: JRuby consumption before and after the patch]

~1200 agents

server params

# (optional) maximum number of JRuby instances to allow
max-active-instances: 32
max-queued-requests: 64
max-retry-delay: 120 ## seconds before retry

agents params

splay = true
splaylimit = 30m
runinterval = 30m

sharewax commented 1 month ago

This issue was introduced by pull request https://github.com/puppetlabs/puppet/pull/9345

sharewax commented 1 month ago

[Image: how it looks before/after and after a downgrade]

github-actions[bot] commented 1 month ago

Migrated issue to PUP-12061