chef / chef-server

Chef Infra Server is a hub for configuration data, storing cookbooks, node policies and metadata of managed nodes.
https://www.chef.io/chef/
Apache License 2.0

Cookbook with invalid dependencies causes ALL Chef client runs to begin failing (even on nodes that do not use the cookbook in question) #3615

Open carlosaya opened 1 year ago

carlosaya commented 1 year ago

Chef Server Version

Standalone CINC Server 14.15.10

Platform Details

RHEL 8.7 on-prem

# free -h
              total        used        free      shared  buff/cache   available
Mem:          7.6Gi       3.2Gi       298Mi       457Mi       4.1Gi       3.7Gi
Swap:         2.0Gi       1.1Gi       892Mi

# df -h
Filesystem                   Size  Used Avail Use% Mounted on
devtmpfs                     3.8G     0  3.8G   0% /dev
tmpfs                        3.8G   44K  3.8G   1% /dev/shm
tmpfs                        3.8G  385M  3.5G  10% /run
tmpfs                        3.8G     0  3.8G   0% /sys/fs/cgroup
/dev/mapper/rootvg-rootlv     10G  2.1G  8.0G  21% /
/dev/mapper/rootvg-homelv    7.0G   83M  7.0G   2% /home
/dev/mapper/rootvg-tmplv      27G  232M   27G   1% /tmp
/dev/mapper/rootvg-optlv      31G  1.6G   30G   5% /opt
/dev/mapper/rootvg-varlv     146G   12G  135G   8% /var
/dev/sda1                   1014M  278M  737M  28% /boot
/dev/mapper/rootvg-varloglv   15G  3.2G   12G  21% /var/log
tmpfs                        777M     0  777M   0% /run/user/510079027

Configuration

Standalone, recent migration from long-standing Chef Server to CINC

Scenario

A cookbook whose metadata.rb contains potentially unresolvable dependencies is pushed to the CINC server.

Steps to Reproduce

The cookbook's metadata.rb contained the following:

chef_version '>= 12.5' if respond_to?(:chef_version)

supports 'windows'

# depends 'windows'
# depends 'cb-example'
gem 'vault'
depends 'chef-vault', '< 3'
gem 'chef-vault', '< 4'

Expected Result

Cookbook should be uploaded successfully, or return an error if there is something wrong with dependencies.
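As a rough illustration of the kind of check that could flag this before upload, here is a minimal Ruby sketch. It assumes the problem is a `depends` constraint that no uploaded cookbook version satisfies, which this report has not confirmed. The helper names `parse_depends` and `unsatisfiable` and the `available` version list are hypothetical, and the sketch uses RubyGems' `Gem::Requirement` as a stand-in for the server's own depsolver, ignoring transitive dependencies.

```ruby
require 'rubygems'

# Extract `depends 'name', 'constraint'` pairs from metadata.rb text.
# Commented-out lines are skipped; a missing constraint defaults to '>= 0'.
def parse_depends(metadata_text)
  metadata_text.each_line.filter_map do |line|
    next if line.strip.start_with?('#')
    m = line.match(/^\s*depends\s+['"]([^'"]+)['"](?:\s*,\s*['"]([^'"]+)['"])?/)
    next unless m
    [m[1], m[2] || '>= 0']
  end
end

# Return the dependencies that no available version can satisfy.
# `available` is a hypothetical map of cookbook name => uploaded versions.
def unsatisfiable(deps, available)
  deps.reject do |name, constraint|
    req = Gem::Requirement.new(constraint)
    available.fetch(name, []).any? { |v| req.satisfied_by?(Gem::Version.new(v)) }
  end
end

# The dependency lines from the metadata.rb in this report.
metadata = <<~METADATA
  # depends 'windows'
  # depends 'cb-example'
  gem 'vault'
  depends 'chef-vault', '< 3'
  gem 'chef-vault', '< 4'
METADATA

# ASSUMPTION: illustrative server contents, not taken from the report.
available = { 'chef-vault' => ['4.0.1', '4.1.0'] }

unsatisfiable(parse_depends(metadata), available).each do |name, constraint|
  puts "no uploaded version of #{name} satisfies '#{constraint}'"
end
```

With the assumed version list, the only `depends` line that is not commented out (`chef-vault`, `< 3`) is reported as unsatisfiable. The `gem` lines are ignored because they declare gem dependencies, not cookbook dependencies.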

Actual Result

All Chef client runs were failing with the following errors:

ERROR: Server returned error 500 for [https://chef.tattsgroup.com/organizations/tattsgroup/environments/<environmentName>/cookbook_versions](https://chef.domain.local/organizations/myOrg/environments/ProductionDC1/cookbook_versions), retrying 1/5 in 4s
...
...

================================================================================
Error Resolving Cookbooks for Run List:
================================================================================

Unknown Server Error:
---------------------
The server had a fatal error attempting to load the node data.

Server Response:
----------------
internal service error

Logs in /var/log/cinc-project/opscode-erchef/crash.log began displaying the following error at the time the cookbook was uploaded:

2023-03-02 11:09:46 =ERROR REPORT==== {<<"method=POST; path=/organizations/tattsgroup/environments/ProductionDC2/cookbook_versions; status=500; ">>,"Internal Server Error" }
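To pin down when the failures start and which endpoints are affected, lines like the one above can be summarised with a short script. This is a sketch only: the regular expression is written against the single log line shown here, so other crash.log entries may not match, and the helper names `parse_error_report` and `count_500s_by_path` are hypothetical.

```ruby
# Pattern for the ERROR REPORT line format shown in this report.
LINE_RE = /\A(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) =ERROR REPORT====.*?method=(\w+); path=([^;]+); status=(\d{3});/

# Parse one log line into a hash, or return nil if it does not match.
def parse_error_report(line)
  m = LINE_RE.match(line)
  return nil unless m
  { time: m[1], method: m[2], path: m[3], status: m[4].to_i }
end

# Count HTTP 500 responses per request path.
def count_500s_by_path(lines)
  lines.filter_map { |l| parse_error_report(l) }
       .select { |e| e[:status] == 500 }
       .map { |e| e[:path] }
       .tally
end

sample = '2023-03-02 11:09:46 =ERROR REPORT==== {<<"method=POST; path=/organizations/tattsgroup/environments/ProductionDC2/cookbook_versions; status=500; ">>,"Internal Server Error" }'

p parse_error_report(sample)
p count_500s_by_path([sample, 'unrelated line'])
```

To run it against the real log, the sample array could be replaced with `File.foreach('/var/log/cinc-project/opscode-erchef/crash.log')`.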

Observed high CPU usage from /opt/cinc-project/embedded/service/opscode-erchef/bin/oc_erchef

Observed high CPU usage from 5 instances of the following ruby command: ruby /opt/cinc-project/embedded/service/opscode-erchef/lib/chef_objects-14.15.10/priv/depselector_rb/depselector.rb

After the newly uploaded cookbook version was deleted, Chef server CPU usage returned to normal over the course of several minutes and Chef client runs were once again able to complete without error.

PrajaktaPurohit commented 1 year ago

Would you be able to upload the cookbook that you are seeing this issue with, along with a gzip of the log directory that includes the error? That will help us understand a few more things that might be causing this issue.

carlosaya commented 1 year ago

Logs have been uploaded here

The issue began at 2023-03-02 11:09:46 AEST, or 2023-03-02 00:09:46 UTC

carlosaya commented 1 year ago

Here is the cookbook version that caused the issue.

cb_octopus_server-1.0.1.zip

carlosaya commented 1 year ago

Hi @PrajaktaPurohit, I uploaded everything you requested the same day you asked for it, yet this is still marked as Waiting on Contributor. Can you please remove that tag? Thanks.