chef / chef-server

Chef Infra Server is a hub for configuration data; storing cookbooks, node policies and metadata of managed nodes.
https://www.chef.io/chef/
Apache License 2.0

Connection refused http://127.0.0.1:10010/reports/nodes/host.example.com/runs #213

Closed ghost closed 2 years ago

ghost commented 9 years ago

Hi,

I installed opscode-reporting a while ago, but it never worked (for me). Today I decided to uninstall this add-on with opscode-reporting-ctl uninstall, chef-server-ctl reconfigure, chef-server-ctl restart and finally yum erase opscode-reporting.

I saw that some files were left behind, so I deleted them: rm -Rf /var/log/opscode-reporting, rm -Rf /opt/opscode-reporting, and /usr/local/sbin/refresh_reporting_matviews.

Since then, every chef-client run is delayed at startup and logs the following on the client side:

[2015-05-03T18:19:38+02:00] ERROR: Server returned error 502 for https://chef-server.example.com/reports/nodes/host.example.com/runs, retrying 1/5 in 4s
[2015-05-03T18:19:42+02:00] ERROR: Server returned error 502 for https://chef-server.example.com/reports/nodes/host.example.com/runs, retrying 2/5 in 6s
[2015-05-03T18:19:48+02:00] ERROR: Server returned error 502 for https://chef-server.example.com/reports/nodes/host.example.com/runs, retrying 3/5 in 15s
[2015-05-03T18:20:03+02:00] ERROR: Server returned error 502 for https://chef-server.example.com/reports/nodes/host.example.com/runs, retrying 4/5 in 29s
[2015-05-03T18:20:32+02:00] ERROR: Server returned error 502 for https://chef-server.example.com/reports/nodes/host.example.com/runs, retrying 5/5 in 38s

On server side I'm seeing this:

2015/05/03 18:19:37 [error] 14350#0: *12563 connect() failed (111: Connection refused) while connecting to upstream, client: 1.2.3.4, server: chef-server.example.com, request: "POST /reports/nodes/host.example.com/runs HTTP/1.1", upstream: "http://127.0.0.1:10010/reports/nodes/host.example.com/runs", host: "chef-server.example.com:443"

Indeed, nothing is listening on port 10010 on the Chef server. Here are my service list, configuration and port bindings:

bookshelf*
nginx*
oc_bifrost*
oc_id*
opscode-chef-mover*
opscode-erchef*
opscode-expander*
opscode-expander-reindexer*
opscode-solr4*
postgresql*
rabbitmq*
redis_lb*
Starting Chef Client, version 12.0.3
resolving cookbooks for run list: ["private-chef::show_config"]
Synchronizing Cookbooks:
  - private-chef
  - enterprise
  - apt
  - yum
  - runit
  - build-essential
  - yum-epel
Compiling Cookbooks...
{
  "private_chef": {
    "opscode-chef": {

    },
    "redis_lb": {
      "log_rotation": {

      }
    },
    "addons": {
      "install": false
    },
    "rabbitmq": {
      "log_rotation": {

      },
      "password": "aaaaaabbbbbccccddddeeee",
      "jobs_password": "aaaaaabbbbbccccddddeeee",
      "actions_password": "aaaaaabbbbbccccddddeeee"
    },
    "opscode-solr4": {
      "log_rotation": {

      }
    },
    "opscode-expander": {
      "log_rotation": {

      }
    },
    "opscode-erchef": {
      "log_rotation": {

      },
      "max_request_size": null
    },
    "oc_chef_authz": {

    },
    "folsom-graphite": {

    },
    "lb": {
      "xdl_defaults": {

      },
      "api_fqdn": "chef-server.example.com",
      "web_ui_fqdn": "chef-server.example.com"
    },
    "lb-internal": {

    },
    "postgresql": {
      "log_rotation": {

      },
      "sql_password": "aaaaaabbbbbccccddddeeee",
      "sql_ro_password": "aaaaaabbbbbccccddddeeee"
    },
    "oc_bifrost": {
      "log_rotation": {

      },
      "superuser_id": "aaaaaabbbbbccccddddeeee",
      "sql_password": "aaaaaabbbbbccccddddeeee",
      "sql_ro_password": "aaaaaabbbbbccccddddeeee"
    },
    "oc_id": {
      "log_rotation": {

      },
      "sql_password": "aaaaaabbbbbccccddddeeee",
      "secret_key_base": "aaaaaabbbbbccccddddeeee"
    },
    "opscode-chef-mover": {

    },
    "bookshelf": {
      "log_rotation": {

      },
      "access_key_id": "aaaaaabbbbbccccddddeeee",
      "secret_access_key": "aaaaaabbbbbccccddddeeee"
    },
    "bootstrap": {

    },
    "drbd": {
      "shared_secret": "aaaaaabbbbbccccddddeeee"
    },
    "keepalived": {
      "vrrp_instance_password": "aaaaaabbbbbccccddddeeee"
    },
    "estatsd": {

    },
    "nginx": {
      "log_rotation": {

      },
      "ssl_certificate": "/etc/pki/tls/certs/chef-server.example.com.crt",
      "ssl_certificate_key": "/etc/pki/tls/private/chef-server.example.com.key",
      "enable_ipv6": false,
      "server_name": "chef-server.example.com",
      "url": "https://chef-server.example.com"
    },
    "ldap": {

    },
    "user": {

    },
    "ha": {

    },
    "disabled-plugins": [

    ],
    "enabled-plugins": [

    ],
    "license": {

    },
    "couchdb": {

    },
    "opscode-solr": {

    },
    "default_orgname": "example",
    "oc-chef-pedant": {

    },
    "notification_email": null,
    "from_email": null,
    "role": null,
    "topology": "standalone",
    "servers": {

    },
    "backend_vips": {

    },
    "logs": {
      "log_retention": {

      },
      "log_rotation": {

      }
    },
    "dark_launch": {

    },
    "folsom_graphite": {

    }
  }
}
Converging 0 resources

Running handlers:
Running handlers complete
Chef Client finished, 0/0 resources updated in 2.261358008 seconds
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address               Foreign Address             State       User       Inode      PID/Program name   
tcp        0      0 127.0.0.1:8000              0.0.0.0:*                   LISTEN      496        89132      12473/beam.smp      
tcp        0      0 127.0.0.1:4321              0.0.0.0:*                   LISTEN      496        86342      12298/beam.smp      
tcp        0      0 0.0.0.0:9090                0.0.0.0:*                   LISTEN      496        87371      12411/rails master  
tcp        0      0 0.0.0.0:35429               0.0.0.0:*                   LISTEN      496        86426      12365/beam.smp      
tcp        0      0 127.0.0.1:5672              0.0.0.0:*                   LISTEN      496        88350      12549/beam.smp      
tcp        0      0 0.0.0.0:25672               0.0.0.0:*                   LISTEN      496        87681      12549/beam.smp      
tcp        0      0 0.0.0.0:60298               0.0.0.0:*                   LISTEN      496        86328      12298/beam.smp      
tcp        0      0 0.0.0.0:111                 0.0.0.0:*                   LISTEN      0          16881      2197/rpcbind        
tcp        0      0 0.0.0.0:9680                0.0.0.0:*                   LISTEN      0          109743     14341/nginx         
tcp        0      0 0.0.0.0:80                  0.0.0.0:*                   LISTEN      0          109741     14341/nginx         
tcp        0      0 0.0.0.0:4369                0.0.0.0:*                   LISTEN      496        13455      1265/epmd           
tcp        0      0 0.0.0.0:9683                0.0.0.0:*                   LISTEN      0          109744     14341/nginx         
tcp        0      0 0.0.0.0:40212               0.0.0.0:*                   LISTEN      29         17147      2272/rpc.statd      
tcp        0      0 5.9.190.177:53              0.0.0.0:*                   LISTEN      0          19306      2688/dnsmasq        
tcp        0      0 0.0.0.0:53494               0.0.0.0:*                   LISTEN      496        88599      12473/beam.smp      
tcp        0      0 0.0.0.0:22                  0.0.0.0:*                   LISTEN      0          17829      2438/sshd           
tcp        0      0 127.0.0.1:9462              0.0.0.0:*                   LISTEN      496        15047      1258/unicorn master 
tcp        0      0 127.0.0.1:9463              0.0.0.0:*                   LISTEN      496        87232      12365/beam.smp      
tcp        0      0 127.0.0.1:5432              0.0.0.0:*                   LISTEN      497        87450      12522/postgres      
tcp        0      0 127.0.0.1:25                0.0.0.0:*                   LISTEN      0          18358      2533/master         
tcp        0      0 127.0.0.1:11001             0.0.0.0:*                   LISTEN      496        16954      1281/ruby           
tcp        0      0 0.0.0.0:11002               0.0.0.0:*                   LISTEN      496        13587      1260/redis-server * 
tcp        0      0 127.0.0.1:16379             0.0.0.0:*                   LISTEN      496        349569     10850/redis-server  
tcp        0      0 0.0.0.0:443                 0.0.0.0:*                   LISTEN      0          109742     14341/nginx         
tcp        0      0 :::873                      :::*                        LISTEN      0          17938      2447/xinetd         
tcp        0      0 :::39529                    :::*                        LISTEN      29         17155      2272/rpc.statd      
tcp        0      0 :::111                      :::*                        LISTEN      0          16886      2197/rpcbind        
tcp        0      0 2a01:4f8:162:12af:::53      :::*                        LISTEN      0          19266      2688/dnsmasq        
tcp        0      0 :::22                       :::*                        LISTEN      0          17835      2438/sshd           
tcp        0      0 ::ffff:127.0.0.1:8983       :::*                        LISTEN      496        88592      12501/java          
tcp        0      0 ::1:5432                    :::*                        LISTEN      497        87449      12522/postgres      
tcp        0      0 ::1:25                      :::*                        LISTEN      0          18360      2533/master         
tcp        0      0 :::11002                    :::*                        LISTEN      496        13480      1260/redis-server * 
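
The port check described above can be scripted rather than read off by eye. This is a minimal sketch, assuming `netstat -tln`-style output (local address in column 4, state in column 6) as in the listing above; the function name `port_is_listening` is hypothetical and not part of any Chef tooling.

```shell
#!/bin/sh
# Hypothetical helper: reads `netstat -tln`-style text on stdin and
# succeeds (exit 0) only if some TCP socket is in LISTEN state on the port.
port_is_listening() {
    port="$1"
    awk -v port="$port" '
        $1 ~ /^tcp/ && $6 == "LISTEN" {
            # The port is the last ":"-separated field of the local address,
            # which also handles IPv6 forms such as ":::11002".
            n = split($4, parts, ":")
            if (parts[n] == port) found = 1
        }
        END { exit (found ? 0 : 1) }
    '
}

# Example (not executed here):
#   netstat -tln | port_is_listening 10010 || echo "nothing listening on 10010"
```

Fed the listing above, the check for 10010 would fail, which matches the "Connection refused" in the nginx error log.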

My chef server is running on:

CentOS release 6.6 (Final)
Linux  2.6.32-504.16.2.el6.x86_64

# rpm -qa | grep chef
chef-12.3.0-1.el6.x86_64
chef-server-core-12.0.8-1.el6.x86_64

# rpm -qa | grep opscode
opscode-manage-1.12.0-1.el5.x86_64

The clients are mixed from CentOS 5 to CentOS 7 with different versions of Chef 12 - but they're all experiencing the same issue.

I also tried reinstalling opscode-reporting - but this did not solve the issue.

marcparadise commented 9 years ago

It sounds like there is still a route in place for reporting.

You'd mentioned you tried opscode-manage uninstall - was this the command used? I think opscode-reporting-ctl uninstall is what would have been needed, optionally followed by 'opscode-reporting-ctl cleanse' to purge all reporting data.

I think what's happening is that the nginx upstreams are still defined even though the service has been removed. If you take a look in /var/opt/opscode/nginx/etc/nginx.d you should see some files matching '20-reporting_*.conf'. Those are providing the routes and upstreams for the reporting service.

If you delete them and chef-server-ctl restart nginx that should cause reporting requests to 404, which chef-client expects if reporting is not installed.
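
If the leftover files are not where expected, one way to locate them is to search the nginx configuration tree for the upstream port from the error log. This is only a sketch: it assumes the stale upstream definition contains the literal port 10010 (inferred from the nginx error message, not confirmed), and `find_upstream_refs` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: print every file under the given directories that
# mentions ":PORT" followed by a non-digit or end of line.
find_upstream_refs() {
    port="$1"
    shift
    grep -rlE -- ":${port}([^0-9]|\$)" "$@" 2>/dev/null
}

# Example (not executed here):
#   find_upstream_refs 10010 /var/opt/opscode/nginx/etc
```

Searching the parent directory rather than nginx.d alone means files in sibling include directories would show up as well.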

ghost commented 9 years ago

Hi,

Of course I used opscode-reporting-ctl uninstall.

The directory you pointed me to was empty, but I found another one:

/var/opt/opscode/nginx/etc/addon.d
[root@inf-c23fef1b addon.d]# ls -al
total 44
drwxr-x---. 2 opscode opscode 4096 May  3 12:58 .
drwxr-x---. 5 opscode opscode 4096 May  4 20:44 ..
-rw-r--r--. 1 root    root     300 May  3 12:58 20-reporting_external.conf
-rw-r--r--. 1 root    root     539 May  3 12:58 20-reporting_internal.conf
-rw-r--r--. 1 root    root      96 May  3 12:58 20-reporting_upstreams.conf
-rw-r--r--. 1 root    root     466 Dec 20 14:37 30-opscode-manage_external.conf
-rw-r--r--. 1 root    root     239 Dec 20 14:37 30-opscode-manage_internal.conf
-rw-r--r--. 1 root    root     188 Dec 20 14:37 30-opscode-manage_upstreams.conf
-rw-r--r--. 1 root    root     122 Nov 30 21:09 40-oc_id_external.conf
-rw-r--r--. 1 root    root      90 Nov 30 21:09 40-oc_id_upstreams.conf
-rw-r--r--. 1 root    root     754 Nov 30 21:09 README.md

Deleting those files solved my issue.
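
For anyone repeating this, a slightly more cautious variant is to move the reporting files aside instead of deleting them, so they can be restored if needed. The sketch below is an illustration, not the method used in this thread: `stash_reporting_confs` and the backup path in the usage comment are hypothetical, and the glob assumes the file names shown in the listing above.

```shell
#!/bin/sh
# Hypothetical helper: move 20-reporting_*.conf out of an nginx addon
# directory into a backup directory, leaving other add-on files untouched.
stash_reporting_confs() {
    addon_dir="$1"
    backup_dir="$2"
    mkdir -p "$backup_dir" || return 1
    for f in "$addon_dir"/20-reporting_*.conf; do
        [ -e "$f" ] || continue      # glob matched nothing
        mv -- "$f" "$backup_dir"/ || return 1
    done
    return 0
}

# Example (not executed here; the backup path is an assumption):
#   stash_reporting_confs /var/opt/opscode/nginx/etc/addon.d /root/reporting-conf-backup \
#     && chef-server-ctl restart nginx
```

The opscode-manage and oc_id files in the same directory are not matched by the glob and stay in place.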

danielsand commented 9 years ago

+1. The reporting module broke yesterday; I had just switched the old SSL cert for a new one. Uninstalling with opscode-reporting-ctl, cleansing, reconfiguring and restarting didn't help. Removing the 20-reporting*.conf files from /var/opt/opscode/nginx/etc/addon.d did the trick.

lucky-sideburn commented 9 years ago

Hi Guys..

I had the same problem and fixed it this way.

Reason: Authentication problem with RabbitMQ. I saw the error in /var/opt/opscode/rabbitmq/log/rabbit@localhost.log

Steps:

  1. Created a new user in RabbitMQ

ln -s /opt/opscode/embedded/bin/erl /usr/bin/erl

/opt/opscode/embedded/bin/chpst -u opscode -U opscode /opt/opscode/embedded/bin/rabbitmqctl add_user chef2 mypassword

  2. Configured permissions for chef2 on a specific virtual host

/opt/opscode/embedded/bin/chpst -u opscode -U opscode /opt/opscode/embedded/bin/rabbitmqctl set_permissions -p /analytics chef2 ".*" ".*" ".*"

  3. Changed the RabbitMQ username and password in the following file: /var/opt/opscode-reporting/opscode-reporting/etc/sys.config

  4. Restarted services

This worked for me, but I will lose my changes after a chef-server-ctl or opscode-reporting-ctl reconfigure. I suggest overriding the Chef attribute in chef-server.rb instead.

integrii commented 8 years ago

I just had the exact same thing occur. I was getting 502 errors on my clients and the server (whose logs are absolutely horrible to tail with chef-server-ctl tail) and Connection refused while connecting to upstream. I was also getting a few random solr errors I think.

Simply running rm -f /var/opt/opscode/nginx/etc/addon.d/20-reporting*; chef-server-ctl restart; worked for me.

shatil commented 7 years ago

I had to delete those *.conf files and then run chef-manage-ctl reconfigure before the web server would work again.

PrajaktaPurohit commented 4 years ago

Action item from this: create a task to clean up reporting if it is unused. Reporting is currently EOL.

marcparadise commented 2 years ago

oc-reporting is EOL