cloudfoundry / cf-deployment

The canonical open source deployment manifest for Cloud Foundry
Apache License 2.0

App log is not displaying with cf logs #411

Closed ghost closed 6 years ago

ghost commented 6 years ago

Hi, we have deployed CF using the cf-deployment repo. All the VMs are in the running state, but the application logs are not displayed.

We tried to get the logs using:

cf logs APP_NAME
cf logs APP_NAME --recent

cf-deployment manifest version: v1.12.0
cloud: Alibaba

I have attached a screenshot. (image)

cf-gitbot commented 6 years ago

We have created an issue in Pivotal Tracker to manage this:

https://www.pivotaltracker.com/story/show/155362073

The labels on this github issue will be updated when the story is started.

dsabeti commented 6 years ago

Hi @lordcf. While you're running cf logs, can you also make an HTTP request (perhaps with curl) against the app? That should force some logs through.

I'm also going to loop in @xiaozhu36, who helps to support cf-deployment on Alibaba Cloud.

ghost commented 6 years ago

Hi @dsabeti. I curled the application, but even after that cf logs displays nothing. The only message shown is:

Retrieving logs for app php-demo in org system / space testing as testuser...

Logs are not displayed while pushing an application either. (image)

I have checked the loggregator_trafficcontroller logs. They show the following error:

2018/02/21 10:07:56 Connecting to 1 dopplers
2018/02/21 10:08:16 error receiving from doppler via gRPC context canceled
2018/02/21 10:08:16 Error while reading from stream (192.168.3.14:8082): rpc error: code = Canceled desc = context canceled
2018/02/21 10:08:16 Disconnecting from stream (192.168.3.14:8082) (doppler.disconnect=false) (ctx.disconnect=1)

dsabeti commented 6 years ago

@lordcf, ok. It looks like the traffic controller can't connect to the dopplers. Can you check that there aren't any firewall rules blocking networking traffic between the two IP addresses?

Let's also get help from the Loggregator team: cc @ahevenor @JohannaSmith

johannaratliff commented 6 years ago

@lordcf Those are pretty standard logs, as the trafficcontroller continually reconnects to dopplers. Based on the time gap between the connection log and the gRPC context cancellation, I do believe trafficcontroller is discovering the doppler. I notice that it only connects to 1 doppler. How many doppler instances and trafficcontroller instances do you have? We would recommend at least 2 dopplers per AZ to increase log reliability, that is, the chance of logs getting through the system.
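The time gap mentioned here can be read off the timestamps in the trafficcontroller log lines pasted earlier in the thread. A minimal Python sketch, assuming every line starts with the `YYYY/MM/DD HH:MM:SS` prefix seen in those lines; `seconds_between` is a helper name made up for illustration and is not part of any Loggregator tooling:

```python
from datetime import datetime

# Timestamp prefix format, as seen in the pasted trafficcontroller log lines.
TS_FORMAT = "%Y/%m/%d %H:%M:%S"


def seconds_between(first_line: str, second_line: str) -> float:
    """Return the seconds elapsed between the timestamps of two log lines."""
    def ts(line: str) -> datetime:
        # The first 19 characters hold the "YYYY/MM/DD HH:MM:SS" prefix.
        return datetime.strptime(line[:19], TS_FORMAT)

    return (ts(second_line) - ts(first_line)).total_seconds()


# Two lines copied from the log in this thread.
connect = "2018/02/21 10:07:56 Connecting to 1 dopplers"
cancel = "2018/02/21 10:08:16 error receiving from doppler via gRPC context canceled"
print(seconds_between(connect, cancel))  # 20.0
```

For the lines in this thread the gap is 20 seconds, which is the interval the comment above treats as a sign that the doppler was discovered and the stream stayed open for a while.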

ghost commented 6 years ago

@JohannaSmith: Thanks for the response. We found the issue. There is a metron_agent addon configured in https://github.com/cloudfoundry/cf-deployment/blob/master/cf-deployment.yml#L10. But when deploying CF, it is not applied by default to all VMs (no metron_agent job was running on the VMs). We created a new file, metron_agent.yml, containing the addon configuration from cf-deployment. After that I ran "update-runtime-config" and then redeployed CF. Now the metron_agent job is running on each VM and we are getting the logs.

But we have a question: why is the metron_agent addon not applied on the first CF deployment?

dsabeti commented 6 years ago

@lordcf has the right question -- I'd expect the metron_agent addon to be added to each instance group. My guess is that you're using an older BOSH director that doesn't support deployment-level addons.

@lordcf, what version of the bosh director are you using? You can discover it by running bosh env.

ghost commented 6 years ago

@dsabeti we're using Bosh Version 264.7.0

Could you please suggest which version of BOSH I should try?

dsabeti commented 6 years ago

264.7.0 looks like a sufficiently up-to-date version of the BOSH director that the deployment-level add-on should have worked. What version of the BOSH CLI did you use? Can you also show us your full deploy command?

ghost commented 6 years ago

@dsabeti Bosh CLI: version 2.0.48

Please see the deployment command:

bosh -e my-bosh -d cf deploy cf-deployment/cf-deployment.yml --vars-store cf-vars.yml -o cf-deployment/iaas-support/alicloud/stemcells.yml -v region=eu-central-1 -v system_domain=$SYSTEM_DOMAIN -v app_domains=$APPDOMAIN

dsabeti commented 6 years ago

This all makes it seem like it should work. The only thing that's unusual is that you're deploying to Alibaba cloud. Let's see if @xiaozhu36 can help out.

xiaozhu36 commented 6 years ago

Hi @dsabeti, I have no idea, and I have also run into this issue.

dsabeti commented 6 years ago

Ok @xiaozhu36. I want to make sure I understand what issue you're facing. You also see that the metron agent isn't colocated on all VMs? If so, then I think as the point of contact for Alibaba cloud, you should reach out to the BOSH team (cc @cppforlife @dpb587) to address the issue.

jyriok commented 6 years ago

Same for me here. I use cf-deployment 1.17-.1.18 on OpenStack. The addons block doesn't work and I don't have metron_agent on any VMs (and so no logs).

bosh version

bosh env
Using environment 'https://10.50.0.8:25555' as client 'bosh'

Name      my-bosh
UUID      f9500cb1-3610-4f94-a9fb-798932d9fc72
Version   261.4.0 (00000000)
CPI       openstack_cpi
Features  compiled_package_cache: disabled
          config_server: disabled
          dns: disabled
          snapshots: disabled
User      bosh

Succeeded

and bosh cli

 bosh --version
version 2.0.45-d208799-2017-10-28T00:31:53

So I've made an ops file to add this on all VMs (except smoke-tests) and now all is OK :+1:

# fix metron_agent addons

- type: replace
  path: /instance_groups/name=consul/jobs/name=metron_agent?
  value: &metron_agent-add
    name: metron_agent
    release: loggregator
    properties:
      loggregator:
        tls:
          ca_cert: "((loggregator_ca.certificate))"
          metron:
            cert: "((loggregator_tls_metron.certificate))"
            key: "((loggregator_tls_metron.private_key))"

- type: replace
  path: /instance_groups/name=router/jobs/name=metron_agent?
  value: *metron_agent-add

- type: replace
  path: /instance_groups/name=api/jobs/name=metron_agent?
  value: *metron_agent-add
etc....

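The repeated `replace` entries in the ops file above follow one pattern per instance group. As a rough sketch, they could be generated in Python instead of being written by hand. Two assumptions: the list of group names is supplied by the operator (only the three shown above are used here), and `metron_ops` is a hypothetical helper name. The output is JSON, which is also valid YAML, so no third-party YAML library is needed. Checking the result with `bosh interpolate` before deploying would still be wise.

```python
import json

# Job definition copied from the ops file above.
METRON_JOB = {
    "name": "metron_agent",
    "release": "loggregator",
    "properties": {"loggregator": {"tls": {
        "ca_cert": "((loggregator_ca.certificate))",
        "metron": {
            "cert": "((loggregator_tls_metron.certificate))",
            "key": "((loggregator_tls_metron.private_key))",
        },
    }}},
}


def metron_ops(instance_groups):
    """Build one 'replace' op per instance group, mirroring the ops file above."""
    return [
        {
            "type": "replace",
            "path": f"/instance_groups/name={group}/jobs/name=metron_agent?",
            "value": METRON_JOB,
        }
        for group in instance_groups
    ]


# Hypothetical group list: extend it with the groups of your own deployment.
ops = metron_ops(["consul", "router", "api"])
print(json.dumps(ops, indent=2))
```

Unlike the hand-written file, the generated output repeats the job definition in every entry rather than using a YAML anchor. The result is the same once the file is parsed.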
dsabeti commented 6 years ago

@jyriok, you're using a BOSH director that has this bug in it. You need to upgrade to 264.1 or later.
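The version threshold given here can be compared against the `Version` line printed by `bosh env`. A rough Python sketch, assuming version strings shaped like the ones pasted in this thread; `supports_deployment_addons` is a hypothetical helper name, and the 264.1 minimum is taken from the comment above rather than from BOSH documentation:

```python
import re

# Minimum director version named in this thread for deployment-level addons.
MIN_DIRECTOR = (264, 1)


def parse_director_version(text: str) -> tuple:
    """Extract the leading numeric components from e.g. '261.4.0 (00000000)'."""
    match = re.match(r"\s*(\d+(?:\.\d+)*)", text)
    if match is None:
        raise ValueError(f"no version found in {text!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def supports_deployment_addons(version_text: str) -> bool:
    """True if the director version is at or above the thread's threshold."""
    return parse_director_version(version_text) >= MIN_DIRECTOR


print(supports_deployment_addons("261.4.0 (00000000)"))  # False
print(supports_deployment_addons("264.7.0"))             # True
```

Passing this check is not sufficient on its own: a director at 264.7 is also reported in this thread with missing logs.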

xiaozhu36 commented 6 years ago

Hi @dsabeti, I am using BOSH director 264.7 and there are also no logs.

cppforlife commented 6 years ago

@xiaozhu36 can you gist the output of bosh interpolate with the ops files applied? (Without creds, of course.)
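Once the interpolated manifest is available, one thing worth checking is whether the top-level `addons` block still contains metron_agent and which instance groups list the job explicitly. A minimal sketch, assuming the manifest has already been parsed into a Python dict (the YAML loading step is not shown); `metron_coverage` and the sample data are made up for illustration and are not real cf-deployment output:

```python
def metron_coverage(manifest: dict, job_name: str = "metron_agent"):
    """Return (addon_present, groups_without_explicit_job) for a parsed manifest."""
    # Is the job declared in any deployment-level addon?
    addon_present = any(
        job.get("name") == job_name
        for addon in manifest.get("addons") or []
        for job in addon.get("jobs") or []
    )
    # Which instance groups do not list the job themselves?
    missing = [
        group["name"]
        for group in manifest.get("instance_groups") or []
        if not any(j.get("name") == job_name for j in group.get("jobs") or [])
    ]
    return addon_present, missing


# Hypothetical, heavily trimmed manifest used only to exercise the function.
sample = {
    "addons": [
        {"name": "metron_agent",
         "jobs": [{"name": "metron_agent", "release": "loggregator"}]},
    ],
    "instance_groups": [
        {"name": "consul", "jobs": [{"name": "consul_agent"}, {"name": "metron_agent"}]},
        {"name": "router", "jobs": [{"name": "gorouter"}]},
    ],
}
print(metron_coverage(sample))  # (True, ['router'])
```

If the addon is present, groups reported as missing are expected to receive the job from the director at deploy time. If it is absent, the job has to be added some other way, such as the ops file or runtime config described earlier in the thread.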

dsabeti commented 6 years ago

Hi @xiaozhu36 @lordcf, any updates on this issue?

xiaozhu36 commented 6 years ago

Hi @lordcf @dsabeti, I have found the cause: you need to modify the Alibaba Cloud CPI manifest and add NTP settings for the China timezone. Please check it. Thanks a lot.

ashwithahn commented 6 years ago

Hi, I am using this BOSH version:

Name      bosh-bbl-env-reindeer-2018-07-23t00-50z
UUID      d55a654e-7ef2-4847-9990-370ca74e52c9
Version   265.2.0 (00000000)
CPI       aws_cpi
Features  compiled_package_cache: disabled
          config_server: enabled
          dns: disabled
          snapshots: disabled
User      admin

And the cf CLI version: cf version 6.36.1+e3799ad7e.2018-04-04

I am currently not getting any logs. When I run cf logs app_name -v to trace the request, I see the following error:

```
WEBSOCKET RESPONSE: [2018-07-25T04:12:17Z]
HTTP/1.1 400 Bad Request
Content-Type: text/plain; charset=utf-8
Date: Wed, 25 Jul 2018 04:12:17 GMT
Sec-Websocket-Version: 13
X-Content-Type-Options: nosniff
X-Vcap-Request-Id: 6733ad60-f50d-45b1-7a8d-774e7d480ac6
Content-Length: 12
Connection: keep-alive

WEBSOCKET ERROR: [2018-07-25T04:12:17Z]
Error dialing trafficcontroller server: websocket: bad handshake.
Please ask your Cloud Foundry Operator to check the platform configuration (trafficcontroller is wss://doppler.app-cloudfoundry.com:443).. Retrying...
```

Please help me resolve this. Thanks in advance.

ghost commented 6 years ago

@ashwithahn Is this issue resolved?

ashwithahn commented 6 years ago

@lordcf: Yes, it's resolved.

ghost commented 6 years ago

@ashwithahn We are getting the error below for doppler:

2018/08/20 11:08:03 Connecting to 1 dopplers
2018/08/20 11:10:17 Connecting to 1 dopplers
2018/08/20 11:11:38 error receiving from doppler via gRPC context canceled
2018/08/20 11:11:38 Error while reading from stream (10.1.18.106:8082): rpc error: code = Canceled desc = context canceled
2018/08/20 11:11:38 Disconnecting from stream (10.1.18.106:8082) (doppler.disconnect=false) (ctx.disconnect=1)
2018/08/20 11:11:38 error receiving from doppler via gRPC context canceled

CF push:

cf push DUMMYWEB
Updating app DUMMYWEB in org Smoke-test-Org / space Smoke-test-Space as admin...
OK

Uploading DUMMYWEB...
Uploading app files from: C:\Users\A663370\AppData\Local\Temp\unzipped-app232592067
Uploading 1.5K, 7 files
Done uploading
OK

Starting app DUMMYWEB in org Smoke-test-Org / space Smoke-test-Space as admin...

There are no logs here.

Error: Error restarting application: DUMMYWEB failed to stage within 15.000000 minutes

Command:

bosh2 -e bosh2-bosh -d aws-clients-acf-devtest-cf-bosh2 deploy cf-deployment.yml -v system_domain=sys.eutest.cfdev.canopy-cloud.com --vars-store=transition/deployment-vars.yml -o operations/scale-to-one-az.yml -o operations/aws.yml -o operations/use-external-dbs.yml -l operations/example-vars-files/vars-use-external-dbs-new.yml -o operations/legacy/keep-static-ips.yml -o transition/keep-etcd-for-transition.yml -o transition/remove-cf-networking-for-transition.yml -o operations/override-app-domains.yml -l operations/example-vars-files/vars-override-app-domains.yml -o operations/use-external-blobstore.yml -o operations/use-s3-blobstore.yml -l operations/example-vars-files/vars-use-s3-blobstore.yml -o operations/set-bbs-active-key.yml -o transition/cfr-to-cfd.yml

Q. How can aws.yml be deployed separately? Could you please share some details?

ashwithahn commented 6 years ago

@lordcf: I came across the same issue of "error receiving from doppler via gRPC context canceled". Could you install the Nozzle plugin and check the output of "cf nozzle --debug"?