Closed ghost closed 6 years ago
We have created an issue in Pivotal Tracker to manage this:
https://www.pivotaltracker.com/story/show/155362073
The labels on this github issue will be updated when the story is started.
Hi @lordcf. While you're running `cf logs`, can you also make an HTTP request (perhaps with `curl`) against the app? That should force some logs through.
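A minimal sketch of the suggestion above, assuming the app is reachable over HTTP. The route `https://php-demo.example.com` is a placeholder, not taken from this thread; the real route can be read from `cf app php-demo`. The helper name `generate_traffic` is hypothetical.

```shell
# Placeholder route (assumption): replace with the app's real URL.
APP_URL="${APP_URL:-https://php-demo.example.com}"

# Send a few requests so the router and the app have something to log.
# Prints one HTTP status code per request.
generate_traffic() {
  url="$1"
  count="${2:-5}"
  i=0
  while [ "$i" -lt "$count" ]; do
    curl -s -o /dev/null -w '%{http_code}\n' "$url"
    i=$((i + 1))
  done
}

# Terminal 1: cf logs php-demo
# Terminal 2: generate_traffic "$APP_URL" 5
```

If log streaming is healthy, the `cf logs` terminal should show router and app lines shortly after the requests are sent.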
I'm also going to loop in @xiaozhu36, who helps to support cf-deployment on Alibaba Cloud.
Hi @dsabeti. I curled the application as well, but cf logs still displays nothing. The only message shown is:
"Retrieving logs for app php-demo in org system / space testing as testuser..."
Logs are not displayed while pushing an application either.
I have checked the loggregator_trafficcontroller logs. They show the following error:
2018/02/21 10:07:56 Connecting to 1 dopplers
2018/02/21 10:08:16 error receiving from doppler via gRPC context canceled
2018/02/21 10:08:16 Error while reading from stream (192.168.3.14:8082): rpc error: code = Canceled desc = context canceled
2018/02/21 10:08:16 Disconnecting from stream (192.168.3.14:8082) (doppler.disconnect=false) (ctx.disconnect=1)
@lordcf, ok. It looks like the traffic controller can't connect to the dopplers. Can you check that there aren't any firewall rules blocking networking traffic between the two IP addresses?
Let's also get help from the Loggregator team: cc @ahevenor @JohannaSmith
@lordcf Those are pretty standard logs, as the trafficcontroller continually reconnects to dopplers. Based on the time gap between the connection log and the gRPC context cancellation, I do believe trafficcontroller is discovering the doppler. I notice that it only connects to 1 doppler. How many doppler instances and trafficcontroller instances do you have? We would recommend at least 2 dopplers per AZ to increase log reliability, that is, the chance of a log getting through the system.
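The time gap mentioned above can be read off the trafficcontroller log mechanically. This is only a sketch: the helper name `doppler_gap_seconds` is hypothetical, and it assumes one log entry per line, the timestamp layout quoted in this thread, and both entries falling on the same day.

```shell
# Print the seconds between the first "Connecting to" entry and the
# first following "context canceled" entry. Reads files or stdin.
doppler_gap_seconds() {
  awk '
    # Convert HH:MM:SS to seconds since midnight.
    function secs(t,  p) { split(t, p, ":"); return p[1] * 3600 + p[2] * 60 + p[3] }
    /Connecting to/ && start == "" { start = secs($2) }
    /context canceled/ && start != "" { print secs($2) - start; exit }
  ' "$@"
}

# Usage (log file name is a placeholder):
#   doppler_gap_seconds trafficcontroller.stdout.log
```

For the entries posted earlier (10:07:56 and 10:08:16) this yields a gap of 20 seconds.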
@JohannaSmith: Thanks for the response. We found the issue. There is a metron_agent addon configured in https://github.com/cloudfoundry/cf-deployment/blob/master/cf-deployment.yml#L10, but when deploying CF it is not applied by default to all VMs (no metron_agent job is running on the VMs). We created a new file, metron_agent.yml, containing the configuration from cf-deployment. After that I ran "update-runtime-config" and then redeployed CF. Now the metron_agent job is running on each VM and we are getting the logs.
But we have a question: why was the metron_agent addon not applied on the first CF deployment?
@lordcf has the right question -- I'd expect the metron_agent addon to be added to each instance group. My guess is that you're using an older BOSH director that doesn't support deployment-level addons.
@lordcf, what version of the bosh director are you using? You can discover it by running `bosh env`.
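Alongside the director version, it may help to confirm directly whether a metron_agent process exists on the VMs. A sketch, assuming the BOSH CLI v2 and the environment alias `my-bosh` and deployment name `cf` used elsewhere in this thread; the helper name `metron_processes` is hypothetical, and the exact column layout of the output is not assumed.

```shell
# List the lines of `bosh instances --ps` that mention metron_agent.
# Empty output suggests the job is not colocated on any VM.
metron_processes() {
  bosh -e "$1" -d "$2" instances --ps | grep metron_agent
}

# Usage:
#   bosh -e my-bosh env            # director version
#   metron_processes my-bosh cf    # metron_agent processes, if any
```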
@dsabeti we're using Bosh Version 264.7.0
Could you please suggest which version of BOSH I should try?
264.7.0 looks like a sufficiently up-to-date version of the BOSH director that the deployment-level add-on should have worked. What version of the BOSH CLI did you use? Can you also show us your full deploy command?
@dsabeti Bosh CLI: version 2.0.48
Please see the deployment command:
bosh -e my-bosh -d cf deploy cf-deployment/cf-deployment.yml --vars-store cf-vars.yml -o cf-deployment/iaas-support/alicloud/stemcells.yml -v region=eu-central-1 -v system_domain=$SYSTEM_DOMAIN -v app_domains=$APPDOMAIN
This all makes it seem like it should work. The only thing that's unusual is that you're deploying to Alibaba cloud. Let's see if @xiaozhu36 can help out.
Hi @dsabeti, I have no idea, and I have also run into this issue.
Ok @xiaozhu36. I want to make sure I understand what issue you're facing. You also see that the metron agent isn't colocated on all VMs? If so, then I think as the point of contact for Alibaba cloud, you should reach out to the BOSH team (cc @cppforlife @dpb587) to address the issue.
Same for me here. I use cf-deployment 1.17-.1.18 on OpenStack. The addons block doesn't work and I don't have metron_agent on any VMs (and so no logs).
bosh version
bosh env
Using environment 'https://10.50.0.8:25555' as client 'bosh'
Name my-bosh
UUID f9500cb1-3610-4f94-a9fb-798932d9fc72
Version 261.4.0 (00000000)
CPI openstack_cpi
Features compiled_package_cache: disabled
config_server: disabled
dns: disabled
snapshots: disabled
User bosh
Succeeded
and bosh cli
bosh --version
version 2.0.45-d208799-2017-10-28T00:31:53
So I've made an ops file to add this to all VMs (except smoke-tests) and now all is OK :+1:
# fix metron_agent addons
- type: replace
  path: /instance_groups/name=consul/jobs/name=metron_agent?
  value: &metron_agent-add
    name: metron_agent
    release: loggregator
    properties:
      loggregator:
        tls:
          ca_cert: "((loggregator_ca.certificate))"
          metron:
            cert: "((loggregator_tls_metron.certificate))"
            key: "((loggregator_tls_metron.private_key))"
- type: replace
  path: /instance_groups/name=router/jobs/name=metron_agent?
  value: *metron_agent-add
- type: replace
  path: /instance_groups/name=api/jobs/name=metron_agent?
  value: *metron_agent-add
etc....
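Since the same entry repeats for every instance group, the ops file above could also be generated. This is a rough sketch rather than the poster's method: the function name `metron_ops` and the output file name are hypothetical, and it writes the full job definition for each group instead of using a YAML anchor. Only `consul`, `router` and `api` are named in this thread; any other group names have to come from your own manifest.

```shell
# Emit one "replace" operation per instance group named on the command line.
metron_ops() {
  for group in "$@"; do
    cat <<EOF
- type: replace
  path: /instance_groups/name=${group}/jobs/name=metron_agent?
  value:
    name: metron_agent
    release: loggregator
    properties:
      loggregator:
        tls:
          ca_cert: "((loggregator_ca.certificate))"
          metron:
            cert: "((loggregator_tls_metron.certificate))"
            key: "((loggregator_tls_metron.private_key))"
EOF
  done
}

# Usage (file name is a placeholder):
#   metron_ops consul router api > add-metron-agent.yml
#   then pass it to the deploy command with: -o add-metron-agent.yml
```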
@jyriok, you're using a BOSH director that has this bug in it. You need to upgrade to 264.1 or later.
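To compare a director version from `bosh env` against the 264.1 minimum mentioned above, a small check like the following may help. It is a sketch that assumes a `sort` supporting `-V` (GNU coreutils and recent BSDs); the function name `version_ge` is hypothetical.

```shell
# Succeed when version $1 is greater than or equal to version $2.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Usage, with director versions reported in this thread:
#   version_ge 261.4.0 264.1 || echo "director older than 264.1"
#   version_ge 264.7.0 264.1 && echo "director is 264.1 or later"
```

A passing check only rules out the old-director explanation; as later replies show, a director at 264.7 can still end up without logs for other reasons.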
Hi @dsabeti, I am using BOSH director 264.7 and there are also no logs.
@xiaozhu36 can you gist the output of bosh interpolate with ops files applied? (without creds of course).
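One way to produce that output without credentials is to run `bosh interpolate` with the same ops files but without `--vars-store`, so `((variable))` placeholders are left unresolved. The sketch below narrows the output to the `/addons` section, assuming the manifest has one; the wrapper name `show_addons` is hypothetical, and the manifest and ops-file paths are the ones from the deploy command quoted earlier.

```shell
# Print the /addons section of the manifest after applying any extra
# interpolate flags (for example -o ops files) given after the manifest.
show_addons() {
  manifest="$1"
  shift
  bosh interpolate "$manifest" "$@" --path /addons
}

# Usage:
#   show_addons cf-deployment/cf-deployment.yml \
#     -o cf-deployment/iaas-support/alicloud/stemcells.yml
```

Dropping `--path /addons` from the underlying command prints the whole interpolated manifest, which is what was asked for here.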
Hi @xiaozhu36 @lordcf, any updates on this issue?
Hi @lordcf @dsabeti, I have found the cause: you need to modify the Alibaba Cloud CPI manifest and add NTP for the China time zone. Please check it. Thanks a lot.
Hi, I am using bosh version
Name bosh-bbl-env-reindeer-2018-07-23t00-50z
UUID d55a654e-7ef2-4847-9990-370ca74e52c9
Version 265.2.0 (00000000)
CPI aws_cpi
Features compiled_package_cache: disabled
config_server: enabled
dns: disabled
snapshots: disabled
User admin
And cf version
cf version 6.36.1+e3799ad7e.2018-04-04
When I run cf logs app_name -v to trace why I am currently not getting logs, I see the following error:
```
WEBSOCKET RESPONSE: [2018-07-25T04:12:17Z]
HTTP/1.1 400 Bad Request
Content-Type: text/plain; charset=utf-8
Date: Wed, 25 Jul 2018 04:12:17 GMT
Sec-Websocket-Version: 13
X-Content-Type-Options: nosniff
X-Vcap-Request-Id: 6733ad60-f50d-45b1-7a8d-774e7d480ac6
Content-Length: 12
Connection: keep-alive
WEBSOCKET ERROR: [2018-07-25T04:12:17Z]
Error dialing trafficcontroller server: websocket: bad handshake.
Please ask your Cloud Foundry Operator to check the platform configuration (trafficcontroller is wss://doppler.app-cloudfoundry.com:443).. Retrying...
```
Please help me resolve this, Thanks in Advance
@ashwithahn Is this issue resolved?
@lordcf: Yes, it's resolved.
@ashwithahn We are getting the error below for doppler:
2018/08/20 11:08:03 Connecting to 1 dopplers
2018/08/20 11:10:17 Connecting to 1 dopplers
2018/08/20 11:11:38 error receiving from doppler via gRPC context canceled
2018/08/20 11:11:38 Error while reading from stream (10.1.18.106:8082): rpc error: code = Canceled desc = context canceled
2018/08/20 11:11:38 Disconnecting from stream (10.1.18.106:8082) (doppler.disconnect=false) (ctx.disconnect=1)
2018/08/20 11:11:38 error receiving from doppler via gRPC context canceled
CF Push-
cf push DUMMYWEB
Updating app DUMMYWEB in org Smoke-test-Org / space Smoke-test-Space as admin...
OK
Uploading DUMMYWEB...
Uploading app files from: C:\Users\A663370\AppData\Local\Temp\unzipped-app232592067
Uploading 1.5K, 7 files
Done uploading
OK
Starting app DUMMYWEB in org Smoke-test-Org / space Smoke-test-Space as admin...
There are no logs here.
Error-
Error restarting application: DUMMYWEB failed to stage within 15.000000 minutes
Command- bosh2 -e bosh2-bosh -d aws-clients-acf-devtest-cf-bosh2 deploy cf-deployment.yml -v system_domain=sys.eutest.cfdev.canopy-cloud.com --vars-store=transition/deployment-vars.yml -o operations/scale-to-one-az.yml -o operations/aws.yml -o operations/use-external-dbs.yml -l operations/example-vars-files/vars-use-external-dbs-new.yml -o operations/legacy/keep-static-ips.yml -o transition/keep-etcd-for-transition.yml -o transition/remove-cf-networking-for-transition.yml -o operations/override-app-domains.yml -l operations/example-vars-files/vars-override-app-domains.yml -o operations/use-external-blobstore.yml -o operations/use-s3-blobstore.yml -l operations/example-vars-files/vars-use-s3-blobstore.yml -o operations/set-bbs-active-key.yml -o transition/cfr-to-cfd.yml
Q. How can aws.yml be deployed separately? Could you please share some details?
@lordcf: I came across the same issue of "error receiving from doppler via gRPC context canceled". Could you install the Nozzle plugin and check the output of "cf nozzle --debug"?
Hi, we have deployed CF using the cf-deployment repo. All the VMs are in the running state, but the application logs are not displayed.
Tried to get logs using:
cf logs APP_NAME
cf logs APP_NAME --recent
cf-deployment manifest version: v1.12.0
cloud: Alibaba
I have attached a screenshot.