cloudfoundry / cf-deployment

The canonical open source deployment manifest for Cloud Foundry
Apache License 2.0

dial tcp: i/o timeout error while pushing the apps #401

Closed Deekshit51 closed 6 years ago

Deekshit51 commented 6 years ago

Hi ,

I deployed cf-deployment on bosh-lite, but I get the error below when running the cf push command. Please help me resolve it.

cf version: 6.3
bosh version: 2+
OS: Ubuntu
Note: I am using a modem for my data connection.

/workspace/cf-helloworld$ sudo  cf push 
Pushing from manifest to org cloudfoundry / space development as admin...
Using manifest file /home/krishna/workspace/cf-helloworld/manifest.yml
Getting app info...
Updating app with these attributes...
  name:                cf-HelloWorld
  path:                /home/krishna/workspace/cf-helloworld
  buildpack:           python_buildpack
  disk quota:          1G
  health check type:   port
  instances:           1
  memory:              64M
  stack:               cflinuxfs2
  routes:
    cf-helloworld.bosh-lite.com

Updating app cf-HelloWorld...
Mapping routes...
Comparing local files to remote cache...
Packaging files to upload...
Uploading files...
 2.65 KiB / 2.65 KiB [==========================================================================================================================================================================] 100.00% 1s

Waiting for API to complete processing files...

Staging app and tracing logs...
   -----> Python Buildpack version 1.6.8
   -----> Supplying Python
   -----> Installing python 3.5.3
          Download [https://buildpacks.cloudfoundry.org/dependencies/python/python-3.5.3-linux-x64-9339f9ad.tgz]
          **ERROR** Could not install python: Get https://buildpacks.cloudfoundry.org/dependencies/python/python-3.5.3-linux-x64-9339f9ad.tgz: dial tcp: i/o timeout
   Failed to compile droplet: Failed to run all supply scripts: exit status 14
   Exit status 223
   Stopping instance ce6f7f93-d71d-441e-9529-bc43533200e7
   Destroying container
   Successfully destroyed container
Error staging application: App staging failed in the buildpack compile phase

screenshot from 2018-02-11 14-22-37

cf-gitbot commented 6 years ago

We have created an issue in Pivotal Tracker to manage this:

https://www.pivotaltracker.com/story/show/155109274

The labels on this github issue will be updated when the story is started.

Deekshit51 commented 6 years ago

Hi, adding some more information. Please help me with this as soon as you can.

I followed the installation process below to deploy Cloud Foundry using BOSH. Please let me know if I need to do anything else.

https://banck.net/2017/03/deploying-cloud-foundry-virtualbox-using-bosh-cli-v2/

Thanks in advance, Deekshit.

nimakaviani commented 6 years ago

Hi @Deekshit51

It appears that the cell doing the staging task timed out downloading python. Given that it is boshlite, I would assume that you only have one cell, right?

I would be curious to know if your boshlite can resolve domain names correctly. What happens if you ssh into the cell and try to dig buildpacks.cloudfoundry.org? We have seen cases where boshlite cells could not resolve domain names properly, which caused timeouts.

If domain names can be resolved, can you time downloading the given python package on the cell? Diego cells have a 10-minute idle timeout when downloading assets, and you might be hitting that timeout.
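The two checks above (does the name resolve, and how long does the download take) can be sketched in Python. This is only an illustration, assuming Python 3 is available wherever you run it; the function names `host_from_url`, `can_resolve`, and `timed_download` are hypothetical helpers, not part of any Cloud Foundry tooling. Resolvers work on hostnames rather than full URLs, so the sketch strips the scheme and path first.

```python
import socket
import time
import urllib.request
from urllib.parse import urlsplit


def host_from_url(url):
    """Return the hostname part of a URL (or the input if it has no scheme)."""
    return urlsplit(url).hostname or url


def can_resolve(host):
    """Return True if the system resolver can look up `host`."""
    try:
        socket.getaddrinfo(host, 443)
        return True
    except socket.gaierror:
        return False


def timed_download(url, timeout_s=60):
    """Download `url`, discarding the body; return (bytes_read, seconds)."""
    start = time.monotonic()
    size = 0
    with urllib.request.urlopen(url, timeout=timeout_s) as resp:
        # Read in chunks so a large archive is never held in memory at once.
        for chunk in iter(lambda: resp.read(64 * 1024), b""):
            size += len(chunk)
    return size, time.monotonic() - start
```

Calling `can_resolve(host_from_url(url))` with the URL from the staging log tells you whether DNS is the problem; if it returns True, `timed_download(url)` shows how long the fetch takes from that machine.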

Thanks, Nima

Deekshit51 commented 6 years ago

Hi @nimakaviani ,

I am trying to SSH into the cell, but I am getting the error below. Can you please assist with this?

screenshot from 2018-02-13 10-36-32

Note: Yes , I am using only one cell.

Thanks, Deekshit

dsabeti commented 6 years ago

Hey @Deekshit51. This is a pretty common issue when you're trying to ssh onto VMs.

At some point in the past, you ssh'ed onto a VM at the same IP address (192.168.50.6) and received its ssh fingerprint. The local ssh client saves that fingerprint to ~/.ssh/known_hosts, so that in the future, it can use that fingerprint to verify the identity of the server you're trying to ssh to.

If, however, you recreate the VM at that IP address (in this case, the jumpbox), the VM generates a new RSA key, with a new fingerprint. Your local ssh client notices that there is a different fingerprint and fails.

The fix is pretty simple: remove the original fingerprint from ~/.ssh/known_hosts.
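In practice `ssh-keygen -R 192.168.50.6` removes the stale entry in one command. To show what that removal involves, here is a minimal Python sketch. It assumes the standard known_hosts layout (host field first, either a comma-separated list or a hashed `|1|salt|digest` field using HMAC-SHA1); `entry_matches` and `drop_host` are hypothetical names, and lines with markers such as `@cert-authority` are deliberately left untouched.

```python
import base64
import hashlib
import hmac


def entry_matches(host_field, host):
    """Return True if a known_hosts host field refers to `host`."""
    if host_field.startswith("|1|"):
        # Hashed entry: |1|<base64 salt>|<base64 HMAC-SHA1(salt, host)>
        parts = host_field.split("|")
        if len(parts) != 4:
            return False
        salt = base64.b64decode(parts[2])
        expected = base64.b64decode(parts[3])
        actual = hmac.new(salt, host.encode(), hashlib.sha1).digest()
        return hmac.compare_digest(actual, expected)
    # Plain entry: comma-separated list of names and addresses.
    return host in host_field.split(",")


def drop_host(lines, host):
    """Return `lines` without the entries whose host field matches `host`."""
    kept = []
    for line in lines:
        fields = line.split()
        is_entry = bool(fields) and not fields[0].startswith(("#", "@"))
        if is_entry and entry_matches(fields[0], host):
            continue  # stale fingerprint for this host: drop it
        kept.append(line)
    return kept
```

Applied to the lines of `~/.ssh/known_hosts` with `host="192.168.50.6"`, this keeps every other entry and drops only the one for the recreated VM.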

Deekshit51 commented 6 years ago

Hi @dsabeti ,

Thanks for the detailed reply. I am now able to ssh to the VM.

But when I run "dig https://buildpacks.cloudfoundry.org", I get a connection timed out error.

Please check the screenshot below and kindly assist me.

screenshot from 2018-02-13 12-24-33

dsabeti commented 6 years ago

@Deekshit51, this looks like an environment-specific issue -- either to do with networking or DNS configuration. You could start by looking in /etc/resolv.conf -- that's where you configure your machine with the IP addresses of DNS servers. Make sure that you can ping those IP addresses. If you can't, you might need to reconfigure that. Another thing to consider is your networking config -- do you have any firewall rules that prevent traffic to public networks?
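The resolv.conf check described above can be sketched in Python. This is an illustration only, assuming Python 3 and a Linux `ping` that accepts `-c` (count) and `-W` (timeout in seconds); `parse_nameservers` and `is_reachable` are hypothetical helper names.

```python
import subprocess


def parse_nameservers(resolv_conf_text):
    """Return the IP addresses on `nameserver` lines, in file order."""
    servers = []
    for line in resolv_conf_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped[0] in "#;":
            continue  # blank line or comment
        fields = stripped.split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers


def is_reachable(ip, timeout_s=2):
    """Return True if a single ping to `ip` gets a reply."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), ip],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0
```

Reading `/etc/resolv.conf`, passing its text to `parse_nameservers`, and calling `is_reachable` on each address shows which configured DNS servers the machine can actually reach. A server that answers ping may still refuse DNS queries, so this is a first check rather than proof.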

Deekshit51 commented 6 years ago

Hi @dsabeti ,

The content of /etc/resolv.conf is below:

# Dynamic resolv.conf(5) file for glibc resolver(3) generated by resolvconf(8)
nameserver 127.0.1.1

Let me know if anything needs to be added...

And there are no firewall restrictions on my machine.

Thanks, Deekshit

dsabeti commented 6 years ago

Hi @Deekshit51, it looks like you'll need to debug your DNS configuration. 127.0.1.1 is just your loopback device, which won't resolve DNS entries as far as I know.

When I looked at the resolv.conf on my bosh-lite (deployed to GCP), it contained the following:

nameserver 127.0.0.1
nameserver 8.8.8.8
nameserver 169.254.169.254

The second entry, 8.8.8.8, is Google's DNS.

What steps did you take to create your bosh-lite?

Deekshit51 commented 6 years ago

Hi @dsabeti ,

I followed the steps in the link below. Please tell me if I missed anything, as I need to deploy Cloud Foundry as soon as possible.

https://banck.net/2017/03/deploying-cloud-foundry-virtualbox-using-bosh-cli-v2/

Thanks in advance, Deekshit

dsabeti commented 6 years ago

Hi @Deekshit51. I'm pretty sure that your VirtualBox host inherits its resolv.conf from your local machine, and the bosh-lite containers inherit theirs from the VirtualBox host. Try updating the resolv.conf on your VirtualBox host to include

nameserver 8.8.8.8

and recreating your deployment. If that works, then I think you should also update your machine's /etc/resolv.conf, so that if you ever recreate the VirtualBox host, you'll get the proper DNS configuration.
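A small Python sketch of that edit, written so it can be run repeatedly without duplicating the line. `ensure_nameserver` is a hypothetical helper; it appends the entry, and since the resolver tries servers in the order listed, you may prefer to place it first. Where resolvconf generates the file, a manual edit may be overwritten on the next network change, so treat this as a quick test rather than a permanent fix.

```python
def ensure_nameserver(resolv_conf_text, ip="8.8.8.8"):
    """Return resolv.conf text that contains `nameserver <ip>` exactly once."""
    lines = resolv_conf_text.splitlines()
    for line in lines:
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "nameserver" and fields[1] == ip:
            return resolv_conf_text  # already present: leave the text unchanged
    lines.append("nameserver " + ip)
    return "\n".join(lines) + "\n"
```

Writing the returned text back to `/etc/resolv.conf` needs root privileges.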

Deekshit51 commented 6 years ago

Hi,

After SSHing into the VM and setting the permissions below on ping, I can reach the network from the VM, but I still cannot push apps. What is the solution?

chmod 4755 /bin/ping
ls -ltr /bin/ping

Error while deploying the app:

Pushing from manifest to org cloudfoundry / space development as admin...
Using manifest file /home/zeranraza/workspace/cf-helloworld/manifest.yml
Getting app info...
Updating app with these attributes...
  name:                cf-elloWorld
  path:                /home/zeranraza/workspace/cf-helloworld
  buildpack:           python_buildpack
  disk quota:          1G
  health check type:   port
  instances:           1
  memory:              64M
  stack:               cflinuxfs2
  routes:
    cf-elloworld.bosh-lite.com

Updating app cf-elloWorld...
Mapping routes...
Comparing local files to remote cache...
Packaging files to upload...
Uploading files...
 2.65 KiB / 2.65 KiB [==========================================================================================================================================================================] 100.00% 1s

Waiting for API to complete processing files...

Staging app and tracing logs...
   Downloading python_buildpack...
   Downloaded python_buildpack
   Creating container
   Successfully created container
   Downloading app package...
   Downloaded app package (2.6K)
   -----> Python Buildpack version 1.6.9
   -----> Supplying Python
   -----> Installing python 2.7.13
          Download [https://buildpacks.cloudfoundry.org/dependencies/python/python-2.7.13-linux-x64-c2433d9a.tgz]
          ERROR Could not install python: Get https://buildpacks.cloudfoundry.org/dependencies/python/python-2.7.13-linux-x64-c2433d9a.tgz: dial tcp: lookup buildpacks.cloudfoundry.org on 10.0.2.1:53: read udp 10.255.205.123:36131->10.0.2.1:53: i/o timeout
   Failed to compile droplet: Failed to run all supply scripts: exit status 14
   Exit status 223
   Stopping instance 7aab44fb-e3bb-4ba0-8e3f-ce480cc76fba
   Destroying container
Error staging application: App staging failed in the buildpack compile phase
FAILED
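The error in this log names the resolver that did not answer: the lookup went to 10.0.2.1 on port 53 over UDP and timed out. A minimal Python sketch for testing one specific DNS server directly, assuming Python 3 is available inside the environment you test from; `build_query` and `server_answers` are hypothetical names, and the packet is a plain A-record query in the standard DNS wire format.

```python
import socket
import struct


def build_query(hostname, query_id=0x1234):
    """Build a DNS query packet asking for the A record of `hostname`."""
    # Header: ID, flags (recursion desired), 1 question, no other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE=1 (A), QCLASS=1 (IN)
    return header + qname + struct.pack("!HH", 1, 1)


def server_answers(server_ip, hostname, timeout_s=3.0):
    """Return True if `server_ip` replies to a query for `hostname` in time."""
    query = build_query(hostname)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout_s)
        try:
            sock.sendto(query, (server_ip, 53))
            reply, _ = sock.recvfrom(512)
        except OSError:  # includes timeouts and unreachable networks
            return False
    # A genuine reply echoes the query ID in its first two bytes.
    return reply[:2] == query[:2]
```

Comparing `server_answers("10.0.2.1", "buildpacks.cloudfoundry.org")` with the same call against `8.8.8.8` indicates whether the configured resolver or outbound connectivity in general is at fault.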

dsabeti commented 6 years ago

@Deekshit51, it looks like you're still experiencing the same issue. Did you try to update your resolv.conf?

Deekshit51 commented 6 years ago

Hi, yes, I have updated that file, but I still could not resolve the issue.

I need your help, @dsabeti. I have been trying this for a long time.

Thanks in advance, Deekshit

dsabeti commented 6 years ago

Ok, if you updated the resolv.conf on your computer, I'd expect the resolv.conf in your VirtualBox host to also include those changes. What are the contents of /etc/resolv.conf in your VirtualBox host? In your Diego cell?

Honestly, you may have better luck trying to deploy to a proper IaaS. If you want to try that route, you can follow this doc: https://github.com/cloudfoundry/cf-deployment/blob/master/iaas-support/bosh-lite/README.md

Deekshit51 commented 6 years ago

Hi @dsabeti ,

This issue is now resolved. We reset the network settings completely and deployed the application again using VirtualBox 5.2, and it is working fine.

Thank you for your continued support. I will let you know if we face any further issues.

Sorry for any inconvenience.

Regards, Deekshit.