vmware-archive / pcfdev

This is the depricated version of PCF Dev - please visit the current Github repository https://github.com/cloudfoundry-incubator/cfdev for the latest updates
Apache License 2.0
272 stars 67 forks source link

Error: failed to provision VM: Process exited with: 1. Reason was: (). #191

Closed ksatyajitpatra closed 7 years ago

ksatyajitpatra commented 7 years ago

OS: Windows 7 Command line: Powershell I am getting an error "Error: failed to provision VM: Process exited with: 1. Reason was: ()." each time I try to start PCF DEV with SCS option.

C:\Users\satyajit.patra> cf dev start -s scs Using existing image. Allocating 8123 MB out of 16247 MB total system memory (9894 MB free). Importing VM... Starting VM... Provisioning VM... Waiting for services to start... 8 out of 53 running 8 out of 53 running 8 out of 53 running 53 out of 53 running ←[31;1mFAILED←[0m Error: failed to provision VM: Process exited with: 1. Reason was: ().

cf-gitbot commented 7 years ago

We have created an issue in Pivotal Tracker to manage this. Unfortunately, the Pivotal Tracker project is private so you may be unable to view the contents of the story.

The labels on this github issue will be updated when the story is started.

ksatyajitpatra commented 7 years ago

I started with "all" option and it worked. i.e. cf dev start -s all. If anyone has the reasoning behind it then please share. I feel as SCS needs rabbitmq and I have not specified that, it is failing. Please confirm.

sclevine commented 7 years ago

@mdelillo could this be a regression? RabbitMQ used to start with -s scs.

mdelillo commented 7 years ago

@ksatyajitpatra I just started PCF Dev v0.23.0 locally with SCS and it was successful (also stared rabbitmq). Do you still have the failing VM around? If so, can you look at the end of /var/pcfdev/provision.log to see if there is an error?

cf dev ssh
tail -n 100 /var/pcfdev/provision.log
ksatyajitpatra commented 7 years ago

I have destroyed the failing vm. If I ran into the same issue again then I will do this step. Will the path remain same on an window's machine?

mdelillo commented 7 years ago

Yes, that path is inside of the PCF Dev VM which is linux.

sclevine commented 7 years ago

Closing this for now. Please let us know if you experience this issue again.

ynojima commented 7 years ago

Hi,

I encountered same issue with my pcf-dev on Ubuntu 16.04 machine.

ynojima@feedback:~$ cf dev start
Using existing image.
Allocating 3072 MB out of 5967 MB total system memory (5812 MB free).
Importing VM...
Starting VM...
Provisioning VM...
Waiting for services to start...
8 out of 57 running
8 out of 57 running
8 out of 57 running
39 out of 57 running
55 out of 57 running
57 out of 57 running
FAILED
Error: failed to provision VM: Process exited with: 1. Reason was:  ().

Here is my provision.log.

vcap@agent-id-pcfdev-0:/var/pcfdev$ tail -n 20 /var/pcfdev/provision.log
Sun Feb  5 11:07:30 UTC 2017: Attempt 506 failed. Trying again...
Sun Feb  5 11:07:31 UTC 2017: Attempt 507 failed. Trying again...
Sun Feb  5 11:07:32 UTC 2017: Attempt 508 failed. Trying again...
Sun Feb  5 11:07:33 UTC 2017: Attempt 509 failed. Trying again...
Sun Feb  5 11:07:34 UTC 2017: Attempt 510 failed. Trying again...
Sun Feb  5 11:07:35 UTC 2017: Attempt 511 failed. Trying again...
Sun Feb  5 11:07:39 UTC 2017: Attempt 512 failed. Trying again...
Sun Feb  5 11:07:40 UTC 2017: Attempt 513 failed. Trying again...
Sun Feb  5 11:07:41 UTC 2017: Attempt 514 failed. Trying again...
Sun Feb  5 11:07:42 UTC 2017: Attempt 515 failed. Trying again...
Sun Feb  5 11:07:43 UTC 2017: Attempt 516 failed. Trying again...
Sun Feb  5 11:07:44 UTC 2017: Attempt 517 failed. Trying again...
Sun Feb  5 11:07:45 UTC 2017: Attempt 518 failed. Trying again...
Sun Feb  5 11:07:46 UTC 2017: Attempt 519 failed. Trying again...
Sun Feb  5 11:07:47 UTC 2017: Attempt 520 failed. Trying again...
Sun Feb  5 11:07:48 UTC 2017: Attempt 521 failed. Trying again...
Sun Feb  5 11:07:50 UTC 2017: Attempt 522 failed. Trying again...
Sun Feb  5 11:07:51 UTC 2017: Attempt 523 failed. Trying again...
Sun Feb  5 11:07:52 UTC 2017: Attempt 524 failed. Trying again...
Sun Feb  5 11:07:55 UTC 2017: Timed out pinging rep. Failing post-start.

But 'rep' seems running from monit status.

vcap@agent-id-pcfdev-0:/var/pcfdev$ sudo /var/vcap/bosh/bin/monit summary
The Monit daemon 5.2.5 uptime: 56m 

Process 'pcfdev-api'                running
Process 'pcfdev-nginx'              running
Process 'uaa'                       running
Process 'tps_listener'              running
Process 'tps_watcher'               running
Process 'tcp_emitter'               running
Process 'syslog_drain_binder'       running
Process 'syslog-configurator'       running
Process 'stager'                    running
Process 'ssh_proxy'                 running
Process 'routing-api'               running
Process 'route_registrar'           running
Process 'router_configurer'         running
Process 'route_emitter'             running
Process 'rep'                       running
Process 'rabbitmq-server'           running
Process 'rabbitmq-broker'           running
Process 'rabbitmq-broker-route-registrar' running
Process 'rabbitmq-management-route-registrar' running
Process 'nsync_listener'            running
Process 'nsync_bulker'              running
Process 'nats'                      running
Process 'mariadb_ctrl'              running
Process 'galera-healthcheck'        running
Process 'gra-log-purger-executable' running
Process 'cluster_health_logger'     running
Process 'metron_agent'              running
Process 'loggregator_trafficcontroller' running
Process 'localdriver'               running
Process 'localbroker'               running
Process 'haproxy'                   running
Process 'gorouter'                  running
Process 'garden'                    running
Process 'file_server'               running
Process 'etcd'                      running
Process 'etcd_consistency_checker'  running
Process 'doppler'                   running
Process 'consul_agent'              running
Process 'cloud_controller_worker_1' running
Process 'cloud_controller_ng'       running
Process 'cloud_controller_worker_local_1' running
Process 'cloud_controller_worker_local_2' running
Process 'nginx_cc'                  running
Process 'cloud_controller_clock'    running
Process 'process-watcher'           running
Process 'process-destroyer'         running
Process 'cf-redis-broker'           running
Process 'broker-nginx'              running
Process 'cf-mysql-broker'           running
Process 'cf-mysql-route-registrar'  running
Process 'quota-enforcer'            running
Process 'cc_uploader'               running
Process 'blobstore_nginx'           running
Process 'blobstore_url_signer'      running
Process 'bbs'                       running
Process 'auctioneer'                running
System 'system_localhost'           running

Which log should I check next?

sclevine commented 7 years ago
Allocating 3072 MB out of 5967 MB total system memory (5812 MB free).

Can you try with 4 GB (cf dev start -m 4096)?

ynojima commented 7 years ago

Thank you for comment. But I still have same issue with 4GB memory.

ynojima@feedback:~$ cf dev start -m 4096
Using existing image.
Allocating 4096 MB out of 5967 MB total system memory (5773 MB free).
Importing VM...
Starting VM...
Provisioning VM...
Waiting for services to start...
8 out of 57 running
8 out of 57 running
8 out of 57 running
34 out of 57 running
57 out of 57 running
FAILED
Error: failed to provision VM: Process exited with: 1. Reason was:  ().
vcap@agent-id-pcfdev-0:~$ tail -n 20 /var/pcfdev/provision.log 
Mon Feb  6 22:58:38 UTC 2017: Attempt 450 failed. Trying again...
Mon Feb  6 22:58:39 UTC 2017: Attempt 451 failed. Trying again...
Mon Feb  6 22:58:40 UTC 2017: Attempt 452 failed. Trying again...
Mon Feb  6 22:58:41 UTC 2017: Attempt 453 failed. Trying again...
Mon Feb  6 22:58:42 UTC 2017: Attempt 454 failed. Trying again...
Mon Feb  6 22:58:43 UTC 2017: Attempt 455 failed. Trying again...
Mon Feb  6 22:58:44 UTC 2017: Attempt 456 failed. Trying again...
Mon Feb  6 22:58:45 UTC 2017: Attempt 457 failed. Trying again...
Mon Feb  6 22:58:46 UTC 2017: Attempt 458 failed. Trying again...
Mon Feb  6 22:58:47 UTC 2017: Attempt 459 failed. Trying again...
Mon Feb  6 22:58:48 UTC 2017: Attempt 460 failed. Trying again...
Mon Feb  6 22:58:49 UTC 2017: Attempt 461 failed. Trying again...
Mon Feb  6 22:58:50 UTC 2017: Attempt 462 failed. Trying again...
Mon Feb  6 22:58:51 UTC 2017: Attempt 463 failed. Trying again...
Mon Feb  6 22:58:52 UTC 2017: Attempt 464 failed. Trying again...
Mon Feb  6 22:58:53 UTC 2017: Attempt 465 failed. Trying again...
Mon Feb  6 22:58:54 UTC 2017: Attempt 466 failed. Trying again...
Mon Feb  6 22:58:55 UTC 2017: Attempt 467 failed. Trying again...
Mon Feb  6 22:58:57 UTC 2017: Attempt 468 failed. Trying again...
Mon Feb  6 22:58:58 UTC 2017: Timed out pinging rep. Failing post-start.
ynojima commented 7 years ago

Could anyone reopen this issue?

sclevine commented 7 years ago

@mdelillo have you seen this before?

mdelillo commented 7 years ago

@ynojima @sclevine We encountered this issue awhile ago and solved it by increasing the timeout for pinging the rep. It seems like the timeout (10 minutes) isn't long enough in this case. We'll look into why the rep is taking so long to start, it really shouldn't need 10 minutes.

ynojima commented 7 years ago

I confirmed that rep eventually started by executing /var/vcap/jobs/rep/bin/post-start manually now. It output

Fri Feb 10 15:06:29 UTC 2017: Pinging rep...
Fri Feb 10 15:06:31 UTC 2017: Success!
mdelillo commented 7 years ago

I've only ever seen this happen on AWS, the rep is usually pingable in under a minute in virtualbox. What kind of machine are you running PCF Dev from? Is it a physical or virtual machine, how many cores?

ynojima commented 7 years ago

I found my too slow disk was the cause to prevent the rep from running in time. I run PCF-Dev on nested virtualization environment, and I changed to faster storage, the issue was resolved. Thank you for your help.

vlaskovic commented 7 years ago

Hi everybody,

I was experiencing this problem using: cf dev start -s scs; while cf dev start is OK with its default memory allocation - in my case 8GB.

I have manged to start PCF DEV by allocating 12GB of memory.

May someone have a look on this one please? 4GB of memory difference required in case -s scs is added on start?!

Additionally, looks like that this memory is only required on boot and not used anymore, yet reserved since can't be decreased afterwards...

Thank you. Slobodanka Vlaskovic

twoseat commented 7 years ago

I see what I believe is the same issue. I'm running PCF Dev version 0.24.0 (CLI: 4c30d1d, OVA: 0.447.0) using Virtualbox. Everything works fine using cf dev start -c 3, but if I use cf dev start -c 3 -s none startup works fine, but if I cf dev stop and then cf dev start again startup takes 10s of minutes or more (on my last run I had to stop it) before ending with the error mentioned by the OP. The logs requested above are just multiple repeats of the following:

vcap@agent-id-pcfdev-0:~$ tail -n 6 /var/pcfdev/provision.log
++ curl -s -o /dev/null -u admin:admin -w '%{http_code}' http://rabbitmq-broker.local.pcfdev.io/v2/catalog
+ [[ 404 != 200 ]]
+ sleep 1
++ curl -s -o /dev/null -u admin:admin -w '%{http_code}' http://rabbitmq-broker.local.pcfdev.io/v2/catalog
+ [[ 404 != 200 ]]
+ sleep 1
sclevine commented 7 years ago

We received a report that errors like this can be due to poor I/O performance on some systems that can be fixed by reconfiguring the Virtualbox VM like so:

vboxmanage storagectl pcfdev-v0.447.0 --name SATA --hostiocache on

We'll investigate configuring the VM like this in the next release.

sclevine commented 7 years ago

Oh, that said, @twoseat, your error seems like a bug. We shouldn't be waiting for RabbitMQ to start when you've specified no services.

twoseat commented 7 years ago

@sclevine feel free to switch mine to a separate issue if that works best for you, and thanks for investigating!

mdelillo commented 7 years ago

Our latest release (v0.26.0) has fixes for slow disks as well as @twoseat's issue of stopping and starting the VM with no services.

vin33sh commented 7 years ago

I am using cf version 0.26.0

C:\Users\vin>cf dev version PCF Dev version 0.26.0 (CLI: 1eef0f1, OVA: 0.461.0)

but I am still facing same/similar issue #191 by @ksatyajitpatra. C:\Users\vin>cf dev start -s scs Using existing image. Allocating 8139 MB out of 16278 MB total system memory (13715 MB free). Importing VM... Starting VM... Provisioning VM... Waiting for services to start... 7 out of 52 running 7 out of 52 running 7 out of 52 running 7 out of 52 running 28 out of 52 running 32 out of 52 running 52 out of 52 running FAILED Error: failed to provision VM: Process exited with: 1. Reason was: ().

vcap@agent-id-pcfdev-0:/var/pcfdev$ tail -30 provision.log ++ command cf bind-service spring-cloud-broker-worker spring-cloud-broker-rmq ++ cf bind-service spring-cloud-broker-worker spring-cloud-broker-rmq Binding service spring-cloud-broker-rmq to app spring-cloud-broker-worker in org system / space p-spring-cloud-services as admin... OK TIP: Use 'cf restage spring-cloud-broker-worker' to ensure your env variable changes take effect ++ cf start spring-cloud-broker ++ log_and_execute cf start spring-cloud-broker ++ echo '+ cf' start spring-cloud-broker

FAILED StagingError

TIP: use 'cf logs spring-cloud-broker --recent' for more information

vcap@agent-id-pcfdev-0:/var/pcfdev$

eleizel commented 7 years ago

Hi,

I'm also having this exact error. Actually, mi provision.log looks like the following: [...] ++ echo '+ cf' bind-service spring-cloud-broker-worker spring-cloud-broker-rmq

FAILED StagingError

TIP: use 'cf logs spring-cloud-broker --recent' for more information

I have started the virtualbox with the command cf dev start -s all, but also tried with -s scs, with the same results.

Any clues on how to resolve this?

Thanks in advance.

PD: This are the versions I'm working with: PCF Dev version 0.26.0 (CLI: 1eef0f1, OVA: 0.461.0)

sclevine commented 7 years ago

Can you provide the full output of cf dev start? Also, can you trying starting PCF Dev without SCS?

eleizel commented 7 years ago

Hi, sorry for the delay, but here is the full output for the cf dev start -s scs :

C:\Users\sergio.esperalta>cf dev --version PCF Dev version 0.26.0 (CLI: 1eef0f1, OVA: 0.461.0)

C:\Users\sergio.esperalta>cf dev start -s scs Downloading VM... Progress: |====================>| 100% VM downloaded. Allocating 8059 MB out of 16118 MB total system memory (11712 MB free). Importing VM... Starting VM... Provisioning VM... Waiting for services to start... 7 out of 52 running 7 out of 52 running 7 out of 52 running 52 out of 52 running ←[31;1mFAILED←[0m Error: failed to provision VM: Process exited with: 1. Reason was: ().

Also, here is the full output for the cf dev start without SCS:

C:\Users\sergio.esperalta>cf dev start Using existing image. Allocating 4096 MB out of 16118 MB total system memory (13404 MB free). Importing VM... Starting VM... Provisioning VM... Waiting for services to start... 7 out of 56 running 7 out of 56 running 7 out of 56 running 7 out of 56 running 44 out of 56 running 56 out of 56 running


| || || | | | | || | | | | _ || || | | || __|| || | | || || || | | | | || | | | | || _|| | | || || _|| | | | | | | | | || | | | |_| |____||| |__| |____| || is now running. To begin using PCF Dev, please run: cf login -a https://api.local.pcfdev.io --skip-ssl-validation Apps Manager URL: https://local.pcfdev.io Admin user => Email: admin / Password: admin Regular user => Email: user / Password: pass

It appears to boot correctly.

Any insights?

Best regars.

sclevine commented 7 years ago

Hi @eleizel,

Can you try starting with cf dev start -s scs -m 10000?

eleizel commented 7 years ago

Hi @sclevine, I tried what you asked for, but to no avail. Got the same result as for cf dev start -s scs.