Closed mar1ged closed 8 years ago
Hi @mar1ged,
Once the OVA is downloaded, PCF Dev should start very quickly (usually in less than 10 minutes) on hardware that is much less powerful than the hardware you describe.
Are you running `cf dev start` from within another VM?
Is there any chance the I/O performance of the disk that holds the PCF Dev vmdk file is severely degraded? We've seen these types of errors on very slow disks.
-Stephen
@sclevine: No, I run this on bare metal (Hardware mentioned as above, CentOS 7.2 x64)
I guess I/O performance should not be a problem:
/dev/sda:
Timing cached reads: 10342 MB in 2.00 seconds = 5175.38 MB/sec
Timing buffered disk reads: 1728 MB in 3.00 seconds = 575.07 MB/sec
Hi @mar1ged,
Can you provide the last 200 lines of `run.log`, with any sensitive info sanitized?
Thanks, Stephen
@sclevine The last hundreds of lines are identical to this:
++ available_buildpacks
++ cf curl /v2/buildpacks
++ jq '.resources | map(select(.entity.filename | length > 0)) | length'
+ [[ 7 -lt 8 ]]
+ sleep 1
++ available_buildpacks
++ cf curl /v2/buildpacks
++ jq '.resources | map(select(.entity.filename | length > 0)) | length'
+ [[ 7 -lt 8 ]]
+ sleep 1
++ available_buildpacks
++ cf curl /v2/buildpacks
++ jq '.resources | map(select(.entity.filename | length > 0)) | length'
+ [[ 7 -lt 8 ]]
+ sleep 1
++ available_buildpacks
++ cf curl /v2/buildpacks
++ jq '.resources | map(select(.entity.filename | length > 0)) | length'
The `available_buildpacks` block appears 1301 times in the log.
This means that a single buildpack (if I had to guess, the Java buildpack) cannot be registered with Cloud Controller. If you run `cf curl /v2/buildpacks`, can you provide the output here?
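For reference, the trace above looks like an unbounded polling loop: it compares a count against 8, sleeps one second, and repeats, so a buildpack that never registers keeps it spinning. Below is a minimal sketch of the same pattern with a timeout added. The helper name `wait_for_count` and the timeout are assumptions for illustration and are not taken from the PCF Dev scripts.

```shell
#!/usr/bin/env bash
# Sketch of a bounded polling loop. wait_for_count is a hypothetical helper,
# not part of PCF Dev.

# wait_for_count TARGET TIMEOUT_SECONDS COMMAND [ARGS...]
# Runs COMMAND once per second until it prints a number >= TARGET.
# Returns 0 on success, 1 if TIMEOUT_SECONDS elapse first.
wait_for_count() {
  local target=$1 timeout=$2
  shift 2
  local elapsed=0 count
  while true; do
    count=$("$@" 2>/dev/null) || true
    # Treat empty or non-numeric output as 0 so a failing command keeps polling.
    [[ $count =~ ^[0-9]+$ ]] || count=0
    if (( 10#$count >= target )); then
      return 0
    fi
    if (( elapsed >= timeout )); then
      echo "timed out after ${elapsed}s: got ${count}, wanted ${target}" >&2
      return 1
    fi
    sleep 1
    elapsed=$((elapsed + 1))
  done
}

# Possible use with the pipeline seen in the trace (requires cf and jq):
#   available_buildpacks() {
#     cf curl /v2/buildpacks |
#       jq '.resources | map(select(.entity.filename | length > 0)) | length'
#   }
#   wait_for_count 8 600 available_buildpacks
```

With a bound like this, the start command would fail with a message instead of appearing to hang.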
If you're positive that your VM isn't running out of space (check `df -h` from inside the VM), you could tar up the `/var/vcap/sys/log` directory and send it to me at slevine@pivotal.io. Be aware that those logs may contain sensitive information that you may want to remove, depending on how you use your PCF Dev VM.
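A minimal sketch of those two steps, run inside the VM. The helper name `archive_logs` and the output path in the usage comment are hypothetical.

```shell
#!/usr/bin/env bash
# archive_logs SRC_DIR OUT_FILE (hypothetical helper)
# Prints free space for the filesystem holding SRC_DIR, then writes a
# gzip-compressed tar archive of SRC_DIR to OUT_FILE.
archive_logs() {
  local src=$1 out=$2
  [[ -d $src ]] || { echo "no such directory: $src" >&2; return 1; }
  # Show free space on the filesystem that holds the logs.
  df -h "$src"
  # -C switches to the parent directory so the archive stores relative paths.
  tar -czf "$out" -C "$(dirname "$src")" "$(basename "$src")"
}

# Inside the VM this might be run as (may need root to read every log):
#   archive_logs /var/vcap/sys/log /tmp/pcfdev-logs.tgz
```

Review the archive for sensitive information before sending it.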
The output is (I ran this inside the VM, correct?):
No API endpoint set. Use 'cf login' or 'cf api' to target an endpoint.
FAILED
Error creating request:
Error performing request: Get /v2/buildpacks: unsupported protocol scheme ""
When I try to `cf login` from inside the VM I get:
vagrant@agent-id-pcfdev-0:~$ cf login
API endpoint> admin
a
Something unexpected happened. This is a bug in cf.
Please re-run the command that caused this exception with the environment
variable CF_TRACE set to true.
Also, please update to the latest cli and try the command again:
https://github.com/cloudfoundry/cli/releases
Please create an issue at: https://github.com/cloudfoundry/cli/issues
Is this the time when I send you the logs ? ;-)
Sure, let me know if you need more detailed instructions.
Can you try `cf curl /v2/buildpacks` as the `vcap` user instead of the `vagrant` user?
Unfortunately this fails:
vagrant@agent-id-pcfdev-0:~$ su vcap
Password:
vcap@agent-id-pcfdev-0:/home/vagrant$ cf curl /v2/buildpacks
FAILED
Error read/writing config: open /home/vcap/.cf/config.json: permission denied
Apologies, this is a known issue in 0.17.0 that's fixed in the latest internal RCs.
You can fix the `cf` CLI with: `$ chown -R vcap:vcap /home/vcap/.cf`
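To confirm that ownership is the cause of the `permission denied` error before changing anything, the numeric owner of the config file can be compared with the `vcap` user's UID. This is only a sketch, and the helper name `owner_uid` is hypothetical.

```shell
#!/usr/bin/env bash
# owner_uid PATH (hypothetical helper)
# Prints the numeric UID that owns PATH, taken from the third column of `ls -ldn`.
owner_uid() {
  ls -ldn "$1" | awk '{print $3}'
}

# Inside the VM, the two values below should match once ownership is fixed:
#   owner_uid /home/vcap/.cf/config.json
#   id -u vcap
```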
Thanks @sclevine, now I get proper JSON, and after a `cf dev stop` and `cf dev start` the whole process is faster (`real 11m49.627s`, not famous but ok ;-)), no longer stops, and the call returns to the command line properly.
Just out of curiosity I checked the run.log and saw this, please tell me if it is relevant and if I should open a new issue:
+ cf start apps-manager
Starting app apps-manager in org system / space system as admin...
Downloading staticfile_buildpack...
Downloaded staticfile_buildpack (2.5M)
Creating container
Successfully created container
Downloading app package...
Downloaded app package (1.3M)
Staging...
-------> Buildpack version 1.3.6
Downloaded [file:///tmp/buildpacks/dcb75c29aed8dc561d63627253ca3d82/dependencies/https___pivotal-buildpacks.s3.amazonaws.com_concourse-binaries_nginx_nginx-1.9.14-linux-x64.tgz]
grep: Staticfile: No such file or directory
-----> Using root folder
-----> Copying project files into public/
-----> Setting up nginx
grep: Staticfile: No such file or directory
grep: Staticfile: No such file or directory
Exit status 0
Staging complete
Uploading droplet, build artifacts cache...
Uploading build artifacts cache...
Uploading droplet...
Uploaded build artifacts cache (109B)
Uploaded droplet (3.9M)
Uploading complete
Could not fetch instance count: Server error, status code: 503, error code: 220002, message: Instances information unavailable: Net::ReadTimeout
Hi @mar1ged,
Are you unable to use Apps Manager after running `cf dev start` and encountering that error in the `run.log`?
Also, can you try out 0.18.0 and let me know if it improves these issues?
@sclevine I tried 0.18.0 today (after removing everything from the previous version and updating the cf CLI) and it again hangs after `50 out of 50 running` :-(
I can ssh into the box (which has now 192.168.22.11 btw) and it looks like we have the same issue with the buildpacks again:
+ sleep 1
++ available_buildpacks
++ cf curl /v2/buildpacks
++ jq '.resources | map(select(.entity.filename | length > 0)) | length'
Even after I ran `chown` and destroyed the VM (stop / start didn't work because it was defunct), the whole thing loops again.
@sclevine I gave the whole thing another try and purged everything from disk and downloaded the VM again.
Now `cf dev start` has been at `46 out of 50` for more than half an hour and this is what I can see from the logs (I interrupted this today after 10 hours):
++ [[ The Monit daemon 5.2.5 uptime: 40m
Process 'pcfdev-nginx' running
Process 'mariadb_ctrl' running
Process 'galera-healthcheck' running
Process 'gra-log-purger-executable' running
Process 'garden' running
Process 'etcd' running
Process 'auctioneer' running
Process 'cc_uploader' running
Process 'converger' running
Process 'file_server' running
Process 'nsync_listener' running
Process 'nsync_bulker' running
Process 'rep' running
Process 'route_emitter' running
Process 'ssh_proxy' running
Process 'stager' running
Process 'tps_listener' running
Process 'tps_watcher' running
Process 'bbs' running
Process 'syslog_drain_binder' running
Process 'consul_agent' not monitored
Process 'blobstore_nginx' running
Process 'blobstore_url_signer' running
Process 'metron_agent' running
Process 'nats' running
Process 'route_registrar' running
Process 'cloud_controller_clock' running
Process 'cloud_controller_ng' running
Process 'cloud_controller_worker_local_1' running
Process 'cloud_controller_worker_local_2' running
Process 'nginx_cc' running
Process 'cloud_controller_migration' running
Process 'cloud_controller_worker_1' running
Process 'uaa' running
Process 'doppler' running
Process 'gorouter' running
Process 'loggregator_trafficcontroller' Execution failed
Process 'syslog-configurator' running
Process 'process-watcher' running
Process 'cf-redis-broker' running
Process 'broker-nginx' running
Process 'cf-redis-route-registrar' running
Process 'cf-mysql-broker' Execution failed
Process 'cf-mysql-route-registrar' running
Process 'quota-enforcer' initializing
Process 'rabbitmq-broker' running
Process 'rabbitmq-broker-route-registrar' running
Process 'rabbitmq-management-route-registrar' running
Process 'rabbitmq-server' running
System 'system_localhost' running = *\e\r\r\o\r\ \c\o\n\n\e\c\t\i\n\g\ \t\o\ \t\h\e\ \m\o\n\i\t\ \d\a\e\m\o\n* ]]
++ echo 'The Monit daemon 5.2.5 uptime: 40m
Process '\''pcfdev-nginx'\'' running
Process '\''mariadb_ctrl'\'' running
Process '\''galera-healthcheck'\'' running
Process '\''gra-log-purger-executable'\'' running
Process '\''garden'\'' running
Process '\''etcd'\'' running
Process '\''auctioneer'\'' running
Process '\''cc_uploader'\'' running
Process '\''converger'\'' running
Process '\''file_server'\'' running
Process '\''nsync_listener'\'' running
Process '\''nsync_bulker'\'' running
Process '\''rep'\'' running
Process '\''route_emitter'\'' running
Process '\''ssh_proxy'\'' running
Process '\''stager'\'' running
Process '\''tps_listener'\'' running
Process '\''tps_watcher'\'' running
Process '\''bbs'\'' running
Process '\''syslog_drain_binder'\'' running
Process '\''consul_agent'\'' not monitored
Process '\''blobstore_nginx'\'' running
Process '\''blobstore_url_signer'\'' running
Process '\''metron_agent'\'' running
Process '\''nats'\'' running
Process '\''route_registrar'\'' running
Process '\''cloud_controller_clock'\'' running
Process '\''cloud_controller_ng'\'' running
Process '\''cloud_controller_worker_local_1'\'' running
Process '\''cloud_controller_worker_local_2'\'' running
Process '\''nginx_cc'\'' running
Process '\''cloud_controller_migration'\'' running
Process '\''cloud_controller_worker_1'\'' running
Process '\''uaa'\'' running
Process '\''doppler'\'' running
Process '\''gorouter'\'' running
Process '\''loggregator_trafficcontroller'\'' Execution failed
Process '\''syslog-configurator'\'' running
Process '\''process-watcher'\'' running
Process '\''cf-redis-broker'\'' running
Process '\''broker-nginx'\'' running
Process '\''cf-redis-route-registrar'\'' running
Process '\''cf-mysql-broker'\'' Execution failed
Process '\''cf-mysql-route-registrar'\'' running
Process '\''quota-enforcer'\'' initializing
Process '\''rabbitmq-broker'\'' running
Process '\''rabbitmq-broker-route-registrar'\'' running
Process '\''rabbitmq-management-route-registrar'\'' running
Process '\''rabbitmq-server'\'' running
System '\''system_localhost'\'' running'
+ started=46
+ [[ 46 -lt 50 ]]
+ counter=1893
+ [[ 33 = 0 ]]
+ sleep 1
vcap@agent-id-pcfdev-0:/var/pcfdev$
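The summary above is long, so the services that are holding the count below 50 are easy to miss. Here is a small sketch for filtering Monit summary text. The helper names are hypothetical, and the counting rule (every line ending in `running`, including the `System` line) is an assumption that happens to reproduce the `started=46` seen in the trace.

```shell
#!/usr/bin/env bash
# Both helpers read `monit summary`-style text on stdin. The names are
# hypothetical and not part of PCF Dev or Monit.

# count_running: prints how many lines end in the status "running".
count_running() {
  grep -c '[[:space:]]running[[:space:]]*$' || true
}

# list_not_running: prints the Process lines whose status is anything else,
# for example "not monitored", "initializing", or "Execution failed".
list_not_running() {
  grep '^Process ' | grep -v '[[:space:]]running[[:space:]]*$' || true
}

# Inside the VM this might be used as (monit may need root):
#   monit summary | list_not_running
```

For the output above, the second helper would list `consul_agent`, `loggregator_trafficcontroller`, `cf-mysql-broker`, and `quota-enforcer`.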
@sclevine I destroyed the VM and started it again, now the whole thing is running:
time cf dev start
Using existing image.
Allocating 4096 MB out of 48125 MB total system memory (37528 MB free).
Importing VM...
Starting VM...
Provisioning VM...
Waiting for services to start...
9 out of 50 running
9 out of 50 running
9 out of 50 running
47 out of 50 running
50 out of 50 running
_______ _______ _______ ______ _______ __ __
| || || | | | | || | | |
| _ || || ___| | _ || ___|| |_| |
| |_| || || |___ | | | || |___ | |
| ___|| _|| ___| | |_| || ___|| |
| | | |_ | | | || |___ | |
|___| |_______||___| |______| |_______| |___|
is now running.
To begin using PCF Dev, please run:
cf login -a https://api.local2.pcfdev.io --skip-ssl-validation
Admin user => Email: admin / Password: admin
Regular user => Email: user / Password: pass
real 27m28.979s
user 0m12.720s
sys 0m8.976s
Somehow this seems to be random: sometimes it fails, sometimes it succeeds. I was not able to find out what leads to which result. Unfortunately I can not log into the API:
cf login -a https://api.local2.pcfdev.io --skip-ssl-validation -v
API endpoint: https://api.local2.pcfdev.io
REQUEST: [2016-07-27T08:19:22+02:00]
GET /v2/info HTTP/1.1
Host: api.local2.pcfdev.io
Accept: application/json
Content-Type: application/json
User-Agent: go-cli 6.20.0+25b1961 / linux
(cf just sits and waits)
Closing this due to inactivity. Feel free to re-open if you're still having issues with the latest PCF Dev.
When I run PCF Dev it stays at 9 services for a long time and after 50 are started it does not proceed. The screen looks like this:
I am able to ssh into the box and `cf api` works, so perhaps this is just a problem of the start command not returning, which causes the missing "PCF Dev is now running". When I interrupt the cf command with ^C, both `ssh` and `cf api` still work.
I tried `cf dev destroy` already, which does not solve the problem. I even cleaned the `.pcfdev` directory (except for the `token` file) and downloaded everything again.
EDIT: after various retries and reboots it suddenly worked. But the startup still takes 28 minutes (on a dual socket quadcore Xeon with 48 GB RAM and 8*SAS RAID ...) and stays at "9 out of 50 ..." for a long time. Because of this I `cf dev destroy`ed the whole thing again and did a new `cf dev start`. After more than 40 minutes it is still at the stage `Waiting for services to start ...`, which is even worse than before ;-) While this message is on screen I can ssh into the box, but don't know what to check further ...
This is what I got from /var/pcfdev/run.log ...
... and ...
EDIT 2: I tried once again, now I have the same effect as at the beginning: the output stopped after `50 out of 50 running` and nothing else is displayed. I started this cf more than 5 hours ago.