vmware-archive / gcp-pcf-quickstart

Install Pivotal Cloud Foundry on Google Cloud Platform With One Command
Apache License 2.0

Update to PCF v2.3 #34

Closed: mattysweeps closed this issue 5 years ago

rkoster commented 5 years ago

The latest issue I'm facing with https://github.com/cf-platform-eng/gcp-pcf-quickstart/commits/up-sw-patches is related to the database instance.

In opsman:

 Error: 'control/2ad95507-4d29-4185-836f-ea928a888bb8 (0)' is not running after update.

On the control node (bosh ssh control):

monit summary
...
Process 'credhub'                   Does not exist
tail -f /var/vcap/sys/log/credhub/credhub.log
...
Could not connect to mysql.service.cf.internal:3306 : unexpected end of stream, read 0 bytes from 4 (socket was closed by server)

On the database node (bosh ssh database):

tail -f /var/vcap/sys/log/proxy/proxy.combined.log
...
{"timestamp":"1542017550.703610897","source":"/var/vcap/packages/proxy/bin/proxy","message":"/var/vcap/packages/proxy/bin/proxy.Healthcheck failed on backend","log_level":2,"data":{"backend":{"host":"10.0.4.21","port":13306,"status_port":9200,"healthy":false,"name":"backend-0","currentSessionCount":0},"endpoint":"http://10.0.4.21:9200/api/v1/status","err":{"Op":"Get","URL":"http://10.0.4.21:9200/api/v1/status","Err":{"Op":"dial","Net":"tcp","Source":null,"Addr":{"IP":"10.0.4.21","Port":9200,"Zone":""},"Err":{"Syscall":"getsockopt","Err":111}}},"error":"Error during healthcheck http get","resp":"(*http.Response)(nil)"}}
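For anyone reading that log entry, the useful parts are the backend, the status endpoint, and the nested errno. Below is a minimal Python sketch that pulls those fields out of one proxy log line. The function name and the small errno table are my own for illustration (not part of the proxy), and the table assumes Linux errno numbering, where 111 is ECONNREFUSED, i.e. nothing is listening on the status port.

```python
import json

# Assumption: Linux errno numbering. Only a few values relevant to
# connection failures are listed; anything else is reported as "unknown".
LINUX_ERRNO_NAMES = {
    110: "ETIMEDOUT",
    111: "ECONNREFUSED",
    113: "EHOSTUNREACH",
}


def summarize_healthcheck_failure(log_line):
    """Extract the backend, endpoint, and errno from one proxy log line.

    Hypothetical helper: it assumes the JSON layout seen in the entry
    above, where the errno sits at data.err.Err.Err.Err.
    """
    entry = json.loads(log_line)
    data = entry["data"]
    backend = data["backend"]
    errno_value = data["err"]["Err"]["Err"]["Err"]
    return {
        "backend": backend["name"],
        "endpoint": data["endpoint"],
        "healthy": backend["healthy"],
        "errno": errno_value,
        "errno_name": LINUX_ERRNO_NAMES.get(errno_value, "unknown"),
    }
```

Fed the line above, this reports backend-0 as unhealthy with errno 111, which matches the proxy being refused when it dials port 9200 on 10.0.4.21.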

Monit does not know about a mysql process:

monit summary
Process 'consul_agent'              running
Process 'metron_agent'              running
Process 'nats'                      running
Process 'route_registrar'           running
Process 'proxy'                     running
Process 'mysql-metrics'             running
Process 'mysql-diag-agent'          running
Process 'bosh-dns'                  running
Process 'bosh-dns-resolvconf'       running
Process 'bosh-dns-healthcheck'      running
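
The same check can be scripted. This is a rough Python sketch that parses monit summary output in the format shown above and lists expected processes monit has no entry for. The helper names are hypothetical, and it matches process names exactly, so mysql-metrics does not count as mysql.

```python
import re

# Matches lines such as: Process 'proxy'                     running
PROCESS_LINE = re.compile(r"^Process '([^']+)'\s+(.+?)\s*$")


def parse_monit_summary(text):
    """Return a dict mapping process name to the status monit printed."""
    processes = {}
    for line in text.splitlines():
        match = PROCESS_LINE.match(line.strip())
        if match:
            processes[match.group(1)] = match.group(2)
    return processes


def missing_processes(text, expected):
    """Return the expected process names that monit does not list at all."""
    known = parse_monit_summary(text)
    return [name for name in expected if name not in known]
```

Calling missing_processes with the summary above and the names proxy and mysql returns only mysql, since monit has no entry for it on the database node.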

Also, the monit file for the mysql job is empty:

cat /var/vcap/jobs/mysql/monit

Looking at the monit template, we find cf_mysql_enabled, which defaults to true.

Looking at the manifest staged in OpsMan, we find it's disabled:

bosh int <(om -k curl --path /api/v0/staged/products/cf-18c7358045c1d2058b00/manifest) --path /manifest/instance_groups/name=database/jobs/name=mysql/properties
...
cf_mysql_enabled: false
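
To make the path in that command easier to follow: segments like name=database select the list element whose name key has that value. Here is a small Python sketch of that style of lookup on an already-parsed manifest. It only illustrates the idea and is not how bosh implements it; the lookup function and the sample manifest used with it are hypothetical.

```python
def lookup(document, path):
    """Walk nested dicts and lists using a slash-separated path.

    A segment of the form key=value picks the single list item whose
    `key` equals `value`; a plain segment is a dict key or a list index.
    """
    node = document
    for segment in [part for part in path.split("/") if part]:
        if "=" in segment:
            key, value = segment.split("=", 1)
            matches = [item for item in node if item.get(key) == value]
            if len(matches) != 1:
                raise KeyError(f"expected exactly one item with {segment}")
            node = matches[0]
        elif isinstance(node, list):
            node = node[int(segment)]
        else:
            node = node[segment]
    return node
```

With a manifest shaped like the staged one, lookup(manifest, "/manifest/instance_groups/name=database/jobs/name=mysql/properties") would return the properties mapping that contains cf_mysql_enabled.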

rkoster commented 5 years ago

Today I tried to get credhub to use an external database. Because of the CA cert issue, I tried different ways of getting a custom CA cert for Google Cloud SQL. However, I'm getting a 403 from GCP when using Terraform like this: https://github.com/starkandwayne/gcp-pcf-quickstart/commit/f0a00756c249770e8988afed541933029c7d5615

Most recently, I tried creating a dedicated load balancer, but it's still a work in progress: https://github.com/starkandwayne/gcp-pcf-quickstart/commit/5a1fddddba50dab248e0a323c464eec1d3655f23

Just wanted to document what has been tried so far @mattysweeps ^

mattysweeps commented 5 years ago

@rkoster

Apologies for not responding last night. I discovered that the branch https://github.com/cf-platform-eng/gcp-pcf-quickstart/tree/up-sw-patches was able to deploy a large footprint without error. So only the small footprint is breaking.

Right now I'm investigating whether this is an issue with the small footprint itself or how we configure it.

I'll update when I know more.

mattysweeps commented 5 years ago

Closing this since #51 has been merged.