vmware-archive / pcf-pipelines

PCF Pipelines
Apache License 2.0

Implement ability to specify ops manager vm type and volume size #133

Closed xyloman closed 7 years ago

xyloman commented 7 years ago

Implement the ability to specify the Ops Manager VM instance type and volume size. I recently ran into an issue when upgrading from 1.11.0 to 1.11.3. Here is the trace from the job running in Concourse:

could not execute "import-installation": failed to import installation: request failed: unexpected response:
HTTP/1.1 422 Unprocessable Entity
Transfer-Encoding: chunked
Cache-Control: no-cache
Connection: keep-alive
Content-Type: application/json; charset=utf-8
Date: Sat, 24 Jun 2017 13:55:44 GMT
Server: nginx/1.4.6 (Ubuntu)
Strict-Transport-Security: max-age=31536000
X-Content-Type-Options: nosniff
X-Frame-Options: SAMEORIGIN
X-Request-Id: a5fbd43d-9ab0-4350-b55d-719751a051e6
X-Runtime: 541.369502
X-Xss-Protection: 1; mode=block
1c6a
{"errors":{"installation_asset_collection":["Zip file is not valid: Archive:  /tmp/RackMultipart20170624-1377-15jdd2t.zip\n   creating: /tmp/ops_manager/d20170624-1377-3b34w/metadata/\n  inflating: /tmp/ops_manager/d20170624-1377-3b34w/metadata/p-redis-1.8.2.yml  \n  inflating: /tmp/ops_manager/d20170624-1377-3b34w/metadata/Pivotal_Single_Sign-On_Service-1.4.2.yml  \n  inflating: /tmp/ops_manager/d20170624-1377-3b34w/metadata/p-rabbitmq-1.8.9.yml  \n  inflating: /tmp/ops_manager/d20170624-1377-3b34w/metadata/p-spring-cloud-services-1.4.0.yml  \n  inflating: /tmp/ops_manager/d20170624-1377-3b34w/metadata/newrelic-broker-1.9.1.yml  \n  inflating: /tmp/ops_manager/d20170624-1377-3b34w/metadata/p-metrics-1.9.1.yml  \n  inflating: /tmp/ops_manager/d20170624-1377-3b34w/metadata/p-mysql-1.9.5.yml  \n  inflating: /tmp/ops_manager/d20170624-1377-3b34w/metadata/apm-1.3.7.yml  \n  inflating: /tmp/ops_manager/d20170624-1377-3b34w/metadata/cf-1.11.1.yml  \n  inflating: /tmp/ops_manager/d20170624-1377-3b34w/installation.yml  \n  inflating: /tmp/ops_manager/d20170624-1377-3b34w/rails_database_dump.postgres  \n   creating: /tmp/ops_manager/d20170624-1377-3b34w/releases/\n  inflating: /tmp/ops_manager/d20170624-1377-3b34w/releases/cf-backup-and-restore-0.0.3-3421.9.0.tgz  \n  inflating: /tmp/ops_manager/d20170624-1377-3b34w/releases/postgres-17.0.0-3421.9.0.tgz  \n  inflating: /tmp/ops_manager/d20170624-1377-3b34w/releases/consul-release-145.tgz  \n  inflating: /tmp/ops_manager/d20170624-1377-3b34w/releases/nats-16.0.0-3421.9.0.tgz  \n  inflating: /tmp/ops_manager/d20170624-1377-3b34w/releases/java-offline-buildpack-3.17.0-3421.9.0.tgz  \n  inflating: /tmp/ops_manager/d20170624-1377-3b34w/releases/go-offline-buildpack-1.8.4-3421.9.0.tgz  \n  inflating: /tmp/ops_manager/d20170624-1377-3b34w/releases/loggregator-65.tgz  \n  inflating: /tmp/ops_manager/d20170624-1377-3b34w/releases/uaa-41.0.0-3421.9.0.tgz  \n  inflating: 
/tmp/ops_manager/d20170624-1377-3b34w/releases/statsd-injector-1.0.28-3421.9.0.tgz  \n  inflating: /tmp/ops_manager/d20170624-1377-3b34w/releases/loggregator-70.tgz  \n  inflating: /tmp/ops_manager/d20170624-1377-3b34w/releases/redis-service-adapter-0.5.0.tgz  \n  inflating: /tmp/ops_manager/d20170624-1377-3b34w/releases/nodejs-offline-buildpack-1.5.36-3421.9.0.tgz  \n  inflating: /tmp/ops_manager/d20170624-1377-3b34w/releases/diego-1.18.1-3421.9.0.tgz  \n  inflating: /tmp/ops_manager/d20170624-1377-3b34w/releases/cf-redis-429.3.0.tgz  \n  inflating: /tmp/ops_manager/d20170624-1377-3b34w/releases/cf-mysql-34.9.0.tgz  \n  inflating: /tmp/ops_manager/d20170624-1377-3b34w/releases/rabbitmq-on-demand-adapter-0.1.10.tgz  \n  inflating: /tmp/ops_manager/d20170624-1377-3b34w/releases/service-metrics-1.5.6.tgz  \n  inflating: /tmp/ops_manager/d20170624-1377-3b34w/releases/consul-163.tgz  \n  inflating: /tmp/ops_manager/d20170624-1377-3b34w/releases/loggregator-89.0.0-3421.9.0.tgz  \n  inflating: /tmp/ops_manager/d20170624-1377-3b34w/releases/on-demand-service-broker-0.15.2.tgz  \n  inflating: /tmp/ops_manager/d20170624-1377-3b34w/releases/mysql-monitoring-8.1.4-3421.9.0.tgz  \n  inflating: /tmp/ops_manager/d20170624-1377-3b34w/releases/python-offline-buildpack-1.5.19-3421.9.0.tgz  \n  inflating: /tmp/ops_manager/d20170624-1377-3b34w/releases/dotnet-core-offline-buildpack-1.0.19-3421.9.0.tgz  \n  inflating: /tmp/ops_manager/d20170624-1377-3b34w/releases/cf-networking-0.25.0-3421.9.0.tgz  \n  inflating: /tmp/ops_manager/d20170624-1377-3b34w/releases/nfs-volume-1.0.3-3421.9.0.tgz  \n  inflating: /tmp/ops_manager/d20170624-1377-3b34w/releases/cf-smoke-tests-21.0.0-3421.9.0.tgz  \n  inflating: /tmp/ops_manager/d20170624-1377-3b34w/releases/binary-offline-buildpack-1.0.13-3421.9.0.tgz  \n  inflating: /tmp/ops_manager/d20170624-1377-3b34w/releases/service-metrics-1.5.5.tgz  \n  inflating: /tmp/ops_manager/d20170624-1377-3b34w/releases/service-backup-18.1.0.tgz  \n extracting: 
/tmp/ops_manager/d20170624-1377-3b34w/releases/syslog-configurator-0.3.0.tgz  \n  inflating: /tmp/ops_manager/d20170624-1377-3b34w/releases/mysql-backup-1.33.0-3421.9.0.tgz  \n  inflating: /tmp/ops_manager/d20170624-1377-3b34w/releases/rabbitmq-metrics-1.62.0-rc.1.tgz  \n  inflating: /tmp/ops_manager/d20170624-1377-3b34w/releases/redis-backups-0.16.0.tgz  \n  inflating: /tmp/ops_manager/d20170624-1377-3b34w/releases/capi-1.28.8-3421.9.0.tgz  \n  inflating: /tmp/ops_manager/d20170624-1377-3b34w/releases/cf-mysql-32.9.0.tgz  \n  inflating: /tmp/ops_manager/d20170624-1377-3b34w/releases/mysql-monitoring-8.2.0.tgz  \n  inflating: /tmp/ops_manager/d20170624-1377-3b34w/releases/notifications-ui-28.0.0-3421.9.0.tgz  \n  inflating: /tmp/ops_manager/d20170624-1377-3b34w/releases/cf-259.0.6-3421.9.0.tgz  \n  inflating: /tmp/ops_manager/d20170624-1377-3b34w/releases/routing-0.143.0.tgz  \n  inflating: /tmp/ops_manager/d20170624-1377-3b34w/releases/mysql-backup-1.34.0.tgz  \n  inflating: /tmp/ops_manager/d20170624-1377-3b34w/releases/cf-mysql-35.0.2-3421.9.0.tgz  \n  inflating: /tmp/ops_manager/d20170624-1377-3b34w/releases/notifications-36.0.0-3421.9.0.tgz  \n  inflating: /tmp/ops_manager/d20170624-1377-3b34w/releases/etcd-104.0.0-3421.9.0.tgz  \n  inflating: /tmp/ops_manager/d20170624-1377-3b34w/releases/cflinuxfs2-1.126.0-3421.9.0.tgz  \n  inflating: /tmp/ops_manager/d20170624-1377-3b34w/releases/mysql-monitoring-6.tgz  \n  inflating: /tmp/ops_manager/d20170624-1377-3b34w/releases/redis-service-0.5.0.tgz  \n  inflating: /tmp/ops_manager/d20170624-1377-3b34w/releases/pcf-metrics-app-dev-release-1.3.7.tgz  \n  inflating: /tmp/ops_manager/d20170624-1377-3b34w/releases/pivotal-account-1.6.0-3421.9.0.tgz  \n  inflating: /tmp/ops_manager/d20170624-1377-3b34w/releases/staticfile-offline-buildpack-1.4.6-3421.9.0.tgz  \n  inflating: /tmp/ops_manager/d20170624-1377-3b34w/releases/syslog-migration-4.0.0-3421.9.0.tgz  \n  inflating: 
/tmp/ops_manager/d20170624-1377-3b34w/releases/scalablesyslog-4.0.0-3421.9.0.tgz  \n  inflating: /tmp/ops_manager/d20170624-1377-3b34w/releases/php-offline-buildpack-4.3.34-3421.9.0.tgz  \n  inflating: /tmp/ops_manager/d20170624-1377-3b34w/releases/identity_service_broker-71.tgz  \n  inflating: /tmp/ops_manager/d20170624-1377-3b34w/releases/spring-cloud-broker-1.4.0-build.91.tgz  \n  inflating: /tmp/ops_manager/d20170624-1377-3b34w/releases/garden-runc-1.7.0-3421.9.0.tgz  \n  inflating: /tmp/ops_manager/d20170624-1377-3b34w/releases/logsearch-boshrelease-204.0.0.tgz  \n  inflating: /tmp/ops_manager/d20170624-1377-3b34w/releases/consul-167.0.0-3421.9.0.tgz  \n  inflating: /tmp/ops_manager/d20170624-1377-3b34w/releases/cf-rabbitmq-226.8.0.tgz  \n  inflating: /tmp/ops_manager/d20170624-1377-3b34w/releases/service-backup-18.0.4.tgz  \n  inflating: /tmp/ops_manager/d20170624-1377-3b34w/releases/redis-metrics-3.7.0.tgz  \n  inflating: /tmp/ops_manager/d20170624-1377-3b34w/releases/cf-routing-release-0.143.0.tgz  \n  inflating: /tmp/ops_manager/d20170624-1377-3b34w/releases/ruby-offline-buildpack-1.6.40-3421.9.0.tgz  \n/tmp/ops_manager/d20170624-1377-3b34w/releases/ruby-offline-buildpack-1.6.40-3421.9.0.tgz:  write error (disk full?).  Continue? (y/n/^C) \nwarning:  /tmp/ops_manager/d20170624-1377-3b34w/releases/ruby-offline-buildpack-1.6.40-3421.9.0.tgz is probably truncated\n"]}}

The VM in AWS that I am upgrading from is type m3.large with a 100 GB volume attached. The VM created by the Ops Manager upgrade job is m3.large with a 50 GB volume attached, which I believe is the cause of the error above. Having the ability to declare the instance type and volume size would help fix this issue. We are on pcf-pipelines v0.15.0, ERT 1.11.1, and OpsMgr 1.11.0.

cf-gitbot commented 7 years ago

We have created an issue in Pivotal Tracker to manage this. Unfortunately, the Pivotal Tracker project is private so you may be unable to view the contents of the story.

The labels on this github issue will be updated when the story is started.

gondoi commented 7 years ago

We are seeing the same issue. We are unable to use this pipeline at all because the export is ~12G and the import fills up the default 50G volume. We confirmed this is the actual issue by connecting to the instance and watching disk usage while the import was running.
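For anyone who wants to check for this before kicking off an import, here is a minimal sketch of a pre-flight disk check. It is not part of pcf-pipelines or Ops Manager. The function names and the `headroom_factor` multiplier are made up for illustration, and the multiplier is a guess, not a measured value.

```python
import shutil


def free_bytes(path="/tmp"):
    """Return the free space, in bytes, on the filesystem that holds `path`."""
    return shutil.disk_usage(path).free


def has_room_for_import(archive_bytes, available_bytes, headroom_factor=3.0):
    """Guess whether an archive of `archive_bytes` can be unpacked in `available_bytes`.

    headroom_factor is a hypothetical safety multiplier (the uploaded zip and
    its unpacked copy sit on disk at the same time). It is an assumption,
    not a number taken from Ops Manager.
    """
    if archive_bytes < 0 or available_bytes < 0:
        raise ValueError("sizes must be non-negative")
    return archive_bytes * headroom_factor <= available_bytes


if __name__ == "__main__":
    GIB = 1024 ** 3
    export_size = 12 * GIB  # roughly the export size reported in this thread
    print(has_room_for_import(export_size, free_bytes("/tmp")))
```

Run on the Ops Manager VM itself, this only says whether the filesystem behind `/tmp` (where the trace above shows the archive being unpacked) looks big enough under the assumed multiplier.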

krishicks commented 7 years ago

The bug here is that we didn't create an equivalent VM/disk with the new AMI.

The point of the upgrade pipeline is to basically duplicate the settings of the VM but with a new AMI. I don't think we should try to make it configurable.
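To make "duplicate the settings of the VM but with a new AMI" concrete, here is a rough sketch of the idea. It is not how cliaas implements replace-vm. The input dictionaries are assumed to be shaped like EC2 DescribeInstances and DescribeVolumes results, the output is shaped like the arguments to EC2 RunInstances, and the function name and the IDs in the example are placeholders.

```python
def replacement_launch_params(old_instance, old_volumes, new_ami_id):
    """Build RunInstances-style parameters that mirror the old VM.

    old_instance: dict shaped like one instance from DescribeInstances.
    old_volumes:  list of dicts shaped like volumes from DescribeVolumes.
    new_ami_id:   the AMI to launch the replacement from.
    """
    volumes_by_id = {v["VolumeId"]: v for v in old_volumes}
    mappings = []
    for bdm in old_instance.get("BlockDeviceMappings", []):
        volume = volumes_by_id[bdm["Ebs"]["VolumeId"]]
        mappings.append({
            "DeviceName": bdm["DeviceName"],
            # Carry over the old size (GiB) and type instead of the AMI default.
            "Ebs": {"VolumeSize": volume["Size"],
                    "VolumeType": volume["VolumeType"]},
        })
    return {
        "ImageId": new_ami_id,
        "InstanceType": old_instance["InstanceType"],
        "BlockDeviceMappings": mappings,
        "MinCount": 1,
        "MaxCount": 1,
    }


if __name__ == "__main__":
    # Placeholder IDs; the instance type and size match the VM described above.
    old = {"InstanceType": "m3.large",
           "BlockDeviceMappings": [{"DeviceName": "/dev/sda1",
                                    "Ebs": {"VolumeId": "vol-example"}}]}
    vols = [{"VolumeId": "vol-example", "Size": 100, "VolumeType": "gp2"}]
    print(replacement_launch_params(old, vols, "ami-example"))
```

With the sizes copied over like this, the replacement for the VM in the original report would come up with a 100 GB volume rather than 50 GB.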

gondoi commented 7 years ago

In that case, this actually sounds like a bug in pivotal-cf/cliaas for the replace-vm command.

xyloman commented 7 years ago

Would it be appropriate to log an issue against cliaas then? I think if replace-vm preserved the settings of the Ops Manager VM, that would be sufficient. We can change the scope of this issue to focus on that. However, if the OpsMan VM needed to be resized, would that remain a manual process?

gondoi commented 7 years ago

My opinion is that, yes, reporting this on pivotal-cf/cliaas is probably the only way to get this fixed if it's not considered a bug in the pipeline itself, since cliaas doesn't even have this as a configurable option. It should already be handling the disk size as part of replace-vm.

krishicks commented 7 years ago

Yes, I'd say it's a cliaas issue.

ryanpei commented 7 years ago

I'll close this issue then, since we believe this can be addressed appropriately by fixing https://github.com/pivotal-cf/cliaas/issues/4. Please re-open if you think we need to address the issue of not being able to configure the initial Ops Mgr attached disk size.

xyloman commented 7 years ago

@ryanpei thanks for closing this issue. I think the only remaining feature enhancement is an automated way to change instance type and volume size. We will need a way to orchestrate rolling out new instance types as they become available. We will also need the ability to orchestrate resizing of the Ops Manager persistent disk as a result of ever-growing tile usage.