[ ] Update ProwJobs to run on a community-owned cluster (including setting resources)
[ ] Update presets to point to secrets with the new vSphere config (URL, credentials, thumbprint, ...) and VPN credentials
Open points
Networking
Requirements:
ProwJobs (tests & janitor) need access to the vCenter API and to VMs running inside of vCenter
VMs running inside of vCenter need access to the vCenter API
Current implementation in VMC: (VPN tunnel)
VPN VM with public IP running within vCenter
vCenter and VM IPs are not public
ProwJobs get VPN certificates & config via presets
Advantages:
We already have a working implementation, we just have to replicate it
No restrictions regarding how many IPs we can use for VMs within vCenter because they are private (we need at least ~1024; more would be better)
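To replicate this in GVE, the preset wiring could look roughly like the following sketch of a Prow preset. The label, secret name, and mount path are made up for illustration; the actual names would come from the new environment's secrets:

```yaml
presets:
# Hypothetical preset: any ProwJob carrying this label gets the VPN
# certificates & config mounted from a cluster secret (see "ProwJobs get
# VPN certificates & config via presets" above).
- labels:
    preset-vsphere-vpn-access: "true"   # label name is an assumption
  volumes:
  - name: vpn-config
    secret:
      secretName: vsphere-vpn-config    # secret name is an assumption
  volumeMounts:
  - name: vpn-config
    mountPath: /etc/vpn
    readOnly: true
```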
Alternatives to be explored: (sorry, I didn't understand the entire discussion in the meeting, just chime in below)
Expose vCenter API & VM IPs publicly
I really would like to avoid this for security reasons
Peering between existing Prow cluster and GVE instance
Additional Prow cluster for vSphere jobs in the same private network as the GVE instance
Authentication / Authorization (Okta?)
Requirements:
vCenter access for the following users:
for tests: (technical users)
cluster-api-provider-vsphere
cloud-provider-vsphere
image-builder
for cleanup: (technical user)
janitor (currently implemented as periodic ProwJob, cleans up resources from Boskos)
administrative access:
@sbueringer @chrischdi @fabriziopandini
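For context, the janitor's interaction with Boskos follows the usual acquire/clean/release cycle: lease a dirty resource, clean up the corresponding vCenter resources, then return it to the free pool. A minimal sketch of the request URLs involved, assuming the standard Boskos HTTP endpoints (the service URL, resource type, and owner name are placeholders):

```python
# Sketch of the janitor's Boskos calls. Endpoint paths and parameter
# names follow the upstream Boskos HTTP API; everything else is illustrative.
from urllib.parse import urlencode

BOSKOS_URL = "http://boskos"  # hypothetical in-cluster service address


def acquire_url(resource_type: str, owner: str) -> str:
    # Lease one dirty resource of the given type and mark it as being cleaned.
    params = {"type": resource_type, "state": "dirty",
              "dest": "cleaning", "owner": owner}
    return f"{BOSKOS_URL}/acquire?{urlencode(params)}"


def release_url(name: str, owner: str) -> str:
    # Hand the cleaned resource back to the free pool.
    params = {"name": name, "dest": "free", "owner": owner}
    return f"{BOSKOS_URL}/release?{urlencode(params)}"
```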
Boskos configuration & presets
The following describes our current setup in VMC. We would like to use the same setup in the new GVE environment; reusing it will also make the migration simpler and faster.
Notes:
vCenter:
Resource pools and folders have the following structure, e.g. /prow/cluster-api-provider-vsphere/{001, 002, ...}
This allows us to track resource usage per repository/project
One user per project (which only has permissions on the corresponding project resource pool & folder)
This ensures we have isolation between projects
One user for janitor which has access to all project resource pools / folders to cleanup
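To make the layout concrete, the naming scheme described above can be expressed as follows (an illustration of the convention, not actual tooling):

```python
# Illustrates the per-project layout: one numbered resource pool/folder
# per project under /prow, e.g. /prow/cluster-api-provider-vsphere/001.
def pool_path(project: str, index: int) -> str:
    return f"/prow/{project}/{index:03d}"


def project_pools(project: str, count: int) -> list[str]:
    # Each (resource pool, folder) pair later maps to exactly one
    # Boskos resource, which is what enables per-project usage tracking.
    return [pool_path(project, i) for i in range(1, count + 1)]
```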
Presets:
VPN credentials and the respective user credentials are injected into the ProwJobs via presets
Boskos:
Contains one resource for each (resource pool, folder) pair (user data also contains the corresponding IP pool configuration)
We use different resource types for the different repositories/projects
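A sketch of what the corresponding Boskos resource configuration could look like (type names and pool counts are illustrative; the IP pool configuration lives in each resource's user data, which Boskos stores at runtime rather than in this file):

```yaml
resources:
# One resource type per repository/project; one entry per
# (resource pool, folder) pair.
- type: vsphere-project-cluster-api-provider-vsphere   # illustrative name
  state: free
  names:
  - "/prow/cluster-api-provider-vsphere/001"
  - "/prow/cluster-api-provider-vsphere/002"
- type: vsphere-project-image-builder                  # illustrative name
  state: free
  names:
  - "/prow/image-builder/001"
```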
I checked all jobs that are still using the current vSphere environment, as well as the ones that still use credentials from a VMware-owned GCP project to push images, for: cluster-api-provider-vsphere, cloud-provider-vsphere, vsphere-csi-driver and image-builder. No surprises there.
The following jobs can be migrated once the new env is functional:
* cluster-api-provider-vsphere:
* `periodic-cluster-api-provider-vsphere-e2e-{{ $mode }}-{{ ReplaceAll $.branch "." "-" }}`
* `periodic-cluster-api-provider-vsphere-e2e-{{ $mode }}-conformance-{{ ReplaceAll $.branch "." "-" }}`
* `periodic-cluster-api-provider-vsphere-e2e-{{ $mode }}-conformance-ci-latest-{{ ReplaceAll $.branch "." "-" }}`
* `periodic-cluster-api-provider-vsphere-janitor`
* `periodic-cluster-api-provider-vsphere-e2e-exp-kk-alpha-features`
* `periodic-cluster-api-provider-vsphere-e2e-exp-kk-serial`
* `periodic-cluster-api-provider-vsphere-e2e-exp-kk-slow`
* `periodic-cluster-api-provider-vsphere-e2e-exp-kk`
* `periodic-cluster-api-provider-vsphere-e2e-{{ $mode }}-upgrade`
* `pull-cluster-api-provider-vsphere-e2e-{{ $mode }}-blocking-{{ ReplaceAll $.branch "." "-" }}`
* `pull-cluster-api-provider-vsphere-e2e-{{ $mode }}-{{ ReplaceAll $.branch "." "-" }}`
* `pull-cluster-api-provider-vsphere-e2e-{{ $mode }}-upgrade`
* `pull-cluster-api-provider-vsphere-e2e-{{ $mode }}-conformance-{{ ReplaceAll $.branch "." "-" }}`
* `pull-cluster-api-provider-vsphere-e2e-{{ $mode }}-conformance-ci-latest-{{ ReplaceAll $.branch "." "-" }}`
* `pull-cluster-api-provider-vsphere-janitor-main`
* cloud-provider-vsphere:
* `pull-cloud-provider-vsphere-e2e-test`
* `pull-cloud-provider-vsphere-e2e-test-on-latest-k8s-version`
* `pull-cloud-provider-vsphere-e2e-test-1-26-minus`
* image-builder:
* `pull-ova-all`
The following jobs can be migrated today: (I talked to the maintainers of vsphere-csi-driver about it)
* vsphere-csi-driver:
* `post-vsphere-csi-driver-deploy`
* `post-vsphere-csi-driver-release`
(Picture source: https://github.com/sbueringer/k8s.io/pull/1, can be opened with draw.io; the current Boskos setup in the old VMC environment can also be seen there.)