Closed: Kristian-ZH closed this 9 months ago
I will do a deployment on Sylva later today and update on whether it works there as well.
@Kristian-ZH since this release changes the CRDs, we need to decide how to handle that. The changes should be backwards compatible, so I think our options are either to document manually applying the crds subdir prior to helm upgrade, or to consider creating a new -crds chart.
@ipetrov117 are you aware of any specific handling of this scenario in Sylva for other charts?
> I think our options are either document manually applying the crds subdir prior to helm upgrade, or consider creating a new -crds chart.
I was thinking to document that and to leave the decision about extracting CRDs into a dedicated chart for later, because this should be aligned across all the charts, not only Metal3.
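For illustration, a minimal sketch of what that documented manual step could look like, assuming a local checkout of the chart with a crds subdirectory, a release named metal3 and the metal3-system namespace (the path, release name and namespace are assumptions, not values taken from this repo):

# Apply the CRD manifests from the chart's crds subdir first; helm only installs
# the crds/ directory on initial install and does not touch it on upgrade
kubectl apply -f ./metal3/crds/
# Then run the normal chart upgrade
helm upgrade metal3 ./metal3 -n metal3-system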
> @ipetrov117 are you aware of any specific handling of this scenario in Sylva for other charts?
@hardys, I am not aware of any specific convention that is being followed. AFAIK Sylva uses whatever the main chart provides in terms of deployment logic. For instance, if the chart foo uses 2 charts for its deployment logic (foo-crds and foo), then 2 units are created in Sylva, foo-crds and foo, where foo depends on foo-crds. An example can be found here.
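For illustration only, the same two-unit ordering could be reproduced with plain helm roughly like this, assuming hypothetical foo-crds and foo charts checked out locally (chart names, paths and namespace are assumptions):

# Install the CRD-only chart first so the CRDs exist before the main chart is rendered
helm install foo-crds ./foo-crds -n foo-system --create-namespace
# Then install the main chart, which assumes those CRDs are already present
helm install foo ./foo -n foo-system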
My 2 cents on the metal3 chart upgrade discussion... There doesn't seem to be a concrete approach when it comes to upgrading CRDs. Looking at the helm doc regarding this, we have 2 possible approaches:
1. Keep a crds directory in the chart and leave helm to deploy the CRDs for us. This is the current approach that we and Sylva seem to be taking. While easier, this approach does not handle CRD upgrades well (at least from what I understand).
2. Split into separate foo-crd and foo charts, where foo-crd is always deployed before foo.
IMO documenting the manual CRD sync steps is an okay workaround at this moment, but for the long term we should think about option (2), mainly for 2 reasons:
@Kristian-ZH I tested in the metal3-demo environment and I think we have an issue with UEFI; we see the BMH fail to start inspection like:
Normal InspectionError 54m metal3-baremetal-controller Failed to inspect hardware. Reason: unable to start inspection: Failed to download image http://192.168.125.10:6180/uefi_esp.img, reason: HTTPConnectionPool(host='192.168.125.10', port=6180): Max retries exceeded with url: /uefi_esp.img (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f5602727c50>: Failed to establish a new connection: [Errno 113] EHOSTUNREACH'))
Looking at the ironic config I think we can see the problem:
shardy@localhost:~> kubectl exec -n metal3-system metal3-metal3-ironic-677bc5c8cc-8cqdg -c ironic -- cat /etc/ironic/ironic.conf | grep uefi_esp
bootloader = http://192.168.125.10:6180/uefi_esp.img
shardy@localhost:~> kubectl get service -n metal3-system
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
baremetal-operator-controller-manager-metrics-service ClusterIP 10.43.95.163 <none> 8443/TCP 86m
baremetal-operator-webhook-service ClusterIP 10.43.210.62 <none> 443/TCP 86m
metal3-mariadb ClusterIP 10.43.136.41 <none> 3306/TCP 86m
metal3-metal3-ironic LoadBalancer 10.43.57.230 192.168.125.10 6185:31785/TCP,5050:30890/TCP,6385:30479/TCP 86m
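The bootloader URL in ironic.conf points at port 6180, but the LoadBalancer above only exposes 6185, 5050 and 6385, so a direct request to the configured URL should fail much like inspection does. A quick diagnostic sketch, run from a host on the 192.168.125.x network:

# Expect a connection failure here, consistent with the inspection error above
curl -v http://192.168.125.10:6180/uefi_esp.img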
Additional note: in the previous ironic image we had
bootloader = {{ env.IRONIC_BOOT_BASE_URL }}/uefi_esp.img
But I think due to the upstream rebase that got lost, and it is now e.g.
bootloader = http://{{ env.IRONIC_URL_HOST }}:{{ env.HTTP_PORT }}/uefi_esp.img
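A hedged way to confirm what the new template resolves to in a running pod is to compare the environment variables it consumes with the rendered config (the pod name is taken from the output above; IRONIC_URL_HOST and HTTP_PORT are the variables referenced in the new template):

# Show the env vars the template substitutes
kubectl exec -n metal3-system metal3-metal3-ironic-677bc5c8cc-8cqdg -c ironic -- sh -c 'echo "IRONIC_URL_HOST=$IRONIC_URL_HOST HTTP_PORT=$HTTP_PORT"'
# Show the rendered bootloader line
kubectl exec -n metal3-system metal3-metal3-ironic-677bc5c8cc-8cqdg -c ironic -- grep bootloader /etc/ironic/ironic.conf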
I re-tested after https://build.opensuse.org/request/show/1144913 merged and everything works OK now in the metal3-demo environment.
This release includes the following changes:
- 2023.2
- dependencies