cloudfoundry / cf-for-k8s

The open source deployment manifest for Cloud Foundry on Kubernetes
Apache License 2.0
300 stars 115 forks source link

Modify eirini build to use new helmless eirini yaml #616

Closed mnitchev closed 3 years ago

mnitchev commented 3 years ago

Do not apply before eirini release v3.0.0

This is a draft PR to make sure our changes in the upcoming v3.0.0 release for eirini is compatible with cf-for-k8s. Please do not merge this until that version is released, but please take the time to review it. Our goal for the pure yaml release is to be as convenient as possible for cf-for-k8s to use - this means no overlays will be necessary when building or deploying. To see what these yaml releases look like, follow the build instructions on how to render them.

Update 12/02/2021

v3.0.0 has now been released and this PR can now be merged

WHAT is this change about?

Change build script for eirini and the templates in config to use the new helmless eirini release.

Does this PR introduce a change to config/values.yml?

No

Acceptance Steps

  1. Vendor eirini v3.0.0
  2. Run the build script for eirini
  3. kapp deploy cf-for-k8s
  4. see that everything works as normal

Tag your pair, your PM, and/or team

@cloudfoundry/eirini

Relevant context:

https://www.pivotaltracker.com/story/show/173896604

linux-foundation-easycla[bot] commented 3 years ago

CLA Signed

The committers are authorized under a signed CLA.

cf-gitbot commented 3 years ago

We have created an issue in Pivotal Tracker to manage this:

https://www.pivotaltracker.com/story/show/176655322

The labels on this github issue will be updated when the story is started.

cf-rel-int-status-bot commented 3 years ago

Hello friend, it looks like your pull request has failed one or more of our checks. Please take a look! :eyes:

cf-rel-int-status-bot commented 3 years ago

Hello friend, it looks like your pull request has failed one or more of our checks. Please take a look! :eyes:

linux-foundation-easycla[bot] commented 3 years ago

CLA Signed

The committers are authorized under a signed CLA.

cf-rel-int-status-bot commented 3 years ago

Hello friend, it looks like your pull request has failed one or more of our checks. Please take a look! :eyes:

cf-rel-int-status-bot commented 3 years ago

Hello friend, it looks like your pull request has failed one or more of our checks. Please take a look! :eyes:

cf-rel-int-status-bot commented 3 years ago

Hello friend, it looks like your pull request has failed one or more of our checks. Please take a look! :eyes:

cf-rel-int-status-bot commented 3 years ago

Hello friend, it looks like your pull request has failed one or more of our checks. Please take a look! :eyes:

cf-rel-int-status-bot commented 3 years ago

Hello friend, it looks like your pull request has failed one or more of our checks. Please take a look! :eyes:

cf-rel-int-status-bot commented 3 years ago

Hello friend, it looks like your pull request has failed one or more of our checks. Please take a look! :eyes:

cf-rel-int-status-bot commented 3 years ago

Hello friend, it looks like your pull request has failed one or more of our checks. Please take a look! :eyes:

cf-rel-int-status-bot commented 3 years ago

Hello friend, it looks like your pull request has failed one or more of our checks. Please take a look! :eyes:

cf-rel-int-status-bot commented 3 years ago

Hello friend, it looks like your pull request has failed one or more of our checks. Please take a look! :eyes:

cf-rel-int-status-bot commented 3 years ago

Hello friend, it looks like your pull request has failed one or more of our checks. Please take a look! :eyes:

cf-rel-int-status-bot commented 3 years ago

Hello friend, it looks like your pull request has failed one or more of our checks. Please take a look! :eyes:

cf-rel-int-status-bot commented 3 years ago

Hello friend, it looks like your pull request has failed one or more of our checks. Please take a look! :eyes:

cf-rel-int-status-bot commented 3 years ago

Hello friend, it looks like your pull request has failed one or more of our checks. Please take a look! :eyes:

cf-rel-int-status-bot commented 3 years ago

Hello friend, it looks like your pull request has failed one or more of our checks. Please take a look! :eyes:

cf-rel-int-status-bot commented 3 years ago

Hello friend, it looks like your pull request has failed one or more of our checks. Please take a look! :eyes:

cf-rel-int-status-bot commented 3 years ago

Hello friend, it looks like your pull request has failed one or more of our checks. Please take a look! :eyes:

acosta11 commented 3 years ago

Hi Eirini friends,

Thanks for making the updates to be compatible with the vendir sync checks in CI per our slack discussion in https://cloudfoundry.slack.com/archives/C8RU3BZ26/p1613073589027800. This time we see an error in source app push on KinD and a crash event error in CATs.

Regarding the crash events, the previous configuration in https://github.com/cloudfoundry/cf-for-k8s/commit/0dd4940c144ff1d6ff9ce4a37c1e73050aae1c7e looks slightly different than what we see in the current diff. It looks like the eirini-events selector was renamed to eirini-event-reporter. So maybe the name/selector mismatch in our separately managed network policies is the issue here.

$ ag eirini-events
config/networking/istio-authorization-policies.yml
33:            - #@ principal(system_namespace(), "eirini-events")
config/networking/network-policies.yaml
556:  name: eirini-events

We're otherwise not sure about the KinD issue, so hopefully the networking reconciliation will also take care of that.

Thanks, Andrew and @jamespollard8

cf-rel-int-status-bot commented 3 years ago

Hello friend, it looks like your pull request has failed one or more of our checks. Please take a look! :eyes:

danail-branekov commented 3 years ago

Thank you for pointing that out, indeed we have overlooked the auth policy yml. This is now fixed.

cf-rel-int-status-bot commented 3 years ago

Hello friend, it looks like your pull request has failed one or more of our checks. Please take a look! :eyes:

cf-rel-int-status-bot commented 3 years ago

Hello friend, it looks like your pull request has failed one or more of our checks. Please take a look! :eyes:

kieron-dev commented 3 years ago

This time we see an error in source app push

Regarding this, we also see this in kind. The pod in cf-workloads-staging shows the problem described in issue https://github.com/cloudfoundry/cf-for-k8s/issues/444.

We upgraded kind to the latest 0.10.0, but the problem persisted. When we applied the suggested patch from the issue to deploy/kind/cluster.yml, this fixed the problem. So it seems that the cb835ec7490f5ab068083d04d440ee28898c1cb9 commit reverting that patch is a little premature.

jspawar commented 3 years ago

@kieron-dev yes it looks like the commit to remove that patch was a bit too optimistic and we've reverted that.

We however cannot re-trigger the PR checks because it looks like there is a merge conflict we were hoping you could resolve first please.

From our end, I believe we made changes to this rendered.yml file to address a separate issue you all caught about the version mismatch of your images we're using: https://github.com/cloudfoundry/cf-for-k8s/pull/633

gcapizzi commented 3 years ago

Hey team! We think this is ready but:

  1. How do you test your PRs? We've some brief manual testing but couldn't run CATs, how do we get enough confidence that this works?
  2. osl-compliant-image-override.yml still replaces our images with your builds which I guess are still pointing at 2.0 - I guess you are going to build new OSL-compliant images for 3.0 and update that file yourselves?

Thanks! /cc @cloudfoundry/eirini

cf-rel-int-status-bot commented 3 years ago

Hello friend, it looks like your pull request has failed one or more of our checks. Please take a look! :eyes:

cf-rel-int-status-bot commented 3 years ago

Hello friend, it looks like your pull request has failed one or more of our checks. Please take a look! :eyes:

gcapizzi commented 3 years ago

Yeah it looks like smoke tests are failing :/ We'll take a look, bear with us!

gcapizzi commented 3 years ago

Hey @jamespollard8 @acosta11 @jspawar! I just got the smoke tests to pass on my machine, and looking at the output of the failing builds (here and here) I can't really tell what the problem is 😞 can you help?

acosta11 commented 3 years ago

Hi @gcapizzi ,

Thanks for the updates and the quick follow up on the smoke test issue! From the slack thread, I think I see what the issue is on KinD now:

I was able to reproduce locally on KinD w/ k8s v16.15 as we use in CI.

It looks like the app was hitting an image pull backoff issue:

Warning  Failed     2m40s (x3 over 3m24s)  kubelet            Failed to pull image "<app image>": rpc error: code = Unknown desc = failed to pull and unpack image "<app iamge>": failed to resolve reference "<app image>": unexpected status code [manifests <sha256 digest>]: 401 Unauthorized
  Warning  Failed     2m40s (x3 over 3m24s)  kubelet            Error: ErrImagePull
  Normal   BackOff    2m14s (x4 over 3m23s)  kubelet            Back-off pulling image "<app image>"
  Warning  Failed     2m14s (x4 over 3m23s)  kubelet            Error: ImagePullBackOff

And then I saw that there weren’t any image pull secrets on pod spec, nor was there any reference to an app registry secret in any of the service accounts in use in the workloads namespace. It looks like there is a reference mismatch between the opi configmap and the secret created in cf-for-k8s for it to consume.

config/eirini/eirini.yml in cf-for-k8s

apiVersion: v1
kind: Secret
metadata:
  name: app-registry-credentials

config/eirini/_ytt_lib/rendered.yml in cf-for-k8s from the output of helm template on eirini release

apiVersion: v1
data:
  opi.yml: |
    app_namespace: cf-workloads
     ...
    registry_secret_name: registry-credentials

After syncing those references up, the node app push succeeded for me.

gcapizzi commented 3 years ago

Thank you @acosta11! My local setup is using a public Docker registry, I think that's why I'm not seeing it fail locally 🙁 I've just pushed a fix 🤞 As mentioned here you'd still have to update the OSL image references but other than that, I think we're good!

cf-rel-int-status-bot commented 3 years ago

Hello friend, it looks like your pull request has failed one or more of our checks. Please take a look! :eyes:

gcapizzi commented 3 years ago

Ok external-db-rds is failing now, although it used to pass, so maybe a flake?

jspawar commented 3 years ago

@gcapizzi appears to have been a flake, everything is all green now though so we're going to go ahead and merge! Thanks for all the help and patience getting this through y'all