AICoE / aicoe-ci

AICoE-CI using TektonCD pipelines and triggers

Run an AICoE-CI pipeline to build artifacts for ppc64 #106

Open goern opened 3 years ago

goern commented 3 years ago

Is your feature request related to a problem? Please describe.
As an Operator/User of OpenShift on Power9 I want to run a deployment of AICoE-CI so that I can build ppc64 artifacts/container images/golang binaries.

Describe alternatives you've considered
Manual building on OCP/ppc64.

Additional context
This will mainly be used for Kubeflow on Power9 work by IBM. @lehrig @mgiessing could you provide a little bit of info/script on how you build artifacts?

/kind feature
/priority important-soon

goern commented 3 years ago

an interim goal could be to deploy this on a temp OCP/ppc64 deployment, but the target needs to be to deploy this to Op1st.

@durandom

harshad16 commented 3 years ago

Power9 builds need a Power9 system, which means we deploy a pipeline on a Power9 system.

goern commented 3 years ago

Right, there is an activity on Op1st to set up OCP4.6/ppc64 and Marvin and Sebastian have access to P9 too, so we can test.

lehrig commented 3 years ago

Kubeflow consists of several components - so let's start with a first (simple) one and then add one after the other. I'd go for kfctl first, as it is simple and we already committed upstream (https://github.com/kubeflow/kfctl/pull/459).

Essentially, building it is super easy; roughly like this:

git clone --branch v1.2.0 https://github.com/kubeflow/kfctl.git
make -f kfctl/Makefile

Hence, I think this is a good first example to be tested end-to-end.

mgiessing commented 3 years ago

Power9 builds need a Power9 system, which means we deploy a pipeline on a Power9 system.

I think this depends on what you actually want to compile. Golang is pretty nice in that way, as it allows cross-compilation out of the box for several architectures (including ppc64le). The example Sebastian mentioned can be run on x86 and will still produce a ppc64le artifact if GOARCH=ppc64le is set.

Kubeflow consists of several components - so let's start with a first (simple) one and then add one after the other. I'd go for kfctl first, as it is simple and we already committed upstream (kubeflow/kfctl#459).

Essentially, building it is super easy; roughly like this:

git clone --branch v1.2.0 https://github.com/kubeflow/kfctl.git
make -f kfctl/Makefile

Hence, I think this is a good first example to be tested end-to-end.

This is a good example of an easy port to ppc64le, but the PR is not backported. Therefore the v1.2.0 branch will only produce Linux/Darwin/Windows binaries for x86 and Linux binaries for ARM. The latest commit on the master branch (#486) has the flag GOARCH=ppc64le enabled.

For C & Python it can vary from just recompiling the source code on a Power system up to changing the source code if there are x86-specific parts in it (e.g. MKL exists only for x86).

lehrig commented 3 years ago

  1. I'd still go with kfctl first because of its simplicity
  2. I guess we have to distinguish nightly builds (--branch master) and release builds (--branch v1.3.0 once the release comes out)
  3. in case of Golang, we should eventually clarify with the Kubeflow community whether it will be built on their build server or go to our Op1st environment

goern commented 3 years ago

hey all, do we have some movement on this?

lehrig commented 3 years ago

We'd be ready once the OpenShift cluster on Power is stable and we have access to it. To our knowledge, this has not happened yet. What's the status there? Is there anything we can do in the meantime?

goern commented 3 years ago

(sadly) very true, see https://operatefirst.slack.com/archives/C01TSGYT0R4/p1621515928005000

sesheta commented 2 years ago

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

/lifecycle stale

sesheta commented 2 years ago

Stale issues rot after 30d of inactivity. Mark the issue as fresh with /remove-lifecycle rotten. Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

/lifecycle rotten

harshad16 commented 2 years ago

/lifecycle frozen

waiting on power9 systems