How to support Kubeflow v1

SAME-Project / same-project

https://sameproject.ml/

Apache License 2.0

19 stars 8 forks source link

How to support Kubeflow v1 #124

Open aronchick opened 2 years ago

aronchick commented 2 years ago

Here's where we currently support Kubeflow:

https://github.com/SAME-Project/same-project/tree/vertex/sameproject/ops

To support Kubeflow v1, we could: a) Have a second directory named something like 'kubeflowv1', and make that the flag for executing. Pro: Clear separation, if KFP2 diverges, the old version is static, less opportunity for regressions, etc. Con: Massive duplication of code b) Reuse existing kubeflow directory and add a flag for exactly which version of Kubeflow you're executing against Pro: Reduce duplication of code Con: Likely regressions and lots of code paths/control loops to support different flags

Thoughts @Bubblyworld and @lukemarsden

lukemarsden commented 2 years ago

Any idea what the diff to support kfp v2 looks like?

lukemarsden commented 2 years ago

If they're significantly different maybe copy pasting the whole directory is fine. And particularly heinous duplication could be factored out into library functions?

aronchick commented 2 years ago

yeah, i FEEL like #1 is the preferred. Just duplicate because then it allows us to leave that forever in the advent of some changes.

V2 support SHOULD be trivial - meaning do #2, (it's just an enum flag on compilation) but i'm noticing some weird bugs that are irritating enough that make me want to do #1.

HOWEVER - let's say we do something like add support for parallelization or something. I'd want to continue to support KFP v1 until that's deprecated. So that means wiring up a lot of duplicate code. Maybe that's a challenge we can address when we get there.

Bubblyworld commented 2 years ago

Just to jump in, I think the only place pulling out common functionality will be hard is in the jinja templates if those start to diverge (handling logic in templates is a mess). So probably makes sense to have those be duplicated from the get-go.

The actual python code is easy to refactor though, so either approach seems fine to me there. Easiest to duplicate for now?