Open aronchick opened 2 years ago
Any idea what the diff to support kfp v2 looks like?
If they're significantly different maybe copy pasting the whole directory is fine. And particularly heinous duplication could be factored out into library functions?
yeah, i FEEL like #1 is the preferred. Just duplicate because then it allows us to leave that forever in the advent of some changes.
V2 support SHOULD be trivial - meaning do #2, (it's just an enum flag on compilation) but i'm noticing some weird bugs that are irritating enough that make me want to do #1.
HOWEVER - let's say we do something like add support for parallelization or something. I'd want to continue to support KFP v1 until that's deprecated. So that means wiring up a lot of duplicate code. Maybe that's a challenge we can address when we get there.
Just to jump in, I think the only place pulling out common functionality will be hard is in the jinja templates if those start to diverge (handling logic in templates is a mess). So probably makes sense to have those be duplicated from the get-go.
The actual python code is easy to refactor though, so either approach seems fine to me there. Easiest to duplicate for now?
Here's where we currently support Kubeflow:
https://github.com/SAME-Project/same-project/tree/vertex/sameproject/ops
To support Kubeflow v1, we could: a) Have a second directory named something like 'kubeflowv1', and make that the flag for executing. Pro: Clear separation, if KFP2 diverges, the old version is static, less opportunity for regressions, etc. Con: Massive duplication of code b) Reuse existing kubeflow directory and add a flag for exactly which version of Kubeflow you're executing against Pro: Reduce duplication of code Con: Likely regressions and lots of code paths/control loops to support different flags
Thoughts @Bubblyworld and @lukemarsden