argoproj-labs / community

Community documents for argoproj-labs
Apache License 2.0

Add argo-workflows-gene to argoproj-labs #13

Open shuangkun opened 1 month ago

shuangkun commented 1 month ago

Welcome to Argo Project Onboarding!

Before submitting the ticket please ensure you understand which projects could be added to the Argo community and what the open decision-making process looks like.

Once you are ready, please help the reviewer understand your project better by answering the following questions in your onboarding proposal:

agilgur5 commented 1 month ago

+1 thanks for bootstrapping this!

  • [x] What is your project license?

For reference, I see the embedded WDL grammar is BSD-3 licensed. I'm not sure if it's used at all or if only the derivative Go parsers are, which might be separately licensable.

argo-workflows-gene

Is there a reason it's called "Gene"? Not sure if that was short for "genetics" or "generator"? "Gene" sounds very ambiguous to me

I might suggest argo-workflows-wdl as it currently only has WDL conversion, but, if more generic, then perhaps argo-workflows-converter or argo-workflows-importer or similar

I also might suggest that, rather than having a submit command, you have a convert command, i.e. output Workflow YAML that can be piped to argo submit, and let the Argo CLI handle networking etc. instead of repeating or importing that logic. Following the Unix philosophy / SRP tends to make a tool a lot easier to maintain, keep focused, and keep loosely coupled.
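
To make the convert-then-pipe idea concrete, here is a minimal sketch in Python rather than the project's actual Go code. The `Task` type, the `to_workflow` helper, and the demo task names and images are all hypothetical and stand in for whatever the WDL parser really produces. The sketch only builds a Workflow manifest and writes it to stdout, leaving submission to another tool. It emits JSON, on the assumption that whatever consumes the manifest accepts JSON as well as YAML.

```python
import json
import sys
from dataclasses import dataclass


@dataclass
class Task:
    """Hypothetical stand-in for one parsed WDL task (not the project's real type)."""
    name: str
    image: str
    command: list[str]


def to_workflow(name: str, tasks: list[Task]) -> dict:
    """Build an Argo Workflow manifest that runs the tasks one after another."""
    # One container template per task.
    container_templates = [
        {"name": t.name, "container": {"image": t.image, "command": t.command}}
        for t in tasks
    ]
    # Each inner list is one step group; groups run sequentially.
    steps = [[{"name": t.name, "template": t.name}] for t in tasks]
    return {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "Workflow",
        "metadata": {"generateName": f"{name}-"},
        "spec": {
            # Assumes no task is itself named "main".
            "entrypoint": "main",
            "templates": [{"name": "main", "steps": steps}, *container_templates],
        },
    }


if __name__ == "__main__":
    # Illustrative tasks only; a real converter would read these from a WDL file.
    demo = [
        Task("align", "alpine:3.19", ["echo", "align"]),
        Task("call-variants", "alpine:3.19", ["echo", "call"]),
    ]
    # The converter's only job: write the manifest to stdout.
    json.dump(to_workflow("wdl-demo", demo), sys.stdout, indent=2)
    sys.stdout.write("\n")
```

Because the script only writes to stdout, its output can be redirected to a file, inspected, or handed to the Argo CLI for submission. The converter itself never needs to know about clusters, namespaces, or authentication.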

shuangkun commented 1 month ago

Many of the customers I have met work in genetics and hope to move to cloud-native workflows. This industry frequently uses workflows to process data. Giving the project an industry-specific identity will be more conducive to its promotion and use, and we need more contributors. I hope to jointly maintain this project with CWL contributors, because these two languages always appear together in scenarios such as scientific computing and genetics.

I hope we can do the conversion, but not just the conversion. It would be best if users can use it directly from one interface, thus shielding the complexity of using Argo Workflows syntax. Researchers especially don't want too many changes, and it's difficult for them to learn a new YAML-based language. This is one of the reasons why I didn't name it a language converter.

Other aspects, such as usage methods, I think can be adjusted.

agilgur5 commented 1 month ago

It would be best if users can use it directly from one interface, thus shielding the complexity of using Argo Workflows syntax.

Piping it would just be argogene convert | argo submit, so no real complexity or syntax exposure.

That said, I think attempting to fully shield users from the nuances of a different tool is not truly possible, and it tends to result in confusion on top of significant maintenance overhead. See also: all the users confused by Kubeflow Pipelines naming its fields differently than Argo or Kubernetes, plus the Argo features it is missing. As a result, knowledge transfer is poor: knowing KFP does not mean you know Argo etc., which becomes particularly confusing when you're trying to debug anything.

Even in the existing submit command you have, the exposed flags are Argo flags, with their meanings inherited from Argo. Similarly, for users to understand the status and such, they'll need to have some understanding of Argo and k8s. For instance, the output is still Argo output with k8s attributes like ServiceAccounts. And they'll definitely need to know more to debug as well.

As such, it's impossible to fully encapsulate the interface -- every encapsulation is leaky. Leaky encapsulations plus tight coupling lead to maintenance difficulties. Instead of leaky and confusing encapsulations, I would suggest transparency, which also helps users gradually learn the underlying tools, which they will need to know to debug anyway.

Other aspects, such as usage methods, I think can be adjusted.

Ultimately, I agree, which is why I still gave a +1.

But as an experienced maintainer, and one who spent a good amount of time attempting to maintain leaky encapsulations too, I've learned the benefits of Unix philosophy the hard way and so strongly encourage folks to not fall into the maintenance pitfalls that occur when you don't follow it.