arcalot / arcaflow-engine

Arcaflow is a highly-portable workflow engine enabling modular and validated pipelines through containerized plugins.
https://arcalot.io/arcaflow/
Apache License 2.0
6 stars 9 forks source link

Version and flags #182

Closed dustinblack closed 1 week ago

dustinblack commented 1 month ago

Changes introduced with this PR

Some usability and help output improvements

Note that the flag package accepts either - or -- ahead of all parameters, but in the help output I used more Linux-usual single - for shorthand param names and double -- for full param names.

The flag package also has a function to auto-generate the usage output, but I wasn't satisfied with the way it formatted it, so I stuck with the manual string definition.


By contributing to this repository, I agree to the contribution guidelines.

dustinblack commented 1 month ago

Maybe I'm a curmudgeon, but it's always bothered me that the command flags don't follow GNU conventions. If we stick with flag, I can concede the change for single-letter alternatives and just do the other small code cleanups.

An alternative is to use an argparse package that is modeled after the Python module, but we've had a pretty firm stance in the past about avoiding external dependencies, and that package isn't exactly widely adopted.

dbutenhof commented 1 month ago

Maybe I'm a curmudgeon, but it's always bothered me that the command flags don't follow GNU conventions. If we stick with flag, I can concede the change for single-letter alternatives and just do the other small code cleanups.

I don't have a strong religious affiliation with "GNU" conventions. Hey, I "grew up" with DCL, and then adopted to BSD UNIX way before GNU invented long option names. There are all sorts of variations, some packages allowing unique abbreviations, for example, and a wide divergent practice about whether to use both UNIX single-character options and GNU long options.

What bugs me here is that we're doing "short long options" ... with meaningless single character options using the GNU long option syntax which disallows combining options. (And Google's "we don't need no stinking double dash" arrogance doesn't help, either, but that's almost beside the point.)

An alternative is to use an argparse package that is modeled after the Python module, but we've had a pretty firm stance in the past about avoiding external dependencies, and that package isn't exactly widely adopted.

Yeah, I hear: but an external package that gives reasonable standard behavior instead of the ugly internal non-standard behavior? That, to me, is a harder call. It's almost a bug fix...

webbnh commented 1 month ago

I recommend dropping all the single-character options: they are not expressive, and they cannot be combined together in the traditional fashion, so they violate the principle of least surprise.

If people are really concerned about typing an extra hyphen and a few more characters, then lets provide good default values so that they don't have to type anything at all.

dustinblack commented 1 month ago

I've removed the single-character flags and conceded to using the flag package's built-in usage function. However, this work has revealed an existing problem with the inputs. The current help language implies that both the -input and the -config parameters should be optional, but in reality excluding either of these results in a failure. My golang skills aren't really good enough tackle this well, so I'd appreciate help with addressing this problem as part of this PR.

mfleader commented 1 month ago

Maybe I'm a curmudgeon, but it's always bothered me that the command flags don't follow GNU conventions. If we stick with flag, I can concede the change for single-letter alternatives and just do the other small code cleanups.

An alternative is to use an argparse package that is modeled after the Python module, but we've had a pretty firm stance in the past about avoiding external dependencies, and that package isn't exactly widely adopted.

Yeah that's a golang feature, not a bug. Many (maybe most) default to this non-GNU convention, even though it's not required.

dustinblack commented 2 weeks ago

I've updated the help language a bit more, and I've added a default configuration so that the -config option is now truly optional.

I'm still not entirely satisfied. The -context parameter seems to have no practical function for the user. The help language implies that the workflow will be loaded from that directory, but in fact the paths for all of the inputs are relative to the current working directory.

webbnh commented 2 weeks ago

The -context parameter seems to have no practical function for the user. The help language implies that the workflow will be loaded from that directory, but in fact the paths for all of the inputs are relative to the current working directory.

If the workflow were installed like some utility application (or some analogous personal configuration), then as a user I would use -context to point to the installation location of the workflow and its configuration and I would expect/desire the input file to be loaded (by default) from my current directory.

As a workflow writer, I would probably prefer the reverse configuration: I want Arcaflow to run the workflow that I just wrote in my current directory, and I want to apply the workflow to an input file which is located in the output directory of some job that I just ran (i.e., elsewhere) -- so having the workflow pulled from the context which defaults to my current working directory while the input is pulled from an absolute path would be exactly what I want.

dustinblack commented 2 weeks ago

The -context parameter seems to have no practical function for the user. The help language implies that the workflow will be loaded from that directory, but in fact the paths for all of the inputs are relative to the current working directory.

If the workflow were installed like some utility application (or some analogous personal configuration), then as a user I would use -context to point to the installation location of the workflow and its configuration and I would expect/desire the input file to be loaded (by default) from my current directory.

As a workflow writer, I would probably prefer the reverse configuration: I want Arcaflow to run the workflow that I just wrote in my current directory, and I want to apply the workflow to an input file which is located in the output directory of some job that I just ran (i.e., elsewhere) -- so having the workflow pulled from the context which defaults to my current working directory while the input is pulled from an absolute path would be exactly what I want.

So you think the input should in fact not use the context directory?

dustinblack commented 2 weeks ago

I took another look at the readme, considering that the engine is likely one of the primary entry points to the project and the one likely to garner the most stars and github-directed attention over time. With that in mind, I added an updated description of Arcaflow as a whole, a logo, a gif animation of running a workflow, and some additional links for more info. I also removed all references to the Python deployer except the last one because I think at this time it really only deserves footnote status because of its limited use case.