viash-io / viash

script + metadata = standalone component
https://viash.io
GNU General Public License v3.0
39 stars 2 forks source link

Overriding configs #28

Closed rcannood closed 3 years ago

rcannood commented 3 years ago

When working with a large collection of viash components, a user might want to override the 'version' value of all components with a simple command or file. @tverbeiren and I have been arguing back and forth on how to be able to override certain settings in viash configs. This issue discusses some of the possible ways of doing this, how to implement it, and the impact on the maintainability of the codebase.

Let's use the following config as an example:

# content of src/component1/config.vsh.yaml
functionality:
  name: foo
  namespace: bar
  resources: [ { type: executable, path: ls } ]
platforms: 
  - type: docker
    image: "bash:4.0"
    setup: [ { type: r, cran: princurve } ]

Option 1: Override settings with yaml

by providing something like:

viash ns build --setting 'functionality: { version: "0.3.0" }'

Alternatively, there might be a file in the root directory, e.g. a yaml file, which allows the user to do the same thing with just running viash ns build.

# content of global.yaml
functionality:
  version: "0.3.0"

Ambiguities arise when wanting to merge platforms, however (or any type of list, for that matter). For instance, if you want to modify a setting to the current platform, you would need to pass the following yaml:

platforms:
  - type: docker
    target_registry: itx-aiv

To add a docker platform:

platforms:
  - type: docker
    id: newplatform
    image: foo

However, to replace the docker platform with something else is not possible using this syntax.

Possible implementation Assuming the following scala class: ```scala case class Author(name: String, age: Integer, weight: Double) case class Functionality( name: String, namespace: Option[String] = None, version: Option[String] = None, authors: List[Author] = Nil ) ``` You could adapt this code to something like this: ```scala implicit class RichObject[X](x: X) { def `|`(y: X): X = { if (x == null) y else x } } trait Author { name: String age: Option[Integer] weight: Option[Double] def merge(o: Author): Either[ParsingFailure, Author] = { if (o.name == null) { Left(ParsingFailure("Name of RHS author is null")) } else if (name != o.name) { Left(ParsingFailure("Name of LHS and RHS authors are not equal")) } else { Right(DefaultsAuthor( name = name | o.name, age = age | o.age, weight = weight | o.weight )) } } } case class EmptyAuthor( name: String = null, age: Option[Integer] = null, weight: Option[Double] = null ) extends Author case class DefaultsAuthor( name: String, age: Option[Integer] = None weight: Option[Double] = None ) extends Author trait Functionality{ val name: String val namespace: Option[String] val version: Option[String] val authors: List[Author] def merge(o: Functionality): Either[ParsingFailure, Functionality] = { Right(DefaultsFunctionality( name = name | o.name, namespace = namespace | o.namespace, version = version | o.version, // need to implement object which can merge lists of mergable objects by checking whether // the 'name' or 'id' fields of objects are similar... or by checking whether you get parsingfailures authors = ListMerger.merge(authors, o.authors) )) } } case class EmptyFunctionality( name: String = null, namespace: Option[String] = null, version: Option[String] = null, authors: List[Author] = null ) extends Functionality case class DefaultsFunctionality( name: String, namespace: Option[String] = None, version: Option[String] = None, authors: List[Author] = Nil ) extends Functionality ```

Good:

Bad:

Option 2: Edit yaml with yq

An alternative approach would be to let the user specify yq queries, given that yq is installed. This would offer low-level control but requires the user to know a lot about yq.

You could add these queries to a globals.yq file:

(.functionality.version) |= 0.3.0
(.platforms[] | select(.type == "docker") | .target_registry) |= "itx-aiv"

Or provide them on the command line:

viash ns build --yq '(.functionality.version) |= 0.3.0'

Additionally, it'd be best to provide a command that allows printing what the resulting config will be.

viash config view config.vsh.yaml --yq '(.functionality.version) |= 0.3.0'

Even better, you could create a command that allows editing the yaml!

viash config edit config.vsh.yaml --yq '(.functionality.version) |= 0.3.0'
Possible implementation Require users to have yq installed and perform a system call to yq.

Good:

Bad:

Option 3: Allow config modifiers

viash build foo/config.vsh.yaml --modify_config yqmod --modify_config_args '(.functionality.version) |= 0.3.0' -o bin

Option 4: Create own DSL

viash build foo/config.vsh.yaml 'platforms.docker.target_registry: itx-aiv' -o bin
viash build foo/config.vsh.yaml 'platforms.[type ~= "docker.*"].target_registry: itx-aiv' -o bin
viash build foo/config.vsh.yaml 'platforms.[type == "docker"].target_registry: itx-aiv' -o bin
rcannood commented 3 years ago

In the end, Toni and I decided to try out the DSL approach, as it would offer the most flexibility with a (relative) intuitive interface.

Example commands:

# een simpele set
.functionality.version := "0.3.0"

# een subset van een lijst aanpassen:
.platforms[.type == "docker"].container_registry := "itx-aiv"

# iets aan een lijst toevoegen:
.functionality.authors += { name: "Mr. T", role: "sponsor" }

# reference root of yaml:
.platforms[$.version != .version].foo := "bar"

# future work, index according to '.id' field or list index
.platforms[.id == "docker"].container_registry := "itx-aiv"
.platforms["docker"].container_registry := "itx-aiv"
.platforms[0].container_registry := "itx-aiv"

# future work, filtering with integers/doubles
.platforms[.version > 10].foo := "bar"

# future work, support multiline json
.functionality.authors += {
  name: "Mr. T",
  role: "sponsor"
}

BNF notation:


<command>    ::= <path> ":=" <json>
               | <path> "+=" <json>

<path>       ::= "$" | <path2>
<path2>      ::= ""
               | <path2> "." <identifier>
               | <path2> "[" <condition> "]"

<condition>  ::= <value> "==" <value> 
               | <value> "!=" <value> 
               | <condition> "&&" <condition> 
               | <condition> "||" <condition>
               | "(" <condition> ")"
<value>      ::= <path> | <string> | <number> | <json>

<identifier> ::= <letter> (<letter> | <digit> | "_")
<string>    ::= "'" <text1> "'" | '"' <text2> '"'
<text1>      ::= "" | <character1> <text1>
<text2>      ::= '' | <character2> <text2>

<letter>     ::= [a-zA-Z]
<digit>      ::= [0-9]
<character1> ::= [^\"] | '\"' | "\"
<character2> ::= [^\'] | "\'" | "\"
<number>     ::= # todo, support whole and real numbers, also scientific notation

<json>       ::= <jarray> | <jobject>
<jvalue>     ::= <string> | <number> | <json>
<jarray>     ::= "[" (<json> ("," <json>)*))? "]"
<jobject>    ::= "{" (<jprop> ("," <jprop>)*))? "}"
<jprop>      ::= <jname> ":" <jvalue>
<jprop>      ::= <identifier> | <string> 

I still need to define a BNF for <json>.

Implementation of this feature has started in branch feature/dsl.

rcannood commented 3 years ago

Solved by #30.