clamsproject / mmif

MultiMedia Interchange Format
Apache License 2.0
5 stars 1 forks source link

recording runtime parameters and runtime configurations #208

Closed keighrim closed 6 months ago

keighrim commented 1 year ago

Currently, in the MMIF json scheme, we have a place to store the runtime parameters.

https://github.com/clamsproject/mmif/blob/79c621f5c3dabb499a0878f30a59ca25b56bbb5d/schema/mmif.json#L93-L95

However, as raised in https://github.com/clamsproject/clams-python/issues/166 , often times apps will use some additional (or corrected/normalized) configuration (as their default behaviors) that are not necessarily passed at the runtime. Ideally, all these default configurations should be retrievable via the app metadata of that exact version, but I wonder if there's any value in adding one (optional) field (e.g., configuration as a sibling to parameters) in the MMIF to record the configuration after the "normalization" (see the linked issue, if you're not sure what this normalization is).

keighrim commented 6 months ago

Re-opening this after a recent discussion (https://github.com/clamsproject/app-swt-detection/issues/84).

Few suggestions from the latest discussion;

  1. we want to keep view.metadata.parameters field for backward compatibility.
    • we discussed re-naming (or aliasing) the parameters field to something else (rawParameter, inputParameters, userParameters were mentioned), but I later realized that will be breaking backward compatibility.
    • Also worth noting that we discussed possibility of "breaking early", if necessary, before we have piles of MMIF outputs with only parameters field.
  2. we want to separate "raw" input parameters from the configuration that's actually used by the app. We call the transformation from parameters -> configuration as refinement of the parameters The refinement has three possible actions, and only these;
    • type casting: since users are passing parameters as strings (via URL or shell), the values need to be properly re-typed based on the parameter specification
    • default value "filling-in": when user didn't pass a parameter value, the default value for the parameter must be used (note that by definition, all parameters with default values are "optional", while others are required parameters)
    • discarding invalid parameters: when user passes an invalid, it should be discarded. A parameter is invalid if 1) it's not defined in the app metadata, or 2) it has choices but the passed value is not one of possible choices. Note that discarded parameters are still going to be recorded as UserWarnings in the output MMIF.
  3. and finally, regarding the name of former refinedParameters, runtimeParameters was newly suggested. However, after reviewing past issues, now I'd like to suggest something that doesn't include string "parameters". view.metadata.appConfiguredAs. Hopefully, this will reduce some confusion of readers.

Again, very open to more suggestions. Let us know what you think.

keighrim commented 6 months ago

More on the versioning: Let's first assume that we use firsthandParameters and secondaryParameters for names for the fields in discussion. Then, let's call the current version as x and the next version with secondaryParameters added as y. We don't know for sure yet the actual values of those undecided variables. But we have pretty good candidates.

Here's some possible enumeration of MMIF generation

  1. x_p = MMIF generated under x (firsthandP == parameters) , no secondary params recorded
    • this is MMIFs we are currently getting
  2. x_p_s = MMIF generated under x (firsthandP == parameters), secondaryP == any of candidates
    • basically, we use old schema, adding secondaryP as an extra
  3. y_p = MMIF generated under y+fP=p (fP == parameters) , no secondary params
    • these are not really likely to happen as long as the app is based on our SDK
  4. y_p_s = MMIF generated under y+fP=p (fP == parameters), sP == any
  5. y_u_s = MMIF generated under y+fP=u (fP == userParameters) , sP == any
  6. y_pu_s = MMIF generated under y+fP=pu (fP == [userParameters or parameters]), sP == any

Then the validation matrix;

valid under x y+fP=p y+fP=u y+fP=pu
x_p valid valid invalid ?valid
x_p_s valid valid invalid ?valid
y_p valid valid - -
y_p_s valid valid - -
y_u_s invalid - valid -
y_pu_s ?valid - - valid

(?-prefix means conditaonally valid ) (Please correct me if I did something wrong anywhere in the above, not so confident I though of all possible combinations of proposals) After this calculation, it's pretty clear to me that we only do a patch number bump unless we wan't to change the name of existing parameters field. In fact, we don't really have to update our json schema and bump MMIF version to support a secondary parameters field in the view metadata.

BTW, this is exactly one of the example scenarios for patch level bump from the previous discussion;

An addition of a field where additionalProperties=true: a newer input might have that new field, it was optional in the past anyway. Input is valid.

(I guess I didn't know that view.metadata has additionalProperties=true by default when I was suggesting 1.1.0 in #211 or here.)