clamsproject / clams-python

CLAMS SDK for python
http://sdk.clams.ai/
Apache License 2.0

input specification in app metadata #50

Closed keighrim closed 3 years ago

keighrim commented 3 years ago

This thread is to discuss the design of the input of apps as specified in their app metadata, particularly in the requires field.

marcverhagen commented 3 years ago

Some notes, partially based on discussions of the past two weeks.

As with output specifications, current values are lists of types:

Application       Requires
segmenter         [AudioDocument]
tesseract         [ImageDocument, VideoDocument]
east              [ImageDocument, VideoDocument]
slatedetection    [VideoDocument]
kaldi             [AudioDocument]
segmented-kaldi   [AudioDocument, TimeFrame]
spacy             [TextDocument]

The following are things we could require of the input:

  1. Input elements need to be of a particular document type or annotation type (BoundingBox).
  2. Input elements require a particular property-value pair (boxType=text)
  3. Input has to be output from a particular app.

We decided to ditch the third. If needed, we can enforce a requirement like that in the pipeline itself: if you really want to run Tesseract on EAST output, just build an EAST-Tesseract pipeline.

Other issues:

  1. Do we represent the requirements as a Python datastructure or a JSON file? I wanted to do the second and use metadata.json and then read it into the app on initialization, but as an alternative I used a Python module. If we decide to use a dedicated file we should probably standardize.
  2. Do we interpret the list as a conjunction or disjunction? I think for now we should leave this an open issue. This does probably mean that the requires metadata cannot give a full picture.
  3. What about required versus optional?
  4. What about requirements specified by parameters? For example, Tesseract could be run with use-timeframes=True as a parameter.
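To make the conjunction versus disjunction question (item 2) concrete, here is a minimal sketch. It assumes the types present in the input MMIF have already been collected into a set; both function names are hypothetical and not part of the SDK.

```python
def satisfied_as_conjunction(requires, available):
    """Every listed type must be present in the input."""
    return all(t in available for t in requires)


def satisfied_as_disjunction(requires, available):
    """At least one listed type must be present in the input."""
    return any(t in available for t in requires)


# segmented-kaldi lists two types; this input only has the audio document.
requires = ["AudioDocument", "TimeFrame"]
available = {"AudioDocument"}

print(satisfied_as_conjunction(requires, available))  # False
print(satisfied_as_disjunction(requires, available))  # True
```

The same list gives opposite answers under the two readings, which is why a bare list cannot give the full picture.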

We seemed to agree on two things, for now:

  1. To be more expressive the list needs to contain objects, not just type URLs. In LAPPS we used discriminators, which were strings that could stand in for feature-value pairs.
  2. We may not be able to make the list so expressive that we can encode all requirements in it. This is a liberal interpretation of the requires property, basically saying that it gives a broad hint.

Tesseract Example.

What Tesseract really needs is a video document, but it could be restricted to just using TimeFrames of a particular type and/or bounding boxes of particular types. You could see "Tesseract requires video or time frames" as a disjunction: if you run on time frames you can get the video document from there. But in another sense the video is always required and the time frame is optional.

The simplest specification would be like what we have now in the metadata (just using short strings to stand in for the URLs in the vocabulary):

"requires": ["VideoDocument", "TimeFrame", "BoundingBox"]

To add the required versus optional distinction, we have three options:

  1. Making no distinctions and leave it up to the app's documentation or some human readable description in the metadata.
  2. Specifying that the first is required and all others optional. This works for many tools and often the first required element will be a document type and the optional elements annotation types. But it is easy to imagine tools that do have more required input, for example a standalone Tesseract that needs both video and text boxes or a parser that needs tokens and pos.
  3. Introduce objects.
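A rough sketch of what option 2 amounts to, and where it falls short. The helper name is hypothetical, and the parser type names ("Token", "PartOfSpeech") are placeholders rather than vocabulary entries.

```python
def first_required(requires):
    """Expand a plain list of type names into objects, treating only
    the first element as required (the option 2 convention)."""
    return [{"@type": t, "required": i == 0} for i, t in enumerate(requires)]


tesseract = first_required(["VideoDocument", "TimeFrame", "BoundingBox"])
for spec in tesseract:
    print(spec)

# A parser needs both inputs, but the convention marks the second as optional.
parser = first_required(["Token", "PartOfSpeech"])
print(parser[1]["required"])  # False, although the parser cannot run without it
```

Explicit objects (option 3) avoid this because each entry carries its own "required" flag.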

With objects the requires metadata could look as follows:

"requires":  [
    { "@type": "VideoDocument", "required": True },
    { "@type": "TimeFrame", "required": False },
    { "@type": "BoundingBox", "required": False } ]

To specify property values we can just add to the above:

"requires": [
    { "@type": "VideoDocument", "required": True },
    { "@type": "TimeFrame", "required": False },
    { "@type": "BoundingBox", "required": False, "boxType": "text" } ]
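As a sketch of how such an entry might be checked against an annotation: every key other than "@type" and "required" is treated as a property constraint. The annotation shape used here (a dict with "@type" and "properties") is a simplification for illustration, and the function name is hypothetical.

```python
RESERVED = {"@type", "required"}


def matches(spec, annotation):
    """True if the annotation has the spec's type and all of its property values."""
    if annotation["@type"] != spec["@type"]:
        return False
    props = annotation.get("properties", {})
    return all(props.get(k) == v for k, v in spec.items() if k not in RESERVED)


spec = {"@type": "BoundingBox", "required": False, "boxType": "text"}
text_box = {"@type": "BoundingBox", "properties": {"boxType": "text"}}
other_box = {"@type": "BoundingBox", "properties": {"boxType": "face"}}

print(matches(spec, text_box))   # True
print(matches(spec, other_box))  # False
```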

Some questions here are

Some other issues we touched upon:

When an app runs

By default it checks the requirements as stated in the metadata. It checks whether there are any views that meet those requirements. If there are none, return input MMIF with error/warning in the added app's view
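A minimal sketch of that default check, under heavy assumptions: MMIF is reduced to a plain dict with a "views" list, each view lists its types under metadata "contains", and the error is recorded in the new view's metadata. The function names and the exact shape of the error entry are hypothetical.

```python
import copy


def missing_requirements(mmif, requires):
    """Return the required types that no view in the input claims to contain."""
    contained = set()
    for view in mmif["views"]:
        contained.update(view["metadata"]["contains"])
    return [spec["@type"] for spec in requires
            if spec["required"] and spec["@type"] not in contained]


def with_error_view(mmif, app_id, missing):
    """Copy the input MMIF and append a view for this app that records the problem."""
    out = copy.deepcopy(mmif)
    out["views"].append({
        "metadata": {
            "app": app_id,
            "contains": {},
            "error": {"message": "missing required input: " + ", ".join(missing)},
        },
        "annotations": [],
    })
    return out


requires = [{"@type": "TimeFrame", "required": True},
            {"@type": "BoundingBox", "required": False}]
mmif = {"views": [{"metadata": {"app": "east", "contains": {"BoundingBox": {}}},
                   "annotations": []}]}

missing = missing_requirements(mmif, requires)
result = with_error_view(mmif, "tesseract", missing) if missing else mmif
print(result["views"][-1]["metadata"]["error"]["message"])
```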

If the tool has some specifics, say for the sake of argument that Tesseract wants either a time frame or a bounding box (which is not explicit in the metadata), then that is coded up in the app and documented somewhere (I would say in the README.md in the repository). Again, if no views match, return input MMIF with error/warning in the added app's view.

If the tool has parameters then it needs internal logic to determine requirements from there. Part of this may be documented in the parameter metadata, some extra info may need to be in the README file. Again, return error/warning if needed.

Note that the tool may also need to calculate what kind of output it creates given the metadata and the parameters.
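As an illustration of parameter-dependent requirements, the sketch below derives the effective input list from the static metadata plus the use-timeframes parameter mentioned earlier. The function name and the decision to make TimeFrame required when the parameter is set are assumptions for the example; the same pattern could be used to compute the output types.

```python
def effective_input(metadata_input, params):
    """Return the input specs that actually apply for this run."""
    specs = [dict(spec) for spec in metadata_input]  # do not mutate the metadata
    if params.get("use-timeframes", False):
        for spec in specs:
            if spec["@type"] == "TimeFrame":
                spec["required"] = True
    return specs


metadata_input = [{"@type": "VideoDocument", "required": True},
                  {"@type": "TimeFrame", "required": False}]

print(effective_input(metadata_input, {}))
print(effective_input(metadata_input, {"use-timeframes": True}))
```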

If all requirements are met the tool will run successfully and not issue warnings even if no output was generated.

keighrim commented 3 years ago

Wow, thanks for taking notes on this. There are lots of big questions, but I'd like to first share my thoughts on the smaller ones.

  1. Making no distinctions and leave it up to the app's documentation or some human readable description in the metadata.
  2. Specifying that the first is required and all others optional. This works for many tools and often the first required element will be a document type and the optional elements annotation types. But it is easy to imagine tools that do have more required input, for example a standalone Tesseract that needs both video and text boxes or a parser that needs tokens and pos.
  3. Introduce objects.

I too vote on 3. Although the structure of the object is still in question (and that's a big one).


  • What if "required" is a property on an annotation type? If we allow for that we should either introduce more structure with "required" and "properties", or rename "required" to #required, $required or __required and specify that property names should be alphanumeric with underscores and dashes and start with a letter.
  • We may want to allow list values like "frameType": ["slate", "credits"].

If there are two or more prop-level optional input types, we can specify them separately, so we don't need to worry about the prop-level "required" value. By prop-level, I'm referring to input types with property values. For example,

    { "@type": "TimeFrame", "required": False },

is type-level, whereas

    { "@type": "TimeFrame", "required": False, "frameType": "slate" },

is a prop-level specification. So, with that, we can roll out the list of property values as separate "type"s as in

    { "@type": "TimeFrame", "required": False, "frameType": "slate" },
    { "@type": "TimeFrame", "required": False, "frameType": "credits" },

We can restrict the scope of required to specific property K-V pairs.
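Rolling a list value out into separate entries could be done mechanically. This is only a sketch: the function name is hypothetical, and it assumes any list-valued key other than "@type" and "required" means "one entry per value".

```python
from itertools import product


def expand(spec, reserved=("@type", "required")):
    """Turn list-valued properties into one spec per combination of values."""
    fixed = {k: v for k, v in spec.items()
             if k in reserved or not isinstance(v, list)}
    listed = {k: v for k, v in spec.items()
              if k not in reserved and isinstance(v, list)}
    keys = list(listed)
    return [{**fixed, **dict(zip(keys, combo))}
            for combo in product(*(listed[k] for k in keys))]


spec = {"@type": "TimeFrame", "required": False, "frameType": ["slate", "credits"]}
for entry in expand(spec):
    print(entry)
```

A spec without list values comes back as a single unchanged entry, so the function can be applied to every item in the list.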


  • I don't like "requires" and "required" so close together. And I was already somewhat put off by the names "requires" and "produces" anyway. Maybe we can use "input" and "output"?

I like input, output. Much clearer. We also need to decide between optional and required.


By default it checks the requirements as stated in the metadata. It checks whether there are any views that meet those requirements. If there are none, return input MMIF with error/warning in the added app's view

When wrapped in an HTTP app, if an app returns a regular MMIF string even when it errors out, we have a very limited (and probably quite complicated) number of ways to properly wrap it in an HTTP response. I'd like to suggest that the Python app just raise an error (preferably with a meaningful message). In my working branch (https://github.com/clamsproject/clams-python/issues/36, current code) the clams-python SDK will create a MMIF string based on the Python error and return it with a 4xx response code.
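Roughly, the idea looks like the sketch below. It is not the code in the working branch: the exception class, the handler function, and the JSON shape of the error body are all made up for illustration, and no web framework is involved.

```python
import json


class MissingInputError(Exception):
    """Hypothetical error an app raises when its input requirements are not met."""


def annotate(mmif_str):
    # Stand-in for the app logic; here it always fails the input check.
    raise MissingInputError("no TimeFrame annotations found in input")


def handle_request(app_func, mmif_str):
    """Run the app and return (body, status), mapping the error to a 4xx code."""
    try:
        return app_func(mmif_str), 200
    except MissingInputError as err:
        body = json.dumps({"error": {"type": type(err).__name__,
                                     "message": str(err)}})
        return body, 400


body, status = handle_request(annotate, "{}")
print(status, body)
```

The app itself stays simple (it only raises), and the wrapper decides how the failure is reported over HTTP.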

marcverhagen commented 3 years ago

We reached an agreement:

Here is what we will do:

"input": [
    { "@type": "TimeFrame", 
      "required": True,
      "properties":  { "required": False, "frameType": "slate" }}
]

With this there is no conflict between the upper level "required" and the (rather unlikely) property in the input. This is almost the same as an annotation or document.
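A small sketch of how matching could work with the nested form. The annotation shape is simplified and the function name is hypothetical; the point is that the top-level "required" is never compared against annotation properties.

```python
def matches_nested(spec, annotation):
    """Compare only the type and the entries under spec["properties"]."""
    if annotation["@type"] != spec["@type"]:
        return False
    wanted = spec.get("properties", {})
    actual = annotation.get("properties", {})
    return all(actual.get(k) == v for k, v in wanted.items())


spec = {"@type": "TimeFrame",
        "required": True,
        "properties": {"required": False, "frameType": "slate"}}

slate = {"@type": "TimeFrame",
         "properties": {"required": False, "frameType": "slate"}}
credits = {"@type": "TimeFrame",
           "properties": {"required": False, "frameType": "credits"}}

print(matches_nested(spec, slate))    # True
print(matches_nested(spec, credits))  # False
```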

In the simplest case this is not that far from what we have now:

"input":  [{ "@type": "TimeFrame" }]

At our leisure we can define abbreviations because this is all Python:

VIDEO = { "@type": "VideoDocument" }
SLATE = { "@type": "TimeFrame",
          "required": True,
          "properties": { "frameType": "slate" } }

"input": [VIDEO, SLATE]

For output we can use something similar, except that property values will not be specific strings or lists of strings but types:

"output": [ 
    { "@type": "TimeFrame",
      "properties":  { "frameType": str }}
]

marcverhagen commented 3 years ago

On the error proposal (since this confused me) we also agree.

The _consume() method of the app returns an error, and the web service returns a MMIF string together with a 4XX error code.

keighrim commented 3 years ago

Following a new development on the versioning issue (https://github.com/clamsproject/mmif/issues/14, https://github.com/clamsproject/mmif-python/issues/163), the input field can no longer simply list version-specific @types and still accurately express backward (and possibly forward) compatibility with other versions of the same @types. Here are some ideas:

  1. use a wild card character to indicate the compatibility. (*, ?, x, X, etc.) http://.../1.0.*/TimeFrame is compatible with anything from .../1.0.0/TimeFrame to .../1.0.999⋯.
  2. omit the patch (or patch + minor) number from its string representation. http://.../1.0/TimeFrame is compatible with anything from .../1.0.0/TimeFrame to .../1.0.999⋯.
  3. just leave the full version in the string but document that @types in the input list must be considered with their compatible other version (not a good solution, I think)

marcverhagen commented 3 years ago

I am not sure how much of an issue this should be. In the input I usually do something like

[{'@type': AnnotationTypes.TimeFrame.value}]

Which means that any version comes from a default inside of the mmif-sdk, basically the current version. I am not wild about having to say

[{'@type': 'http://mmif.clams.ai/VERSION_SPEC/TimeFrame'}]

Using a wildcard in there or part of the version makes this a non-existing URL, which I don't like. An app assumes a particular version of MMIF, and should just also assume that its input uses that version of MMIF. The platform and the pipeline should take care of anything outside of that in case an upstream component uses an older version of MMIF.

What I am saying is that we should not put stuff in the input that does not fit there (like version ranges), but experiment with what happens when we build a pipeline with out-of-date upstream tools or run an app over old MMIF.

Of the three options above, this is closest to 3, I am afraid.

keighrim commented 3 years ago

I got your point. And I fully agree that having non-existing URLs shown in the MMIF is not good. Previously when I said,

just leave the full version in the string but document that @types in the input list must be considered with their compatible other version (not a good solution, I think)

the main reason I thought it wasn't a good solution was the lack of transparency when you put the full URL there. For example, when an app says it can take an input of http://.../1.0.3/vocab/TimeFrame, it can actually take inputs like http://.../1.0.0/vocab/TimeFrame, http://.../1.0.2/vocab/TimeFrame, or maybe even http://.../1.0.13/vocab/TimeFrame.
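To illustrate the kind of implicit compatibility I mean, here is a sketch of a check that keeps full URLs in the input list but ignores the patch number when comparing. The host example.org is a placeholder for the elided one, the function name is hypothetical, and "same major and minor means compatible" is an assumption, not a decided rule.

```python
import re

# Splits a type URL around a MAJOR.MINOR.PATCH path segment.
VERSIONED = re.compile(
    r"^(?P<prefix>.*/)(?P<major>\d+)\.(?P<minor>\d+)\.(?P<patch>\d+)(?P<rest>/.*)$"
)


def compatible(declared, actual):
    """True if both URLs name the same type with the same major.minor version."""
    d, a = VERSIONED.match(declared), VERSIONED.match(actual)
    if d is None or a is None:
        return declared == actual  # fall back to exact comparison
    return all(d.group(g) == a.group(g)
               for g in ("prefix", "major", "minor", "rest"))


declared = "http://example.org/1.0.3/vocab/TimeFrame"
print(compatible(declared, "http://example.org/1.0.0/vocab/TimeFrame"))   # True
print(compatible(declared, "http://example.org/1.0.13/vocab/TimeFrame"))  # True
print(compatible(declared, "http://example.org/1.1.0/vocab/TimeFrame"))   # False
```

Nothing in the declared URL tells the reader that 1.0.0 or 1.0.13 would also be accepted, which is the transparency problem.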

I wasn't thinking clearly about the issue with non-existing URLs at the time, and now I believe having the full version specified in the input list isn't that bad a solution compared to the others.

marcverhagen commented 3 years ago

Yeah, potential lack of transparency is indeed an issue. But it is somewhat hidden from the user because they do not put in a full URL but just a type from the annotations type list.

keighrim commented 3 years ago

Closed via #53.