lapps / vocabulary-pages

DSL files and templates used to generate the LAPPS WS-EV pages.
Apache License 2.0

metadata for views #68

Open keighrim opened 6 years ago

keighrim commented 6 years ago

Related to #22/#40, and possibly #50 and #55.


The LIF documentation (this section) describes three keys for view metadata (producer, rules, type). While all of them are well defined (together with the four xTagSet-like keys), we are lacking:

  1. Any usage of the rules key in the tools we have wrapped.
  2. A specification of how to use the type key and of the context expansion of its values.
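For reference, here is a minimal sketch of where the three keys sit in a view and how one could check which of them a wrapped tool actually fills in. The nesting under metadata/contains follows my reading of the LIF documentation; the producer string and the `missing_keys` helper are hypothetical and only for illustration.

```python
import json

# Keys the LIF documentation describes for each entry under "contains".
DOCUMENTED_KEYS = ("producer", "type", "rules")

# Hypothetical view; the producer string is invented for illustration.
view = {
    "metadata": {
        "contains": {
            "http://vocab.lappsgrid.org/Token": {
                "producer": "org.example.lapps.OpenNlpTokenizer:1.0.0",
                "type": "tokenization:opennlp",
            }
        }
    },
    "annotations": [],
}


def missing_keys(view):
    """Map each annotation type in the view to the documented keys it omits."""
    contains = view.get("metadata", {}).get("contains", {})
    return {
        ann_type: [key for key in DOCUMENTED_KEYS if key not in info]
        for ann_type, info in contains.items()
    }


print(json.dumps(missing_keys(view), indent=2))
```

Running a check like this over the output of each wrapped tool would show whether rules is ever populated.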

My issues:

  1. Can rules be substituted with xTagSet keys? The current description does not look very different from specifying a tag/category/label set used by a tool or in a view:

    The documentation (if any) for the rules that were used to identify the annotations.

  2. Can type be expanded to a URI based on the json file's context, as stated in the documentation?

    The type key is used to specify what kind of token we are dealing with. It allows several tokenizers to specify the same type, for example if two tokenizers are both implementations of the OpenNLP tokenization scheme. In the example here the type key has the compact IRI value tokenization:opennlp, where tokenization refers to the tokenization key in the external context file in http://vocab.lappsgrid.org/context-1.0.0.jsonld ... tokenization:opennlp will be expanded to http://vocab.lappsgrid.org/types/tokenization/opennlp (DOESN'T EXIST!). The rules key inside of Token can be used to specify a rule set, in this case one defined by http://vocab.lappsgrid.org/types/tokenization/opennlp_basic (ALSO DOESN'T EXIST!).

    (parentheses are mine)

  3. What is type for anyway? Currently, a type value is free text and can be any string a developer wants to put there. To me, the specification reads as if type specifies which actual NLP software was used to create the annotations in this view, separately from the LAPPS tool (probably a wrapper of the actual tool), which is what the producer key is for. If that's the case, can we just put the URI of the original software's website or its source code repository, instead of a "compact IRI value" separately defined in the context?
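To make issues 2 and 3 concrete, here is a rough sketch of the compact IRI expansion that the quoted documentation describes. The context mapping is an assumption inferred from the expansion given in the quote, not copied from the actual context file, and `expand_compact_iri` is a hypothetical helper, not part of any LAPPS or JSON-LD library.

```python
def expand_compact_iri(value, context):
    """Expand 'prefix:suffix' against a JSON-LD-style context.

    Values without a known prefix, and absolute IRIs such as
    'http://...', are returned unchanged.
    """
    prefix, sep, suffix = value.partition(":")
    if not sep or suffix.startswith("//"):
        return value  # free text, or already an absolute IRI
    base = context.get(prefix)
    if isinstance(base, dict):
        base = base.get("@id")  # expanded term definition
    if not isinstance(base, str):
        return value  # prefix is not defined in the context
    return base + suffix


# Assumed mapping, inferred from the expansion quoted in the documentation.
context = {"tokenization": "http://vocab.lappsgrid.org/types/tokenization/"}

print(expand_compact_iri("tokenization:opennlp", context))
print(expand_compact_iri("https://opennlp.apache.org/", context))
print(expand_compact_iri("opennlp", context))
```

Under this reading, a plain URI of the original software would pass through untouched, while a free-text value would never resolve to anything, which is the gap described in issue 3.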

keighrim commented 6 years ago

This might be related to #62

keighrim commented 6 years ago

Additionally, we discussed a problem with the current implementation of the producer key: there's no easy way to enforce that the producer value is a unique and meaningful identifier. It may be a good idea to have a way to automatically generate the producer value from tool metadata (name, version, etc.).
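As a starting point for discussion, generating the value could look roughly like the sketch below. The "name:version" format, the metadata field names, and the `producer_from_metadata` helper are all assumptions for illustration, not an existing LAPPS API.

```python
def producer_from_metadata(metadata):
    """Build a producer string of the form 'name:version' from tool metadata.

    Raises ValueError when either field is missing or empty, so that a
    tool cannot emit a meaningless producer value.
    """
    name = str(metadata.get("name") or "").strip()
    version = str(metadata.get("version") or "").strip()
    if not name or not version:
        raise ValueError("tool metadata must provide both 'name' and 'version'")
    return f"{name}:{version}"


# Hypothetical tool metadata; the name is invented for illustration.
tool_metadata = {
    "name": "org.example.lapps.OpenNlpTokenizer",
    "version": "1.0.0",
}

print(producer_from_metadata(tool_metadata))
```

Uniqueness would still depend on tool names being unique across the grid; the sketch only guarantees that the value is derived from the metadata rather than typed by hand.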