Open marcverhagen opened 7 years ago
Correct me if I'm being too naive, but I think it's okay to return "NamedEntity" as produces metadata as long as we specify that the output format (discriminator) is something other than LIF. Even though the GATE tool returns ORG, PER, ..., they all fall into the category of http://vocab.lappsgrid.org/NamedEntity.html, at least conceptually. Isn't this what the LAPPS WSEV is about?
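To make the suggestion concrete, here is a minimal sketch of what such metadata could look like. The field layout ("produces" with "format" and "annotations"), the "gate" and "lif" format strings, and both helper functions are assumptions made for illustration; they are not taken from the actual LAPPS metadata schema or discriminator list.

```python
# Hypothetical metadata for a GATE NER wrapper. The field names and the
# format strings are illustrative placeholders, not the real LAPPS schema.
NAMED_ENTITY = "http://vocab.lappsgrid.org/NamedEntity"

gate_ner_metadata = {
    "produces": {
        # Placeholder discriminator: the point is only that it is not LIF.
        "format": ["gate"],
        # The vocabulary category that the native output conceptually falls under.
        "annotations": [NAMED_ENTITY],
    }
}


def produces_lif(metadata, lif_format="lif"):
    """Return True if the service declares LIF among its output formats."""
    return lif_format in metadata["produces"]["format"]


def produced_types(metadata):
    """Return the vocabulary types the service claims to produce."""
    return list(metadata["produces"]["annotations"])
```

Under this reading, a consumer would check the format first and only expect literal NamedEntity annotations in the payload when the format is LIF.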
No correction from me since I tend to agree with that. But I do want us to think about this a little bit. At the least this needs to be explained well somewhere in the LIF specifications or some other relevant spot.
I think that in simple cases like the GATE NER or the WebLicht tokenizer, simply using URLs from the WSEV is the best we can do, even though technically that is not what the services produce. However, I think we are going to have to give more thought to what "produces" means and how it is structured. For example, services that create multiple annotation types versus services that create one annotation type from a set of types.
We have always considered the metadata "produces" field to tell us what kind of vocabulary items are generated by a service. So we say vocab.lappsgrid.org/Token, assuming that the LIF structure contains annotations of that type. But we say this even for services that do not generate LIF, for example the GATE Named Entity Recognizer. This has recently become more problematic because GATE NER produces Person and Organization and the like, and the vocab does not have those anymore. We could have getMetadata() return NamedEntity instead.
In any case, we may have to think this through a bit more and we have to be at least clear about what "produces" means. In a way, if GATE NER produces NamedEntity annotations then what we really mean is that it produces something (GATE format with Person and Organization types) and that something can be translated into LIF with NamedEntity annotations. This is also an issue for WebLicht services since those do not produce LIF either.
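The "can be translated into LIF" reading could be sketched roughly as follows. The native annotation shape, the type-to-category mapping, the "category" feature name, and the to_lif_annotation helper are all hypothetical; this illustrates the idea and is not the converter that the grid actually uses.

```python
# Sketch of the "translatable into LIF" reading of the produces field.
# All names below are illustrative assumptions, not an existing API.
NAMED_ENTITY = "http://vocab.lappsgrid.org/NamedEntity"

# Assumed mapping from native GATE annotation types to a category value.
GATE_TO_CATEGORY = {
    "Person": "person",
    "Organization": "organization",
    "Location": "location",
}


def to_lif_annotation(native):
    """Map one native GATE-style annotation (a dict with type, start, end)
    to a LIF-style NamedEntity annotation, or None if the type is unmapped."""
    category = GATE_TO_CATEGORY.get(native["type"])
    if category is None:
        return None
    return {
        "@type": NAMED_ENTITY,
        "start": native["start"],
        "end": native["end"],
        # The original, more specific type survives as a feature.
        "features": {"category": category},
    }
```

For example, to_lif_annotation({"type": "Person", "start": 0, "end": 5}) yields a NamedEntity annotation whose category feature is "person", which is the sense in which the service could be said to produce NamedEntity without emitting LIF itself.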