m1ci closed this issue 7 years ago
One idea to make this simpler: we could model the information about which dataset or model was used to produce an annotation via the tool that produced the annotation. The tool is specified in #104. For example, if a service exposes named entity recognition with two different models, these can be considered two different tools. Anyone who really needs additional information about these tools (e.g. that they are exposed by the same service) can host it in a triple store.
I further suggest making this optional, to lower the entrance barrier to NIF. When a developer of a NIF service wants to give detailed provenance information, it is a good idea to have a standard for this. Forcing developers to support all provenance information might scare off programmers because it is quite complicated.
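The "one tool per model" idea above could be sketched roughly as follows. This is only an illustration: the URI scheme (service URL plus a fragment per model), the helper names, and the use of prov:wasAttributedTo to link an annotation to its tool are assumptions, not something defined in NIF or in #104.

```python
# Hypothetical sketch: a single NER service with two models is modelled
# as two distinct tools, each with its own URI.

def tool_uris(service_url, model_names):
    """Build one tool URI per model (assumed scheme: service URL + '#' + model)."""
    return [f"{service_url}#{name}" for name in model_names]

def provenance_triples(annotation_uri, tool_uri):
    """Return (subject, predicate, object) triples tying an annotation to a tool.

    The choice of prov:wasAttributedTo is an assumption made for illustration.
    """
    return [
        (tool_uri, "rdf:type", "prov:SoftwareAgent"),
        (annotation_uri, "prov:wasAttributedTo", tool_uri),
    ]

# Usage: the same service, exposed as two tools.
tools = tool_uris("http://example.org/ner", ["model-a", "model-b"])
triples = provenance_triples("http://example.org/doc#char=0,14", tools[0])
```

With this modelling, a consumer only needs the tool URI to know which model was involved; anything more detailed stays optional.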
Although we know from the shape of URI dbpedia:Diego_Maradona that it's a DBpedia resource
It is also possible that the link was produced by another dataset that uses DBpedia identifiers.
provenance discussion is over
@m1ci In your mail you detailed the question of this issue as:
To me there are two separate aspects/variants of interpretation for 'used to generate':
details for interpretation of link targets
Sticking with the football example:
Although we know from the shape of the URI dbpedia:Diego_Maradona that it's a DBpedia resource, we cannot tell whether the "target namespace" for the linking service was the most recent release of DBpedia, a previous release or even a customized compilation of DBpedia dataset downloads. To make this clear we would need a concept like a "scoped resource identifier" (just a tentative idea off the top of my head):
training/parameterisation dataset information
This information seems to fit most naturally as an additional piece of information for the prov:SoftwareAgent resource used for provenance:
If one wanted to simplify, one could consider freme:trainingDataset and freme:targetDataset properties that could be attached directly to the prov:SoftwareAgent description. However, these new vocab items would be out of scope for NIF, in my opinion, and would need to become part of a new ontology 'missing bits and pieces for FREME' ;-)
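To make the simplified variant concrete, here is a minimal sketch that renders a prov:SoftwareAgent description carrying the two proposed properties as Turtle. The freme: namespace URI, the function name and all example URIs are placeholders invented for illustration; only the prov: namespace and the property names mentioned above are taken as given.

```python
# Hypothetical sketch: attach freme:trainingDataset / freme:targetDataset
# directly to a prov:SoftwareAgent description and serialise it as Turtle.

PREFIXES = (
    "@prefix prov: <http://www.w3.org/ns/prov#> .\n"
    # Placeholder namespace: no FREME ontology URI is given in this thread.
    "@prefix freme: <http://example.org/freme#> .\n"
)

def describe_agent(agent_uri, training_dataset=None, target_dataset=None):
    """Return a Turtle description of the agent; both dataset links are optional."""
    parts = [f"<{agent_uri}> a prov:SoftwareAgent"]
    if training_dataset:
        parts.append(f"    freme:trainingDataset <{training_dataset}>")
    if target_dataset:
        parts.append(f"    freme:targetDataset <{target_dataset}>")
    # Predicate-object pairs for one subject are separated by ';' in Turtle.
    return " ;\n".join(parts) + " ."

turtle = PREFIXES + describe_agent(
    "http://example.org/tools/entity-linker",
    training_dataset="http://example.org/datasets/training-corpus",
    target_dataset="http://example.org/datasets/dbpedia-snapshot",
)
print(turtle)
```

Keeping both arguments optional matches the suggestion earlier in the thread that detailed provenance should not be mandatory.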