Open chin-rcip opened 4 years ago
When looking at the AAT data, I found that they also use PROV-O to track the creation and modification of each Concept.
They chose 2 simultaneous patterns, with PROV-O Refinements (https://www.w3.org/TR/2013/NOTE-prov-dc-20130430/#term_modified) and Dublin Core.
Looking at their approach, it seems I made a mistake in my earlier proposal. I had linked the creation event to one E73 Information Object and the modification to a different E73, even though it is the same E73 (either Named Graph or Record).
Following what the Getty did with the AAT, I would propose the following:
I am not sure if the property between the prov:Modify and the E73 is prov:wasGeneratedBy. In the documentation, it seems it should be that, but it seems a bit strange to me.
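To make the idea concrete, here is a minimal sketch of one possible reading of the pattern described in this comment: a single E73 carries the Dublin Core dates and is linked to both a creation and a modification activity. It is not the Getty's actual data. All `ex:` identifiers and dates are hypothetical, and the use of prov:wasGeneratedBy for the modification is exactly the open question raised above.

```python
# Triples as plain (subject, predicate, object) tuples.
# Every ex: identifier and every date below is made up for illustration.
RECORD = "ex:record/1"

triples = [
    (RECORD, "rdf:type", "crm:E73_Information_Object"),
    # Dublin Core pattern: simple dates directly on the record.
    (RECORD, "dct:created", "2019-06-01"),
    (RECORD, "dct:modified", "2020-01-15"),
    # PROV pattern: both activities point at the SAME E73 (assumption).
    (RECORD, "prov:wasGeneratedBy", "ex:activity/create-1"),
    ("ex:activity/create-1", "rdf:type", "prov:Create"),
    ("ex:activity/create-1", "prov:startedAtTime", "2019-06-01"),
    (RECORD, "prov:wasGeneratedBy", "ex:activity/modify-1"),
    ("ex:activity/modify-1", "rdf:type", "prov:Modify"),
    ("ex:activity/modify-1", "prov:startedAtTime", "2020-01-15"),
]


def activities_for(record, triples):
    """Map each activity linked to the record to its declared type."""
    linked = [o for s, p, o in triples
              if s == record and p == "prov:wasGeneratedBy"]
    return {
        activity: next(o for s, p, o in triples
                       if s == activity and p == "rdf:type")
        for activity in linked
    }


history = activities_for(RECORD, triples)
```

With this shape, the number of E73 instances stays at one no matter how many modifications are recorded; only the activity nodes multiply.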
Notes on verbal meeting 2020-02-17
It might be sufficient for what we mean to track.
A distinction was made between a persistent resource and a volatile, dynamic resource (e.g. a software program with versions): you want to be able to refer to each version (an E73 Information Object, but one that is not created ex nihilo), yet everything is linked back to the volatile object (the software program). Will we keep copies of chunks of metadata, or do we only want to know when the last update was?
When wanting to have all the information about a specific named graph, do you need a link between all the information objects?
The museums will need the older versions of their data at some point for sure. Whether it has to be in the LOD environment is another question.
The named graph could be online with the modification events while the copies of the data stay in our repositories. Or everything could be documented in the model with multiple E73 instances. This would multiply triples as well.
Illip. Museums will need to query older versions.
Habennin. will put links to Parthenos. How to manage massive integration long term; can create meta metadata for the data that we are transforming. Would require a separate triplestore with pointers and policies to handle them.
In the Getty vocabs we went for simplicity foremost, because PROV is quite complicated, e.g. see http://vocab.getty.edu/doc/#dct_modified
This is a complex topic, so here are just a few considerations:
Thank you @VladimirAlexiev for your input.
PROV-O is indeed a bit complicated and creates a lot of triples, but it's quite similar to CIDOC CRM. Would it be an option to have both PROV-O and dct?
For the question of where to have the named graph, I have created the issue #45. I would very much like your input on that important subject.
During our latest discussion (on the 23rd of March), we came to the conclusion that it would be best not to publish the older versions of the datasets, but to store them in a repository at CHIN (available if someone asks for them for historical purposes).
We need to investigate those implementations, thank you for the information!
General question: Does the pattern I proposed on the 15th of January make sense?
Regarding #45, we have decided to go with a Named Graph per dataset. We now need to clearly identify the updating process.
During our Semantic Committee meeting on 2021-01-07, while we were discussing Issue #10, the question of updates came up: in some use cases, keeping track of more than two roles (creator and provider) could be necessary in order to document updates made, for instance, by an artist to his/her data in the museum's dataset.
This highlights the need for having two "categories" of updates:
Update of records
In v1.5 of the Target Model, the history of a record (E73 Information Object) is only modeled by the E65 Creation event, and there is no way to document the history of the different versions of this record, which is a problem.
With CIDOC CRM
With CIDOC CRM, there is no way to render those updates, as the E11 Modification class refers only to the modification of E24 Physical Man-Made Thing.
With PROV-O
The PROV-O ontology, used to describe the named graph, can also be used to document the update of the entity. The property prov:wasRevisionOf creates a link between the creation version and the updated version of the record.
With CIDOC CRM-Dig
With CRM-Dig, if we instantiate the record and the named graph as digital objects, we could add the modification event. Nonetheless, that would create 2 entities for the record and for the named graph: the original one and the modified one. I'm not sure it would be the best way to model it.
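As a rough illustration of the prov:wasRevisionOf option discussed under PROV-O above, the sketch below treats each version of a record as its own entity and walks the revision links back to the original. The version identifiers are hypothetical, and the dictionary is only a stand-in for triples of the form "newer prov:wasRevisionOf older".

```python
# Each entry reads: <key> prov:wasRevisionOf <value>.
# The ex: version identifiers are invented for this example.
WAS_REVISION_OF = {
    "ex:record/1/v3": "ex:record/1/v2",
    "ex:record/1/v2": "ex:record/1/v1",
}


def revision_history(version, was_revision_of):
    """Follow prov:wasRevisionOf links from a version back to the original."""
    history = [version]
    seen = {version}
    while history[-1] in was_revision_of:
        older = was_revision_of[history[-1]]
        if older in seen:
            # Guard against malformed data that links versions in a loop.
            raise ValueError(f"cycle in revision chain at {older}")
        seen.add(older)
        history.append(older)
    return history


history = revision_history("ex:record/1/v3", WAS_REVISION_OF)
```

Note that this approach has the same cost raised for CRM-Dig: every update adds a new entity, so the number of record instances grows with the number of revisions.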
Named Graphs
The generation of the Named Graphs is documented with the prov:Activity->prov:generated->prov:Entity(Graph) pattern. But we could also document the creation and modification of the whole graph with a pattern similar to the one for the record. Do we need to document the updates of the Named Graph, though?
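For discussion, here is a minimal sketch of the prov:Activity->prov:generated->prov:Entity(Graph) pattern, extended to the case where we only want to know when a Named Graph was last updated. It assumes, purely for illustration, that an update is recorded as a further activity pointing at the same graph. Whether that is the right modeling is the open question above. The graph and activity identifiers and the dates are hypothetical.

```python
def generation_triples(activity, graph, started_at):
    """Triples for one prov:Activity -> prov:generated -> prov:Entity link."""
    return [
        (activity, "rdf:type", "prov:Activity"),
        (activity, "prov:startedAtTime", started_at),
        (activity, "prov:generated", graph),
        (graph, "rdf:type", "prov:Entity"),
    ]


def last_update(graph, triples):
    """Latest start time among the activities that generated the graph."""
    activities = {s for s, p, o in triples
                  if p == "prov:generated" and o == graph}
    times = [o for s, p, o in triples
             if s in activities and p == "prov:startedAtTime"]
    # ISO 8601 dates sort correctly as plain strings.
    return max(times) if times else None


GRAPH = "ex:graph/dataset-1"  # hypothetical Named Graph identifier
triples = (
    generation_triples("ex:activity/load-1", GRAPH, "2020-03-23")
    + generation_triples("ex:activity/update-1", GRAPH, "2021-01-07")
)
latest = last_update(GRAPH, triples)
```

If only the date of the last update matters, a query like `last_update` is all that is needed, and the older copies of the data can stay outside the published graph.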