dasch-swiss / dsp-api

DaSCH Service Platform API
http://admin.dasch.swiss
Apache License 2.0

Preservation Metadata #843

Open subotic opened 6 years ago

subotic commented 6 years ago
lrosenth commented 6 years ago

This has to be done for each version of a resource – as we have versioning…

On 03.05.2018 at 16:21, Ivan Subotic <notifications@github.com> wrote:

We need to calculate and store fixity information (https://www.dpconline.org/handbook/technical-solutions-and-tools/fixity-and-checksums) for Knora resources. This is needed for the data repository side of Knora, so that we are able to check and prove that resources were not changed.
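As a rough illustration of what computing and checking fixity information could look like, here is a minimal Python sketch. It assumes a resource can be serialized to a canonical JSON form and that SHA-256 is an acceptable digest; the function names, the dictionary layout, and the example IRI are hypothetical and not part of Knora.

```python
import hashlib
import json


def fixity_checksum(resource: dict) -> str:
    """Return a SHA-256 hex digest over a canonical JSON serialization."""
    # Sorting keys and fixing separators makes the serialization deterministic,
    # so the same content always produces the same digest.
    canonical = json.dumps(
        resource, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def verify_fixity(resource: dict, stored_checksum: str) -> bool:
    """Check that the resource still matches the checksum stored earlier."""
    return fixity_checksum(resource) == stored_checksum


# Hypothetical resource; the structure is an assumption for illustration.
resource = {"iri": "http://example.org/resource/1", "values": {"title": "Example"}}
stored = fixity_checksum(resource)
```

Calling `verify_fixity(resource, stored)` later shows whether the content still matches. Any change to a value, however small, produces a different digest.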

benjamingeer commented 6 years ago

But we don't have versions of resources, only versions of values.

subotic commented 6 years ago

> But we don't have versions of resources, only versions of values.

Yes, we don't have explicit versions of resources, but implicitly (I think), any change to a value creates a new version of a resource.
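One way to picture this implicit versioning is to derive a resource-level identifier from the current versions of its values. This is only a sketch: the function name and the value-version IRIs are hypothetical, and it assumes each value version has its own stable IRI.

```python
import hashlib


def resource_version_id(value_version_iris):
    """Derive an implicit resource version ID from its current value versions."""
    digest = hashlib.sha256()
    # Sort so the result does not depend on the order the values are listed in.
    for iri in sorted(value_version_iris):
        digest.update(iri.encode("utf-8"))
        digest.update(b"\n")
    return digest.hexdigest()


# Hypothetical value-version IRIs before and after one value is edited.
v1 = resource_version_id(["http://example.org/value/a/1", "http://example.org/value/b/1"])
v2 = resource_version_id(["http://example.org/value/a/2", "http://example.org/value/b/1"])
```

Here `v1` and `v2` differ because one value moved to a new version, which is the sense in which a value change creates a new version of the resource.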

Yesterday, I had a long conversation with @lrosenth. This is the summary in very broad strokes. This is just a first broad draft and we still need to discuss if it is feasible:

subotic commented 6 years ago

I'm not sure how this will work (if at all) if we make changes to the data model and need to change the data.

Ok, now I'm definitely sure that this will not work. Any change to the data model that requires changes to the data will render all checksums invalid.
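The problem can be shown with a small Python sketch. It assumes checksums are computed over a serialization of the stored data, and the property rename from `title` to `hasTitle` is a made-up example of a data model migration.

```python
import hashlib
import json


def checksum(data: dict) -> str:
    """SHA-256 over a deterministic JSON serialization of the data."""
    canonical = json.dumps(data, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# State of a hypothetical resource when its checksum was stored.
before = {"title": "Example"}
stored = checksum(before)

# Hypothetical migration: the property is renamed, the content is unchanged.
after = {"hasTitle": before["title"]}

# The stored checksum no longer matches, although nothing was tampered with.
invalidated = checksum(after) != stored
```

The intellectual content is the same before and after, but the digest is computed over the representation, so every stored checksum would have to be recomputed after such a migration.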

@lrosenth Do we need to make our lives so hard and try to build a system that is at the same time a VRE (virtual research environment) and a Long-Term Data Archival Repository? Can't we separate those two? Basically, have an additional read-only layer that stores the data and the checksums on every change and allows us to recreate the repository for any point in time? Basically a "backup on steroids" solution. That way we could do whatever is needed for running the VRE in the upper VRE layer while being able to preserve any changes in the lower Repository layer.
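A very rough Python sketch of such a lower layer follows. Everything in it is an assumption for illustration: the class and method names are hypothetical, entries are kept in memory, changes are assumed to be recorded in chronological order, and each entry stores a full snapshot of the resource rather than a diff.

```python
import hashlib
import json
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class ArchiveEntry:
    timestamp: datetime
    resource_iri: str
    payload: str   # canonical JSON snapshot of the resource
    checksum: str  # SHA-256 of the payload


class ArchiveLayer:
    """Append-only log: entries are added on every change, never modified."""

    def __init__(self):
        self._entries = []

    def record(self, resource_iri, state, timestamp):
        payload = json.dumps(state, sort_keys=True, separators=(",", ":"))
        checksum = hashlib.sha256(payload.encode("utf-8")).hexdigest()
        self._entries.append(ArchiveEntry(timestamp, resource_iri, payload, checksum))

    def state_at(self, when):
        """Rebuild the repository as it was at the given point in time."""
        snapshot = {}
        for entry in self._entries:  # assumed to be in chronological order
            if entry.timestamp <= when:
                snapshot[entry.resource_iri] = json.loads(entry.payload)
        return snapshot

    def verify(self):
        """Recompute every checksum and compare it with the stored one."""
        return all(
            hashlib.sha256(e.payload.encode("utf-8")).hexdigest() == e.checksum
            for e in self._entries
        )


archive = ArchiveLayer()
t1 = datetime(2018, 5, 3, tzinfo=timezone.utc)
t2 = datetime(2018, 5, 4, tzinfo=timezone.utc)
archive.record("http://example.org/resource/1", {"title": "Draft"}, t1)
archive.record("http://example.org/resource/1", {"title": "Final"}, t2)
```

With this layout, `archive.state_at(t1)` returns the repository as it was after the first change, and `archive.verify()` checks the stored checksums without touching the upper VRE layer. Because archived payloads are never rewritten, a later data model change in the VRE would not invalidate checksums that were already stored.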

subotic commented 6 years ago

We also don't need to reinvent the wheel with regard to the data model for preservation metadata. The Library of Congress has a well-established standard called PREMIS, for which they also have an OWL ontology.