Open phochste opened 2 years ago
I see that PREMIS is deliberately vague too but explains this vagueness:
Event contains the identifier of the Object involved. What is important is that this association is arbitrary and is not meant to imply that a particular implementation is required. The choice of semantic unit is down to individual implementations.
In some cases a semantic unit takes the form of a container that groups a set of related semantic units. For example, the semantic unit identifier groups the two semantic units identifierType and identifierValue. The grouped subunits are called semantic components of the container. Some containers are defined as extension containers, to allow the use of metadata encoded according to an external schema. This enables PREMIS to be extended with metadata elements that are more granular, non-core, or otherwise out of scope for the Data Dictionary.
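To make the container idea concrete, here is a minimal sketch of how the identifier container and an Event that refers to Objects only by identifier could be modelled. The class names, field names, and example values are hypothetical and chosen for illustration; PREMIS does not prescribe any particular implementation.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Identifier:
    """Container semantic unit grouping two semantic components."""
    identifier_type: str   # corresponds to identifierType, e.g. "URI"
    identifier_value: str  # corresponds to identifierValue


@dataclass
class Event:
    """An Event that points at the Objects involved by identifier only."""
    event_type: str
    linked_objects: List[Identifier] = field(default_factory=list)


# Hypothetical values, for illustration only.
artefact_id = Identifier("URI", "https://pod.example/artefacts/1")
event = Event("archive", [artefact_id])
```

The Event never needs to know what kind of thing the identifier resolves to, which matches the black-box view of an artefact.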
The question: is our definition deliberately made vague to accommodate all these use cases (and, as a corollary, is it vague enough in this regard), or do we really have a more formal understanding of what an artefact is (and what it is not)?
Yes, deliberately vague, and no, I don't think we need a formal understanding beyond 'it needs to be identifiable'. The reasoning is that our network should really be able to treat artefacts as black boxes. If not, a decent level of scalable interop will be hard to achieve. And since it doesn't matter to the network what the artefact is, you can use its components for a complex object, a file, a fragment, ... however...
It is quite possible that what an artefact is depends on the use case.
... the use case should probably be more specific about what the possible artefacts are.
If you just get a reference, then it is the name of the artefact in some pod that you can dereference. By dereferencing it, one can learn more about it:
* Is it a complex object?
* Is it a versioned object?
* Is it a fragment?
* Is it a particular representation of an object?
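As a rough illustration of answering the questions above after dereferencing, the sketch below inspects the artefact URL and the HTTP Link header returned for it. It assumes the pod exposes typed links: `item` as used by Signposting, and the version relations from RFC 5829. Treating `alternate` as a hint for "a particular representation" is an assumption of this sketch, as are the function names and the simplified header parsing (it does not handle commas or semicolons inside URLs).

```python
from typing import Set
from urllib.parse import urldefrag

# Link relations from RFC 5829 that signal versioning.
VERSION_RELS = {
    "latest-version",
    "predecessor-version",
    "successor-version",
    "version-history",
}


def link_relations(link_header: str) -> Set[str]:
    """Collect the rel values from an HTTP Link header (simplified parser)."""
    rels: Set[str] = set()
    for part in link_header.split(","):
        # Skip the <target> and look only at the parameters after it.
        for param in part.split(";")[1:]:
            name, _, value = param.strip().partition("=")
            if name.lower() == "rel":
                rels.update(value.strip('"').lower().split())
    return rels


def describe_artefact(url: str, link_header: str) -> Set[str]:
    """Return the traits that can be inferred for a dereferenced artefact."""
    rels = link_relations(link_header)
    _, fragment = urldefrag(url)
    traits: Set[str] = set()
    if "item" in rels:
        traits.add("complex object")
    if rels & VERSION_RELS:
        traits.add("versioned object")
    if fragment:
        traits.add("fragment")
    if "alternate" in rels:  # assumption: other representations exist
        traits.add("representation")
    return traits
```

A caller would pass the artefact URL together with the Link header value from the HTTP response. An empty result means the network learns nothing beyond the fact that the artefact is identifiable, which is all the generic base requires.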
Yep! But then we are moving beyond the scope of this project, or at least this 'generic base'.
We are not going to solve the problem of dealing with complex objects, but we need to be a bit clearer about what artefact can mean in our specs.
That's definitely a good idea. We can give concrete examples for the use cases.
Currently in Overview the artefact is defined as:
While this text is clear in our colloquial usage of the term in our discussions, it leaves the exact meaning of the term open to interpretation in light of lifecycle events and the possibility of complex objects. Even in our internal communications, the artefact sometimes means the PDF file, sometimes the PDF + metadata file, and sometimes the landing page (which is assumed to carry semantics, e.g. Signposting, that make clear what the composition of the complex-object artefact is).
E.g. when archiving an artefact, it is clear what this means for a single File / Bitstream and, to a lesser extent, for the Representation of the artefact. But in an archival context this becomes a bit of a slippery slope when talking about complex objects.
E.g. in the PREMIS sense, the artefact under consideration is something other than the smallest divisible unit on the network. They are talking about an Intellectual Entity:

In the context of interaction events (e.g. annotation of artefacts), the object of interaction can be a fragment of what we have in mind as an indivisible artefact.
E.g. in Web Annotation the target (which would correspond to the artefact in our case):
The question: is our definition deliberately made vague to accommodate all these use cases (and, as a corollary, is it vague enough in this regard), or do we really have a more formal understanding of what an artefact is (and what it is not)?
It is quite possible that what an artefact is depends on the use case. If you just get a reference, then it is the name of the artefact in some pod that you can dereference. By dereferencing it, one can learn more about it:
We are not going to solve the problem of dealing with complex objects, but we need to be a bit clearer about what artefact can mean in our specs.