Branch: Digital Image File

annabelleee commented 6 years ago

Steps to completion (updated):

[ ] Metadata
[ ] Diagram (using ARM WG notation)
[ ] Sample Data
[ ] Sample Concepts
[ ] JSON of Arches branch based on diagram in ARM WG github repo
[ ] Vote in comments to officially approve branch

azaroth42 commented 6 years ago

Features needed:

Class for the resource: For images, we use E36 Visual Item, as there is no distinguishing feature of E38 Image.
Relationship from other entities to the Image: P138 represents goes to E36 from E1.
The URI of the image is the URI of the E36.
Height/Width: Dimensions with Type for height and width, and a Unit of pixels.
Format: We import dc:format with a value of the content-type.

Then all of the core branches, such as Label, Name, Description, Identifier and Type, as needed.

CRMdig, as discussed, is mostly about the "steps and methods of production" of the file, not the file itself.

We also use dcterms:conformsTo when the image conforms to some standard, such as being a IIIF image service.
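A minimal sketch of how the features listed above could come together in one record. The top-level keys follow the instance data shown later in this thread; the inner dimension keys (`type`, `value`, `unit`), the `conforms_to` key, the pixel values and the helper names are assumptions made for illustration, not part of the agreed branch.

```python
import json


def make_dimension(kind, pixels):
    """Build one dimension entry; the key names here are assumed, not agreed."""
    return {"type": kind, "value": pixels, "unit": "pixels"}


def make_image(uri, label, media_type, width, height, conforms_to=None):
    """Assemble a VisualItem record from the features listed in this comment."""
    image = {
        "id": uri,                 # the URI of the image is the URI of the E36
        "type": "VisualItem",      # E36 Visual Item
        "label": label,
        "format": media_type,      # dc:format, holding the content-type string
        "dimension": [
            make_dimension("height", height),
            make_dimension("width", width),
        ],
    }
    if conforms_to is not None:
        # dcterms:conformsTo; the JSON key name is an assumption
        image["conforms_to"] = conforms_to
    return image


image = make_image(
    "http://arches.org/data/images/1.png",
    "Yet another picture of Rob's whiteboard",
    "image/png",
    width=1024,   # illustrative values only
    height=768,
)
print(json.dumps(image, indent=2))
```

Passing a value such as a IIIF Image API URI as `conforms_to` would cover the conformance case; what that value should look like is still open in this thread.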

azaroth42 commented 6 years ago

Proposed model diagram (without the Name, Description and Identifier branches, to make the core features easier to see):

[Diagram: dig_image_model]

workergnome commented 6 years ago

format is a string, not a lookup?

azaroth42 commented 6 years ago

Yeah, IANA media types (colloquially MIME types) aren't URIs. Instance data would be:

{
  "id": "http://arches.org/data/images/1.png",
  "type": "VisualItem",
  "label": "Yet another picture of Rob's whiteboard",
  "format": "image/png",
  "dimension": [
    { ... },
    { ... }
  ]
}

azaroth42 commented 6 years ago

@adamlodge The format string seems like the string enumeration datatype that you mentioned at the first face-to-face? It would be good to create a dropdown of the different image format media types to associate with this, rather than make people type in "image/png" or "image/jpeg"!
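As a rough illustration of the string enumeration idea, the sketch below restricts `format` to a fixed set of media types, which is what a dropdown would offer. The class name, the function name and the particular selection of four types are assumptions; this is not an Arches datatype, just plain Python.

```python
from enum import Enum


class ImageMediaType(str, Enum):
    """A small, illustrative subset of IANA image media types."""
    PNG = "image/png"
    JPEG = "image/jpeg"
    GIF = "image/gif"
    TIFF = "image/tiff"


def parse_format(value):
    """Return the matching enumeration member, or raise ValueError."""
    normalised = value.strip().lower()
    try:
        return ImageMediaType(normalised)
    except ValueError:
        allowed = ", ".join(member.value for member in ImageMediaType)
        raise ValueError(
            f"Unknown image media type {value!r}; expected one of: {allowed}"
        ) from None


# The values a dropdown would present to the user
choices = [member.value for member in ImageMediaType]
```

Free-typed input such as " IMAGE/JPEG " still resolves to the JPEG member, while anything outside the list is rejected with a message naming the allowed values.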

azaroth42 commented 6 years ago

Use cases for description pattern:

caption
alt text

Further expanded models:

Include the digital provenance of the image
Difference between abstract image content and serialized files. Many files can be instances of the same content.
long-cuts for the format / conforms_to short-cuts

azaroth42 commented 6 years ago

Consider IANA media types as RDM or string enumeration: https://www.iana.org/assignments/media-types/media-types.xhtml

Habennin commented 6 years ago

OK, so here's my counter-proposal. I think that what is of interest about the digital image is that it is a digital image, not the image itself (the content), so I would really argue for using CRMdig so that one can reference the digital object. Eventually, one would want to find all things that are digital objects via one class, without having to know that some visual images are also actually digital images (e.g. a query returns all PDFs, images, Word docs, Excel files and so on via the D1 class, not D1 and/or E36).

I agree about all the basic modelling steps that we have concluded are fundamental (name, type etc.)

What about taking this as an opportunity to improve CRMdig? If we go that way, we could make a subclass for digital image objects that was subclass of D1/E36 (thus getting our representation property back).
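To make the querying argument concrete, here is a minimal sketch in plain Python (no triple store) of how a class that is a subclass of both D1 and E36 would be picked up by a query on either parent. The class name `DigitalImage`, the sample records and the helper functions are all hypothetical.

```python
# Hypothetical hierarchy: the proposed class sits under both parents.
SUBCLASS_OF = {
    "DigitalImage": ["D1_Digital_Object", "E36_Visual_Item"],
}


def superclasses(cls):
    """Return cls together with every class it inherits from."""
    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(SUBCLASS_OF.get(current, []))
    return seen


def instances_of(cls, resources):
    """Return ids of resources whose type is cls or one of its subclasses."""
    return [r["id"] for r in resources if cls in superclasses(r["type"])]


# Made-up sample records
resources = [
    {"id": "image-1", "type": "DigitalImage"},
    {"id": "pdf-1", "type": "D1_Digital_Object"},
    {"id": "painting-content-1", "type": "E36_Visual_Item"},
]

digital_objects = instances_of("D1_Digital_Object", resources)
visual_items = instances_of("E36_Visual_Item", resources)
```

Here `image-1` appears in both result lists, so a single query on D1 finds every digital object while P138 represents remains available through E36.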

We could also start addressing the questions of format that you rightly raise above. Format is missing from CRMdig and should obviously be there (as a property of D1). dc:format, though, I would argue is poorly formulated. The term is fine, but if you read the definition it leads to exactly the problem we encountered in the call: by format we want to say MIME type, not the type of the physical thing or its dimensions (e.g. not 35mm; no digital image is 35mm, by definition).

I also agree with the need for the conforms-to property, but could you explain its meaning some more? Does it mean something like 'capable of being processed/loaded by'? If so, what is the range of this property? A type of service? A particular service? A piece of software?

I put the partially complete counter-proposal in the Google sheet. If we go in this direction, I think we need to propose a new digital image class under D1 and come up with two new properties for CRMdig proper that express the missing concepts of format and conforms-to, but with tighter semantics to avoid confusion.

https://docs.google.com/spreadsheets/d/1EYk1yhhBNWrKbVB0jV_1j8NJ-Egi5zWha7QrVvYvw3s/edit#gid=1807176856

archesproject / ARM_Working_Group

Branch: Digital Image File #23