Item - Githubissues

Discussion around the modelling of item, using the FIAF Cataloguing Manual as the primary source.

Item Elements

Define the item class.

<https://fiafcore.org/ontology/Item> a owl:Class ;
    rdfs:label "Item"@en ;
    dc:source "FIAF Cataloguing Manual 3.0"^^xsd:string .

3.1.1 Identifier 3.1.1.1 Identifier Type

At this level an identifier is less likely to correspond to an external resource and more likely to be an internal archival ID.

<https://fiafcore.org/ontology/hasIdentifier> a owl:ObjectProperty ;
    rdfs:label "Has Identifier"@en ;
    dc:source "FIAF Cataloguing Manual 3.1.1"^^xsd:string ;
    rdfs:domain fiaf:Item ;
    rdfs:range fiaf:Identifier .

3.1.2 Title 3.1.2.1 Title Type

<https://fiafcore.org/ontology/hasTitle> a owl:ObjectProperty ;
    rdfs:label "Has Title"@en ;
    dc:source "FIAF Cataloguing Manual 3.1.2"^^xsd:string ;
    rdfs:domain fiaf:Item ;
    rdfs:range fiaf:Title .

3.3.1 Agent(s) 3.3.1.1 Agent Activity

<https://fiafcore.org/ontology/hasActivity> a owl:ObjectProperty ;
    rdfs:label "Has Activity"@en ;
    dc:source "FIAF Cataloguing Manual 3.3.1"^^xsd:string ;
    rdfs:domain fiaf:Item ;
    rdfs:range fiaf:Activity .

3.1.7 Notes

As with work/variant and manifestation, all elements terminating in text blocks have been removed.

3.3 Relationships

Relationships are explicitly expressed elsewhere.

3.3.2 Events

<https://fiafcore.org/ontology/hasEvent> a owl:ObjectProperty ;
    rdfs:label "Has Event"@en ;
    dc:source "FIAF Cataloguing Manual 3.3.2"^^xsd:string ;
    rdfs:domain fiaf:Item ;
    rdfs:range fiaf:Event .

3.3.3 Other Relationships

Horizontal item relationships are not currently supported.

3.1.3 Holding Institution

Manual indicates text, but institution should be an entity. Also generalise range as institution so that these entities can be reused in another context.

<https://fiafcore.org/ontology/hasHoldingInstitution> a owl:ObjectProperty ;
    rdfs:label "Has Holding Institution"@en ;
    dc:source "FIAF Cataloguing Manual 3.1.3"^^xsd:string ;
    rdfs:domain fiaf:Item ;
    rdfs:range fiaf:Institution .

3.1.4 Element Type

Possibly remove element type in favour of making these item subclasses?

<https://fiafcore.org/ontology/hasElementType> a owl:ObjectProperty ;
    rdfs:label "Has Element Type"@en ;
    dc:source "FIAF Cataloguing Manual 3.1.4"^^xsd:string ;
    rdfs:domain fiaf:Item ;
    rdfs:range fiaf:ElementType .

3.1.5 Item Physical-Digital Description 3.1.5.1 Carrier Type 3.1.5.1.1 Carrier Type: General 3.1.5.1.2 Carrier Type: Specific

General and specific carrier types should be converted into a taxonomy of formats which are used at both this and manifestation level.

<https://fiafcore.org/ontology/hasFormat> a owl:ObjectProperty ;
    rdfs:label "Has Format"@en ;
    dc:source "FIAF Cataloguing Manual 3.1.5"^^xsd:string ;
    rdfs:domain fiaf:Item ;
    rdfs:range fiaf:Format .

3.1.5.3 Sound 3.1.5.5 Sound System 3.1.5.4 Sound Channel Configuration

<https://fiafcore.org/ontology/hasSoundCharacteristic> a owl:ObjectProperty ;
    rdfs:label "Has Sound Characteristic"@en ;
    dc:source "FIAF Cataloguing Manual 3.1.5.3"^^xsd:string ;
    rdfs:domain fiaf:Item ;
    rdfs:range fiaf:SoundCharacteristic .

3.1.5.6 Colour

<https://fiafcore.org/ontology/hasColourCharacteristic> a owl:ObjectProperty ;
    rdfs:label "Has Colour Characteristic"@en ;
    dc:source "FIAF Cataloguing Manual 3.1.5.6"^^xsd:string ;
    rdfs:domain fiaf:Item ;
    rdfs:range fiaf:ColourCharacteristic .

3.1.5.7 Unit Number

Expressed as extent.

3.1.5.8 Extent

This should encompass both unit counts (with type Reels, Rolls, etc) and durations (with type Minutes, Hours).

<https://fiafcore.org/ontology/hasExtent> a owl:ObjectProperty ;
    rdfs:label "Has Extent"@en ;
    dc:source "FIAF Cataloguing Manual 3.1.5.8"^^xsd:string ;
    rdfs:domain fiaf:Item ;
    rdfs:range fiaf:Extent .
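A minimal sketch of how a single extent node could carry both unit counts and durations. The properties `fiaf:hasExtentType` and `fiaf:hasExtentValue`, the terms `fiaf:Reels` and `fiaf:Minutes`, and the `ex:` namespace are assumptions made for illustration; none of them are defined in the draft.

```turtle
@prefix fiaf: <https://fiafcore.org/ontology/> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .
@prefix ex:   <https://example.org/> .

# Hypothetical: hasExtentType, hasExtentValue, Reels and Minutes are illustrative names only.
ex:item1 a fiaf:Item ;
    fiaf:hasExtent ex:extent1, ex:extent2 .

# A unit count: five reels.
ex:extent1 a fiaf:Extent ;
    fiaf:hasExtentType fiaf:Reels ;
    fiaf:hasExtentValue "5"^^xsd:integer .

# A duration: ninety-two minutes.
ex:extent2 a fiaf:Extent ;
    fiaf:hasExtentType fiaf:Minutes ;
    fiaf:hasExtentValue "92"^^xsd:integer .
```

Keeping the type on the extent node rather than on the property means 3.1.5.7 Unit Number and 3.1.5.11 Duration can both map to `hasExtent` without further properties.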

3.1.5.9 Projection Characteristics

Currently renamed as image characteristic, to allow for recording characteristics unrelated to projection.

<https://fiafcore.org/ontology/hasImageCharacteristic> a owl:ObjectProperty ;
    rdfs:label "Has Image Characteristic"@en ;
    dc:source "FIAF Cataloguing Manual 3.1.5.9"^^xsd:string ;
    rdfs:domain fiaf:Item ;
    rdfs:range fiaf:ImageCharacteristic .

3.1.5.10 Broadcast Standard

Vocabulary can be found under 3.1.5.10.

<https://fiafcore.org/ontology/hasBroadcastStandard> a owl:ObjectProperty ;
    rdfs:label "Has Broadcast Standard"@en ;
    dc:source "FIAF Cataloguing Manual 3.1.5.10"^^xsd:string ;
    rdfs:domain fiaf:Item ;
    rdfs:range fiaf:BroadcastStandard .

3.1.5.11 Duration 3.1.5.11.1 Duration Precision

Expressed as extent.

3.1.5.12 Frame Rate

The manual recommends drawing from a controlled vocabulary rather than allowing integer/float data.

<https://fiafcore.org/ontology/hasFrameRate> a owl:ObjectProperty ;
    rdfs:label "Has Frame Rate"@en ;
    dc:source "FIAF Cataloguing Manual 3.1.5.12"^^xsd:string ;
    rdfs:domain fiaf:Item ;
    rdfs:range fiaf:FrameRate .

3.1.5.13 Base

Vocabulary can be found under D.7.7.

<https://fiafcore.org/ontology/hasBase> a owl:ObjectProperty ;
    rdfs:label "Has Base"@en ;
    dc:source "FIAF Cataloguing Manual 3.1.5.13"^^xsd:string ;
    rdfs:domain fiaf:Item ;
    rdfs:range fiaf:Base .

3.1.5.14 Stock 3.1.5.15 Stock batch

A vocabulary of stocks can be found under D.7.16, noted as being extendable. Stock batch/code should be a datatype property terminating in strings.

<https://fiafcore.org/ontology/hasStock> a owl:ObjectProperty ;
    rdfs:label "Has Stock"@en ;
    dc:source "FIAF Cataloguing Manual 3.1.5.14"^^xsd:string ;
    rdfs:domain fiaf:Item ;
    rdfs:range fiaf:Stock .
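The stock batch datatype property mentioned above might be sketched as follows. The name `fiaf:hasStockBatch` is a hypothetical choice, and the `dc:` namespace IRI is an assumption, since the draft does not show its prefix declarations.

```turtle
@prefix fiaf: <https://fiafcore.org/ontology/> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .
@prefix dc:   <http://purl.org/dc/elements/1.1/> .   # assumed namespace for dc:source

# Hypothetical property name; terminates in a plain string rather than an entity.
fiaf:hasStockBatch a owl:DatatypeProperty ;
    rdfs:label "Has Stock Batch"@en ;
    dc:source "FIAF Cataloguing Manual 3.1.5.15"^^xsd:string ;
    rdfs:domain fiaf:Item ;
    rdfs:range xsd:string .
```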

3.1.5.16 Video Codec

A vocabulary can be found at D.7.10. Both this and Audio Codec should be subclasses of codec. Also worth considering: a single file can have multiple streams of different codecs, so a better model would possibly be item -> hasStream -> stream (type AudioStream) -> hasCodec -> WAV.

<https://fiafcore.org/ontology/hasStream> a owl:ObjectProperty ;
    rdfs:label "Has Stream"@en ;
    dc:source "FIAF Cataloguing Manual 3.1.5.16"^^xsd:string ;
    rdfs:domain fiaf:Item ;
    rdfs:range fiaf:Stream .
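To make the item -> hasStream -> stream -> hasCodec chain concrete, here is a rough sketch. Apart from `fiaf:hasStream`, `fiaf:Stream` and `fiaf:Item`, every class, property and individual below (including the codec individuals in the `ex:` namespace) is an assumption for illustration.

```turtle
@prefix fiaf: <https://fiafcore.org/ontology/> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix ex:   <https://example.org/> .

# Hypothetical stream and codec classes.
fiaf:Stream a owl:Class .
fiaf:VideoStream a owl:Class ; rdfs:subClassOf fiaf:Stream .
fiaf:AudioStream a owl:Class ; rdfs:subClassOf fiaf:Stream .

fiaf:Codec a owl:Class .
fiaf:VideoCodec a owl:Class ; rdfs:subClassOf fiaf:Codec .
fiaf:AudioCodec a owl:Class ; rdfs:subClassOf fiaf:Codec .

# Hypothetical property linking a stream to its codec.
fiaf:hasCodec a owl:ObjectProperty ;
    rdfs:domain fiaf:Stream ;
    rdfs:range fiaf:Codec .

# One file, two streams, each with its own codec (illustrative individuals).
ex:file1 a fiaf:Item ;
    fiaf:hasStream ex:file1video, ex:file1audio .

ex:file1video a fiaf:VideoStream ; fiaf:hasCodec ex:someVideoCodec .
ex:file1audio a fiaf:AudioStream ; fiaf:hasCodec ex:someAudioCodec .

ex:someVideoCodec a fiaf:VideoCodec .
ex:someAudioCodec a fiaf:AudioCodec .
```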

3.1.5.17 Audio Codec

See above.

3.1.5.18 Resolution

A vocabulary can be found at D.7.19. As some of these terms ("2k") can be contentious, a possible replacement could be literal pixel dimensions (eg 1920 by 1080).

<https://fiafcore.org/ontology/hasResolution> a owl:ObjectProperty ;
    rdfs:label "Has Resolution"@en ;
    dc:source "FIAF Cataloguing Manual 3.1.5.18"^^xsd:string ;
    rdfs:domain fiaf:Item ;
    rdfs:range fiaf:Resolution .
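If literal pixel dimensions were preferred over vocabulary terms, one possible shape is a pair of datatype properties. The names `fiaf:hasPixelWidth` and `fiaf:hasPixelHeight` are hypothetical, as is placing them directly on the item rather than on a resolution or stream node.

```turtle
@prefix fiaf: <https://fiafcore.org/ontology/> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .
@prefix ex:   <https://example.org/> .

# Hypothetical datatype properties for literal pixel dimensions.
fiaf:hasPixelWidth a owl:DatatypeProperty ;
    rdfs:domain fiaf:Item ;
    rdfs:range xsd:positiveInteger .

fiaf:hasPixelHeight a owl:DatatypeProperty ;
    rdfs:domain fiaf:Item ;
    rdfs:range xsd:positiveInteger .

# Illustrative item recorded as 1920 by 1080.
ex:file1 a fiaf:Item ;
    fiaf:hasPixelWidth "1920"^^xsd:positiveInteger ;
    fiaf:hasPixelHeight "1080"^^xsd:positiveInteger .
```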

3.1.5.19 Line Standard

Possible overlap with resolution for digital instances.

<https://fiafcore.org/ontology/hasLineStandard> a owl:ObjectProperty ;
    rdfs:label "Has Line Standard"@en ;
    dc:source "FIAF Cataloguing Manual 3.1.5.19"^^xsd:string ;
    rdfs:domain fiaf:Item ;
    rdfs:range fiaf:LineStandard .

3.1.5.20 Bit Depth

As with codec, there can be multiple bit depths under a single file (eg even just separate audio and video tracks). This follows the proposal to introduce a stream entity, which could also have a hasBitDepth property.
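A sketch of bit depth attached to streams rather than to the item. The property is modelled here as a datatype property with an integer range, which is an assumption; the stream subclasses and the `ex:` individuals are likewise illustrative.

```turtle
@prefix fiaf: <https://fiafcore.org/ontology/> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .
@prefix ex:   <https://example.org/> .

# Hypothetical: bit depth as an integer on the stream, not the item.
fiaf:hasBitDepth a owl:DatatypeProperty ;
    rdfs:domain fiaf:Stream ;
    rdfs:range xsd:positiveInteger .

# One file whose video and audio streams have different bit depths.
ex:file1 a fiaf:Item ;
    fiaf:hasStream ex:file1video, ex:file1audio .

ex:file1video a fiaf:Stream ;
    fiaf:hasBitDepth "10"^^xsd:positiveInteger .

ex:file1audio a fiaf:Stream ;
    fiaf:hasBitDepth "24"^^xsd:positiveInteger .
```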

3.1.5.2 Item Status

Controlled vocabulary under D.7.3. I feel "status" is possibly an ambiguous term.

<https://fiafcore.org/ontology/hasStatus> a owl:ObjectProperty ;
    rdfs:label "Has Status"@en ;
    dc:source "FIAF Cataloguing Manual 3.1.5.2"^^xsd:string ;
    rdfs:domain fiaf:Item ;
    rdfs:range fiaf:Status .

3.1.6.1 Item Condition

Would this property be better placed at carrier level as it pertains explicitly to the physical object?

3.1.6.2 Item Location

Would this property be better placed at carrier level as it pertains explicitly to the physical object?

3.1.5.21 Source Device

There is a small vocabulary at D.7.20. I would question whether this should be a device used to play back the material (as indicated in the vocabulary) or the device used to create the material (ie the source of the item).

<https://fiafcore.org/ontology/hasSourceDevice> a owl:ObjectProperty ;
    rdfs:label "Has Source Device"@en ;
    dc:source "FIAF Cataloguing Manual 3.1.5.21"^^xsd:string ;
    rdfs:domain fiaf:Item ;
    rdfs:range fiaf:SourceDevice .

3.1.5.22 Source Software

As above, I would claim that it is more interesting that a file was created with FFmpeg than that it can be played back with VLC.

<https://fiafcore.org/ontology/hasSourceSoftware> a owl:ObjectProperty ;
    rdfs:label "Has Source Software"@en ;
    dc:source "FIAF Cataloguing Manual 3.1.5.22"^^xsd:string ;
    rdfs:domain fiaf:Item ;
    rdfs:range fiaf:SourceSoftware .

3.1.5.23 Transfer Speed

I would question how often this data is retained by archives. If implemented, it could use the same FrameRate vocabulary as hasFrameRate.

<https://fiafcore.org/ontology/hasTransferSpeed> a owl:ObjectProperty ;
    rdfs:label "Has Transfer Speed"@en ;
    dc:source "FIAF Cataloguing Manual 3.1.5.23"^^xsd:string ;
    rdfs:domain fiaf:Item ;
    rdfs:range fiaf:FrameRate .

Other Properties

Support for an additional tier for carrier to represent physical item-part.

<https://fiafcore.org/ontology/hasCarrier> a owl:ObjectProperty ;
    rdfs:label "Has Carrier"@en ;
    rdfs:domain fiaf:Item ;
    rdfs:range fiaf:Carrier .

Welcome to the item discussion. A key question for me is which is the key attribute which is used to define the "type" of the item: the format (eg 35mm film), the element (eg original camera negative) - or neither and both are expressed as properties of the item.

Interestingly there is no "Item Type" listed in the manual (at least not in the Appendix K table at the end).

I think previously I had shortened 3.1.4 "Has Element Type" to get away from using "Type" outside of entity classification, but item "Has Element" is clearly wrong - the item is the element, eg my film is an 'original camera neg'. This makes me wonder whether item should have subclasses taken from D.7.8, and a reuse of the "has format" property shared with Manifestation.

Just adding some general comments on the item and carrier divide (or is item part better?)

Our use of the item and carrier level might be very carrier focused, but as we see it, there is very little information belonging at the item level. It is beneficial to have carrier information “summarised” on the item level, but we prefer entering it at the carrier.

For us the item level is merely what describes a unit of carriers. Typical attributes would be extent in carriers (eg. “this unit should consist of 5 carriers”), an item title (eg. “Film x, Print Y”), “function” (equivalent of element), as well as every bit of information describing the “whole”. For example, Norwegian censorship logs do not refer to works, but rather to particular prints, though not to particular carriers. The relationship to the log and the metadata found in these logs (eg. a summarised length given in metres) belong at the item level.

For us the carrier is the first entity in the system that actually describes something in the physical reality. The carriers we use say what the material actually is: base, gauge, film stock, aperture, measured length, conditions/treatments, carrier title. A lot of this information is often true for all carriers in an item, but it doesn’t have to be. Putting it at the item level causes a range of issues the moment it isn’t.

For first time cataloging it is often easy to describe the carrier in great detail, without knowing which item, manifestation or work it belongs to. We have 100k-200k film carriers in our collection with fairly detailed descriptions that are currently cataloged as orphaned film carriers. Having to deal with these as items at this stage of cataloging is not beneficial.

Depending on your CMS it might be difficult to work in a system where this information has to be entered multiple times for all carriers, or where it is not visible above the carrier level. These issues are CMS issues, not model issues, in my opinion. If a CMS is very rigid and you don’t want to go down this route, you should remember that a CMS is not the model. You can put a carrier attribute from the model at the CMS item entity. Similarly I guess you could go down the other route and put item attributes at the CMS carrier level, but I reckon that could cause some issues in a data exchange.

I don’t agree with most of this approach, unfortunately Torbjørn, and not just for CMS rigidity reasons.

We have many millions of Items and carriers, and the model you’re proposing was ruled out in early implementation stages. We are still developing our carrier record, but we aim to store carrier-specific data in the Carrier record (reel condition, reel extent, container barcode association, etc), while all data that is shared by all carriers would be stored in the Item: gauge, eg 35mm; description, eg Internegative; sound status and properties; colour status and properties; acquisition source, date, method; base, stock, etc.

I don’t think it makes any sense to describe shared properties of all carriers in each carrier – that’s the Item’s job…

But, of course, we may be on our own path with that, others may completely disagree about that! Good debate!

I am extremely sympathetic to pushing data down the tree, as it both gets closer to being verifiable against primary sources and allows tolerance for exceptions (eg mixed-base or mixed-gauge film items), but I would concede that both those exceptions are extremely rare, and a model designed with data exchange in mind should possibly follow the received wisdom of the day.

As mentioned in previous conversations I think one of the issues is with CMSs which do not allow higher levels to reach down and summarize data which is explicitly connected to lower tiers - would be interested if anyone has seen a system which does this?

@torbjornbp, I like your suggestion of the term "function", but I wonder how this works outside of the film context. One of the interesting things about D.7.8 is that it has equivalency between "internegative" and "DCP". When you see it from the perspective of a "function", "DCP" should maybe just be a (technology agnostic) "release print"?

I might be a carrier extremist, but I can add that our current CMS is working similarly to what @stephenmcconnachie is describing. However, doing that for the last 25 years is what is pointing us in a new direction. Acquisition source is an excellent example! It sits at the item level in our current database, but is very troublesome due to mixed acquisition sources for carriers within items.

We have slowly come to the realisation that a lot of attributes we currently put at the item level are actually not so uniform across carriers. A symptom is that a lot of vocabulary choices for “mixed/see notes” have appeared in our CMS over the years, making our controlled vocabularies less useful and searches worse.

You can solve some of these issues by allowing for multiple cardinality of these troublesome attributes at the item level, but it’s not a very good workaround as it lacks precision (without typing out more extensive notes): “Which carrier does an instance of an item attribute refer to?” We might put some information at both the item and the carrier.

I’m all for pragmatism though, so I reckon the model should allow for such attributes being available at both levels. At the moment not having the carrier level is more in accordance with the standard than having it!

@paulduchesne, I think we would use something akin to D.7.8. Even though “DCP” might be more specific and technically precise than “release print”, this is how we currently would do it.

I have a question regarding this property:

3.1.3 Holding Institution

Manual indicates text, but institution should be an entity. Also generalise range as institution so that these entities can be reused in another context.

<https://fiafcore.org/ontology/hasHoldingInstitution> a owl:ObjectProperty ;
    rdfs:label "Has Holding Institution"@en ;
    dc:source "FIAF Cataloguing Manual 3.1.3"^^xsd:string ;
    rdfs:domain fiaf:Item ;
    rdfs:range fiaf:Institution .

Why is the holding institution not considered an Agent (if Agent means both person or corporate body)? And having asked this, wasn't there an Agent class in the ontology before that is no longer included in the draft, or am I mistaken?

Why is the holding institution not considered an Agent (if Agent means both person or corporate body)?

Good question, especially as institutions can also hold other production credits (eg producing archival releases) and you would want these to be linked. I suppose you could maybe consider there being a distinction between institution as an organisational unit and a geographical location, but I would be in favour of what you are implying (BFI -> type -> organisation) and maybe reversing the direction? BFI -> has holding -> some film, or some film -> held at -> BFI? Generalizing to agent also allows for the hypothetical inclusion of films which are held by individuals in private collections.
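Both directions could be supported at once by declaring the two properties as inverses. This is only a sketch: `fiaf:heldAt`, `fiaf:hasHolding`, `fiaf:Agent`, `fiaf:Organisation` and the `ex:` individuals are hypothetical names, not terms from the draft.

```turtle
@prefix fiaf: <https://fiafcore.org/ontology/> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix ex:   <https://example.org/> .

# Hypothetical: organisations are one kind of agent.
fiaf:Agent a owl:Class .
fiaf:Organisation a owl:Class ; rdfs:subClassOf fiaf:Agent .

# some film -> held at -> BFI
fiaf:heldAt a owl:ObjectProperty ;
    rdfs:domain fiaf:Item ;
    rdfs:range fiaf:Agent ;
    owl:inverseOf fiaf:hasHolding .

# BFI -> has holding -> some film
fiaf:hasHolding a owl:ObjectProperty ;
    rdfs:domain fiaf:Agent ;
    rdfs:range fiaf:Item .

# Stating one direction is enough; an OWL reasoner can derive the other.
ex:BFI a fiaf:Organisation .
ex:someFilm a fiaf:Item ;
    fiaf:heldAt ex:BFI .
```

Because the range is the general agent class, the same property would also cover a private collector modelled as a person.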

And having asked this, wasn't there an Agent class in the ontology before that is no longer included in the draft, or am I mistaken?

I am pushing that we would want a node in between "work" and "agent". For example, we could express Hal Hartley directing Simple Men as a direct relationship like this:

flowchart LR
    SimpleMen --hasDirector--> HalHartley

But I am a real advocate of adding an extra entity in between these, which we can call an "activity" using terminology from the manual.

flowchart LR
    SimpleMen --hasContribution--> BlankNode_TypeDirector --hasAgent--> HalHartley

The purpose of this would be to allow the addition of data points which relate specifically to the intersection of the entities: for instance how much Hal Hartley was paid for this film, how he was credited, or if an actor: character name, screentime, credit ranking.
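In Turtle, the intermediate node could look roughly like this. The blank node stands in for the activity; `fiaf:hasActivityType`, `fiaf:hasAgent`, `fiaf:hasCreditedName`, `fiaf:Director`, `fiaf:Work` and the `ex:` individuals are assumptions for illustration, and the credited name is a made-up sample value.

```turtle
@prefix fiaf: <https://fiafcore.org/ontology/> .
@prefix ex:   <https://example.org/> .

# Hypothetical property and class names; only hasActivity and Activity appear in the draft.
ex:SimpleMen a fiaf:Work ;
    fiaf:hasActivity [
        a fiaf:Activity ;
        fiaf:hasActivityType fiaf:Director ;
        fiaf:hasAgent ex:HalHartley ;
        # Data about the intersection of work and agent lives on the activity node.
        fiaf:hasCreditedName "Hal Hartley"
    ] .
```

The direct `hasDirector` statement has nowhere to hang such intersection data short of reification, which is the main argument for the extra node.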

I am still trying to get my head around how to best express item type, format and other related tech attributes. Using this example from the Bundesarchiv XML:

<Exemplar uuid="56f74877-c131-4baa-8f22-4ee9802ddf42">
    <Medienart>
        FILM
    </Medienart>
    <Signatur>
        B-133150
    </Signatur>
    <ExemplarStatus>
        Unbekannt
    </ExemplarStatus>
    <Filmbreite>
        35 mm
    </Filmbreite>
    <Traeger>
        Triazetatzellulose
    </Traeger>

Signatur, ExemplarStatus, Traeger can all be expressed as Identifier, Status and Base respectively, but I am interested in Medienart and Filmbreite.

Different options:

1) 56f74877-c131-4baa-8f22-4ee9802ddf42 -> has type (rdf:type) -> Item
   56f74877-c131-4baa-8f22-4ee9802ddf42 -> has carrier type -> Film
   56f74877-c131-4baa-8f22-4ee9802ddf42 -> has gauge -> 35mm

2) 56f74877-c131-4baa-8f22-4ee9802ddf42 -> has type (rdf:type) -> Film (subclass of item)
   56f74877-c131-4baa-8f22-4ee9802ddf42 -> has gauge -> 35mm

3) 56f74877-c131-4baa-8f22-4ee9802ddf42 -> has type (rdf:type) -> 35mm Film (subclass of film, subclass of item)

I personally lean towards the third option as I feel it would aid querying (you can ask simply "show me all film" or "show me all 35mm film" as opposed to chaining requests: "show me all film" AND "has a gauge of 35mm"). This though does require a full taxonomy of item types, which taking table D.7.2 literally could look something like this:

graph TD;
    Item-->Film;
    Film-->35mmFilm;
    Film-->16mmFilm;
    Film-->Super16mmFilm;
    Film-->8mmFilm;
    Film-->Super8mmFilm;
    Film-->9.5mmFilm;
    Film-->17.5mmFilm;
    Film-->70mmFilm;
    Item-->Video;
    Video-->1InchVideo;
    Video-->Digibeta;
    Video-->BetacamSP;
    Video-->2InchVideo;
    Video-->HDCAMSR;
    Video-->D1;
    Video-->D5;
    Video-->DVCPROHD;

etc
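A fragment of that taxonomy expressed as subclass statements, as a sketch of the third option. The class names (`fiaf:Film`, `fiaf:Film35mm`, and so on) are hypothetical, and the UUID is written as a `urn:uuid:` IRI purely for illustration.

```turtle
@prefix fiaf: <https://fiafcore.org/ontology/> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

# Hypothetical item subclasses, following D.7.2 loosely.
fiaf:Film a owl:Class ; rdfs:subClassOf fiaf:Item .
fiaf:Film35mm a owl:Class ; rdfs:subClassOf fiaf:Film .
fiaf:Film16mm a owl:Class ; rdfs:subClassOf fiaf:Film .

fiaf:Video a owl:Class ; rdfs:subClassOf fiaf:Item .
fiaf:Digibeta a owl:Class ; rdfs:subClassOf fiaf:Video .

# Only the most specific type is asserted.
<urn:uuid:56f74877-c131-4baa-8f22-4ee9802ddf42> a fiaf:Film35mm .

# With RDFS subclass entailment, the same resource is also
# a fiaf:Film and a fiaf:Item, so "all film" and "all 35mm film"
# are each a single type pattern in a query.
```

Note that the simple "show me all film" query only works if the store applies subclass inference or the query walks `rdfs:subClassOf` itself.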

The other thing to throw in the mix, we agreed to add carrier - would not the formats above be best expressed at this level given they conform most immediately to the physical nature of the carrier? And if so, could you have your carrier type expressed as the item format, and then free up the item type to be taken from element type (or instantiation from 15907)?

I also wanted to highlight that the Bundesarchiv data has some interesting attributes at carrier level (or Aufbewahrungseinheit): colour, base and gauge. My pragmatic question would be, as these values can be expressed at either level and should be interchangeable in 99% of cases, could we not pick a single level ourselves and transform? To unpack this a bit: Archive A only has gauge information at item level, Archive B only at carrier. I think we should be picking one of these options and either dragging that data up or down, unless I have this wrong and the declaration at a different level actually is significant?

Just returning to item type and I can see a bit of a problem with having an overlapping vocabulary shared between item type and the format of manifestation hasFormat. This is a problem because if the values are shared (eg manifestation > hasFormat > video, and item > has item type > video), it becomes ambiguous whether video is a subclass of format or item.

Assuming we wish to keep both item and manifestation statements of format (which was generally supported in a previous discussion), I see two solutions.

1) create two distinct but overlapping vocabularies for the two classes format and item. I think this is messy and difficult to maintain for limited gain. It also seems incorrect given "video" is a (mostly) discrete concept.

2) allow for hasFormat as a property of both manifestation and item with the same vocabulary of formats (implied as ideal and actual, which should ultimately be a more explicit distinction).

Pathway 2 means that item again has no subclasses (ie there are no item types), which could allow for using element type in that capacity? This is interesting as I feel the element type says more about the conceptual purpose of the item, similar to how manifestation type primarily communicates function.
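One detail worth flagging for pathway 2: in RDFS, several `rdfs:domain` statements on one property are read conjunctively, so listing both classes separately would imply that anything with a format is both a manifestation and an item. A union class avoids that. The sketch below assumes a `fiaf:Manifestation` class exists alongside `fiaf:Item`.

```turtle
@prefix fiaf: <https://fiafcore.org/ontology/> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

# hasFormat shared by manifestation and item, with a single format vocabulary.
# The domain is the union of the two classes, not two separate domain statements.
fiaf:hasFormat a owl:ObjectProperty ;
    rdfs:label "Has Format"@en ;
    rdfs:domain [
        a owl:Class ;
        owl:unionOf ( fiaf:Manifestation fiaf:Item )
    ] ;
    rdfs:range fiaf:Format .
```

An alternative with the same effect would be to leave the domain unstated and rely on documentation or validation shapes instead.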

This does not strike me as untenable as the carrier tier is then the direct representation of the physical item, although unable to use the format vocabulary (as type) without striking exactly the same issues expressed above.

Applying this to the BA example is interesting because the element type field ("Materialart") is present at carrier level, so what I am proposing for this mapping would be to not only pull it up to item level, but in fact define the item type with an additional statement for the format:

item 56f74877-c131-4baa-8f22-4ee9802ddf42 -> has type (rdf:type) -> Bildduplikatpositiv (aka Image Dupe Pos)

item 56f74877-c131-4baa-8f22-4ee9802ddf42 -> hasFormat -> 35mm Film (subclass of film > format)

As to Item type - I wouldn't mix Item Element Type with Item Type. An Item can consist of more than one Item Element (something not grasped by EN, where instantiation type has cardinality zero or one). Original picture negative and original sound negative are 2 elements forming one Item. Perhaps there is no need to have Item type at all. (Manifestation Type is publication context type, not technical property.)

In our new system modelling, we propose Item consisting of one or more Subitems and Subitem consisting of one or more Item Elements. Subitem type for analogue film is image, soundtrack or composite, and Item element under Subitem = image is - for instance - original picture negative, and under Subitem = composite it is combined print. So there could be - for example - 1 Item with 1 Subitem = Image having two Item elements (original picture negative and duplicate negative) and 1 Subitem = soundtrack having one Item element (original sound negative). These 3 elements form 1 Item.

So Item Type in our system will be something like "Item model". The example above will be standard not-for-screening sound film model, whereas standard screening sound film model will have 1 subitem = composite with 1 element (composite print).

It may seem to be complicated for other archives but it actually could help us to prescribe allowed combinations of elements. Original picture negative and sound print cannot be combined in a standard model, for instance.
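The not-for-screening example above, written out as instance data to show the shape of the Item, Subitem and Item Element tiers. All names here (`fiaf:hasSubitem`, `fiaf:Subitem`, `fiaf:hasSubitemType`, `fiaf:hasItemElement`, the subitem types and the element classes) are hypothetical and do not come from the draft ontology or from the system being described.

```turtle
@prefix fiaf: <https://fiafcore.org/ontology/> .
@prefix ex:   <https://example.org/> .

# One Item made of two Subitems and three Item Elements (hypothetical terms throughout).
ex:item1 a fiaf:Item ;
    fiaf:hasSubitem ex:item1image, ex:item1sound .

# Subitem = image, holding two elements.
ex:item1image a fiaf:Subitem ;
    fiaf:hasSubitemType fiaf:Image ;
    fiaf:hasItemElement [ a fiaf:OriginalPictureNegative ] ,
                        [ a fiaf:DuplicateNegative ] .

# Subitem = soundtrack, holding one element.
ex:item1sound a fiaf:Subitem ;
    fiaf:hasSubitemType fiaf:Soundtrack ;
    fiaf:hasItemElement [ a fiaf:OriginalSoundNegative ] .
```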

Thank you @ladislav-nfa this is such an interesting perspective. I can't say I have heard of anyone grouping corresponding production components together, but I can see advantages (eg copying/digitising history).

My first pass did not have an item type, nor do we have a carrier type, which maybe we could just wear for now. In the vocabularies ticket I was coming around to the idea of the item type being the general carrier type (eg Film, Digital File, etc) which I think @stephenmcconnachie was alluding to last talk, but I don't know if there is much point if it can be unambiguously inferred from more granular format info. Or if more granular format data is missing: if we treat format specifics (eg 35mm film) as subclasses, then we can still retain information that an item is film even if we have no further info re gauge, base, etc.

If I understand correctly, your "subitem" concept is almost like a further tier sitting between item and carrier?

Just to revisit the example from further up the page, this would result in:

item 56f74877-c131-4baa-8f22-4ee9802ddf42 -> has type (rdf:type) -> fiaf:Item (no subclasses)

item 56f74877-c131-4baa-8f22-4ee9802ddf42 -> has format -> 35mm Film (subclass of film, subclass of format)
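As a rough Turtle rendering of that outcome: the format class names are hypothetical, the UUID is written as a `urn:uuid:` IRI for illustration, and pointing `hasFormat` at an anonymous instance of the format class is just one way to stay within a range of `fiaf:Format`.

```turtle
@prefix fiaf: <https://fiafcore.org/ontology/> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

# Hypothetical format taxonomy: the subclasses hang off Format, not Item.
fiaf:FilmFormat a owl:Class ; rdfs:subClassOf fiaf:Format .
fiaf:Film35mm a owl:Class ; rdfs:subClassOf fiaf:FilmFormat .

# The item itself is only ever typed as fiaf:Item.
<urn:uuid:56f74877-c131-4baa-8f22-4ee9802ddf42> a fiaf:Item ;
    fiaf:hasFormat [ a fiaf:Film35mm ] .
```

If gauge is unknown, the same pattern still records that the item is film by using the broader class (`[ a fiaf:FilmFormat ]` in this sketch).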

Need to implement the above proposal and then this issue can be closed.

FIAF / modelling-workshops

Item #7
