FIAF / modelling-workshops

Modelling Workshops

Vocabularies #9

Closed paulduchesne closed 1 year ago

paulduchesne commented 1 year ago

@torbjornbp @natashafairbairn @circesanchez @annahoegner @Rose-EFG @stephenmcconnachie @ladislav-nfa as we are now at the halfway mark of the workshop series, and making good progress on the ontology structure, I thought it could be a good moment to look at available vocabularies to incorporate into this project.

Below is a list of defined classes, with notes on where vocabularies are available, giving preference to those found in the Cataloguing Manual. I might begin adding these in directly, and look forward to any discussion on which of these you think correspond to vocabularies used by institutions and which do not.

~

Work/Variant - actioned here. Both manual and EN15907 agree on a vocabulary of four values (Analytic, Collection, Monographic, Serial) which @natashafairbairn was keen to retain here.

Activity - actioned here. The FIAF Glossary of Filmographic Terms contains a rich, multi-tier and multilingual vocabulary of terms (mostly under section "B").

Agent - actioned here. The manual (1.4.1) lists a vocabulary of "Person", "Corporate Body", "Family" and "Person Group", but I wonder if this could be simplified to simply "Person" and "Organisation" as I have seen elsewhere? Although "Person Group" does address an ongoing problem where persons are credited in combination (eg "Coen Brothers", "The Wachowskis") and are not really a "person" nor an "organisation".

Country - actioned here. The obvious resource here is ISO-3166.

Event - actioned here. The manual (D.4) lists "Publication", "Award/Nomination", "Production", "Rights/Copyright/Registration", "Preservation", "Decision", "Manufacture", "Inspection" and "Acquisition". Each of these contains its own subclasses - my question here would be how these have been adopted by institutions, and whether we should support just the top level or the whole tree.

Form - actioned here. FIAF Glossary of Filmographic Terms contains a list of "Forms" under section "D".

Genre No obvious resources come to mind, would be interesting to dedicate an upcoming talk to compare vocabularies between our institutions and see if there is convergence (or not!).

Identifier - actioned here. The manual (1.3.1) lists ISAN, EIDR and VIAF. I would support adding Wikidata. There should also be support for internal identifiers here (eg BFI, NFA etc), but a technical question: how should these be modelled? It seems tedious to create "identifier" subclasses for each institution. One approach is to do more work on proper per-statement data provenance, and then just use an "internal identifier" class which can be used in combination with provenance data to derive the source of the identifier.
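A minimal sketch of the provenance-based approach, in Python. All class and field names here (`Provenance`, `asserted_by`, the "Internal" scheme label) are hypothetical and not part of the ontology; the only sourced values are the scheme names from the manual plus the proposed Wikidata addition.

```python
from dataclasses import dataclass
from typing import Optional

# Schemes named in the manual (1.3.1), plus the proposed Wikidata addition.
EXTERNAL_SCHEMES = {"ISAN", "EIDR", "VIAF", "Wikidata"}


@dataclass(frozen=True)
class Provenance:
    """Hypothetical per-statement provenance: who asserted the statement."""
    asserted_by: str


@dataclass(frozen=True)
class Identifier:
    value: str
    scheme: str  # one of EXTERNAL_SCHEMES, or the generic "Internal"
    provenance: Optional[Provenance] = None


def identifier_source(identifier: Identifier) -> str:
    """Return the issuing authority for an identifier.

    External schemes name their own authority; a generic internal
    identifier borrows it from the provenance of the statement.
    """
    if identifier.scheme in EXTERNAL_SCHEMES:
        return identifier.scheme
    if identifier.scheme == "Internal" and identifier.provenance is not None:
        return identifier.provenance.asserted_by
    raise ValueError(f"cannot derive source for scheme {identifier.scheme!r}")
```

With this shape, one "Internal" class serves every institution, at the cost that an internal identifier without provenance cannot be interpreted.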

Subject As a "subject" can be (to my understanding) virtually anything, hard to imagine a comprehensive vocabulary. Maybe the property can be kept open for any resource (eg Wikidata, LoC, etc).

Title - actioned here. The manual (A.2) lists "Alternative Title", "Identifying Title", "Other Title Information", "Preferred Title", "Supplied/Devised Title", "Title Proper". The manual also contains nine additional subclasses of "Alternative Title" (A.2.4.1). As with Event, is it useful to include the full set at this stage or just the top level?

Variant Type - actioned here. The manual (D.2) contains the following vocabulary: "Censored", "Dubbed", "Subtitled", "Abridged/Condensed", "Augmented", "Preservation/Restoration", "Different sound track", "Sonorized", "Colourized", "Black and white copy of work originally issued in colour".

~

Manifestation - actioned here. The manual (D.5) contains a vocabulary: "Pre-Release", "Theatrical Distribution", "Non-Theatrical Distribution", "Not For Release", "Unreleased", "Home Viewing Publication", "Broadcast", "Internet", "Preservation/Restoration", "Unknown Manifestation".

Colour Characteristic - actioned here and here. The manual (D.7.11) contains a vocabulary: "Colour", "Colour + Black & White", "Tinted", "Black and white", "Black and white (tinted)", "Black and white (toned)", "Black and white (tinted and toned)", "Sepia". My understanding is that the Colour Characteristic then has a "Has Colour Standard" property for technical specifics of the colour standards used, which has a vocabulary under D.7.12: "Pathécolor", "Technicolor", "Kinemacolor", "Anscocolor", "Ferraniacolor", "Fujicolor", "Kodachrome", "Eastmancolor", "RGB", "YUV".

Extent - actioned here. The manual contains a vocabulary of Unit Types (D.7.6): "Reel", "Roll", "Cassette", "Cartridge", "Loop", "Disc", "File", "Digital tape". The manual 2.3.5.2 also recommends including "metres" and "feet", and digital measures like "MB", "GB" and "TB". 2.3.5.3 recommends the inclusion of duration, so "seconds", "minutes", "hours".
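As a rough illustration of how these extent units could sit together, the sketch below groups them by kind so that a single item can carry several extents (for example a reel count, a length and a duration). The `Extent` class and the kind labels are assumptions for illustration only; the unit values are those quoted from the manual.

```python
from dataclasses import dataclass

# Values quoted from the manual; the grouping labels are illustrative.
UNITS_BY_KIND = {
    "unit type": {"Reel", "Roll", "Cassette", "Cartridge", "Loop",
                  "Disc", "File", "Digital tape"},   # D.7.6
    "length": {"metres", "feet"},                    # 2.3.5.2
    "data size": {"MB", "GB", "TB"},                 # 2.3.5.2
    "duration": {"seconds", "minutes", "hours"},     # 2.3.5.3
}

# Reverse lookup: unit label -> kind.
UNIT_KINDS = {unit: kind
              for kind, units in UNITS_BY_KIND.items()
              for unit in units}


@dataclass(frozen=True)
class Extent:
    value: float
    unit: str

    def __post_init__(self) -> None:
        # Reject units outside the vocabulary and nonsensical quantities.
        if self.unit not in UNIT_KINDS:
            raise ValueError(f"unknown extent unit: {self.unit!r}")
        if self.value < 0:
            raise ValueError("extent value cannot be negative")

    @property
    def kind(self) -> str:
        return UNIT_KINDS[self.unit]


# One item described three ways.
extents = [Extent(5, "Reel"), Extent(1500, "metres"), Extent(55, "minutes")]
```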

Format - actioned here. The manual (D.7.2) contains a table of "specific carrier types" which could be nested under the "general carrier types" to serve as format vocabulary. This is clearly an incomplete list, but could serve as a good starting point.

Image Characteristic - actioned here. The manual includes two vocabularies for Aspect Ratio (D.7.14) and Aperture Format (D.7.15). Aspect Ratios listed are: "2.34:1", "2.39:1", "2.52:1", "2.7:1", "4:3". I am wondering if the FIAF Projection Manual has a glossary which could also be used in this context? I feel like the Aperture vocabulary could be assessed critically as I feel it is maybe grouping a few different attributes together which are not siblings.

Language - actioned here. The obvious resource here is ISO-639.

Language Usage - actioned here. The manual (D.6) contains a vocabulary: "Dialogue language(s)", "Written languages", "Language(s) of summaries on containers", "Language(s) of accompanying material". Both "Dialogue" and "Written" have more granular entities.

Sound Characteristic - actioned here, here and here. The manual (D.7.4) contains a vocabulary of sound types: "Sound", "Silent", "Mute", "Combined", "Combined as Mute", "Combined as Sound", "Mixed", "Temporary". It would be good to have clarification of the definition of some of these terms (eg "combined" as in "combined with image"?). Once the general nature of the sound has been declared there are additional properties for "Sound Configuration" (manual 3.1.5.4 suggests "mono" and "stereo") and "Sound system" which has a vocabulary under D.7.13: "Dolby SR", "Dolby Digital", "Mute", "Combined Magnetic Sound", "Combined Optical Sound", "VA RCA Duplex". There is also a vocabulary for "Sound Fixation" at D.7.5: "Needle sound", "Optical", "Magnetic", "Analogue sound", "Digital".

~

Item As discussed in the Item discussion I was in favour of adopting D.7.8 from the manual as the item vocabulary: "Colour Positive", "Colour Negative", "Copper Toned Positive", "Cyan Matrix", "Direct BW Positive", "Original negative", "Duplicate negative", "Positive", "Original positive (reversal film)", "Duplicate positive", "Lavender", "Image negative", "Sound negative", "DCP". However much of this is very film-specific, so it would possibly be better expressed as a separate property with "general carrier type" acting as "item type": "Film", "Video Tape", "Video Disc", "Digital Tape", "Digital Disc", "Digital File".

Base - actioned here. The manual (D.7.7) contains a vocabulary of bases: "Acetate", "Acrylic", "CTA", "Diacetate", "Mainly safety", "Mainly nitrate", "Mixed", "Mylar", "Nitrate", "Polyester", "PVC", "Safety", "Video", "Vinyl".

Broadcast Standard - actioned here. The manual (3.1.5.10) contains a vocabulary: "NTSC", "PAL", "SECAM".

Institution - actioned here. A good suggestion from @Rose-EFG is that there should be no "institution" class, instead institutions should be organisation "agents". FIAF itself holds a good vocabulary of member institutions.

Line Standard - actioned here. The manual (D.7.21) contains a vocabulary: "405", "525", "625", "720", "1080"

Resolution - actioned here. The manual (D.7.19) contains a vocabulary: "Standard Definition", "High Definition", "2k", "4k", "6k", "8k".

Source Device - actioned here. The manual (D.7.20) contains a vocabulary: "DVSI", "VT20", "HDCAM SRW5500/2"

Source Software - actioned here. No vocabulary in the manual, I would suggest "FFmpeg" and other transcoder/editors as a good start.

Status - actioned here. The manual (D.7.3) contains a vocabulary: "Master", "Viewing", "Accessioned", "On Loan", "Status pending", "Removed".

Stock - actioned here. The manual (D.7.16) contains an extensive vocabulary. As there is a lot of overlap between manufacturers (eg Fuji produced film and videotape, 3M video and audiotape) I would question whether these should be nested or flat. An additional property should be Stock Batch (or maybe "stock value", "stock specific"?) which contains a string of the exact stock (eg "Panchromatic Separation Film 2238").
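To make the nested-versus-flat question concrete, here is a small sketch of the flat option, where manufacturer and medium are independent properties of each stock entry rather than levels of a tree. The `Stock` class and its field names are hypothetical; the manufacturer and medium pairs are the ones mentioned above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Stock:
    manufacturer: str
    medium: str
    stock_batch: Optional[str] = None  # free-text string for the exact stock


# Flat list: a manufacturer can appear under several media without
# duplicating a branch of a hierarchy.
STOCKS = [
    Stock("Fuji", "Film"),
    Stock("Fuji", "Videotape"),
    Stock("3M", "Videotape"),
    Stock("3M", "Audiotape"),
]


def media_for(manufacturer: str, stocks: List[Stock]) -> List[str]:
    """All media a manufacturer is recorded against."""
    return sorted({s.medium for s in stocks if s.manufacturer == manufacturer})


def manufacturers_for(medium: str, stocks: List[Stock]) -> List[str]:
    """All manufacturers recorded against a medium."""
    return sorted({s.manufacturer for s in stocks if s.medium == medium})
```

In a flat list both lookups are equally cheap, whereas a tree nested by manufacturer makes the second one awkward, and a tree nested by medium the first.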

Stream - actioned here, here and here. This is the biggest diversion from the manual, and I would propose that a stream can be of type "Audio Stream", "Video Stream", "Subtitle Stream" (or "track"?). Additional properties are "Has Codec", which has vocabularies under D.7.10 in the manual which clearly need expansion. Also "Has Bit Depth", with a vocabulary present at D.7.17 of the manual.
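One way the proposed stream types might be populated in practice is from the JSON report that `ffprobe -show_streams -of json` produces, which labels each stream with a `codec_type` of "audio", "video" or "subtitle" (among others). The sketch below parses such a report; the `Stream` class, the mapping and the sample report are illustrative assumptions, not real catalogue data or part of the ontology.

```python
import json
from dataclasses import dataclass
from typing import List, Optional

# ffprobe codec_type -> proposed stream type.
STREAM_TYPES = {
    "audio": "Audio Stream",
    "video": "Video Stream",
    "subtitle": "Subtitle Stream",
}


@dataclass(frozen=True)
class Stream:
    stream_type: str
    codec: Optional[str]      # candidate value for "Has Codec"
    bit_depth: Optional[int]  # candidate value for "Has Bit Depth"


def streams_from_ffprobe(report: str) -> List[Stream]:
    """Build Stream records from an ffprobe JSON report string."""
    streams = []
    for entry in json.loads(report).get("streams", []):
        stream_type = STREAM_TYPES.get(entry.get("codec_type"))
        if stream_type is None:
            continue  # e.g. data or attachment streams are outside the proposal
        depth = str(entry.get("bits_per_raw_sample", ""))
        streams.append(Stream(
            stream_type,
            entry.get("codec_name"),
            int(depth) if depth.isdigit() else None,
        ))
    return streams


# Invented sample report for illustration only.
SAMPLE = json.dumps({"streams": [
    {"codec_type": "video", "codec_name": "ffv1", "bits_per_raw_sample": "10"},
    {"codec_type": "audio", "codec_name": "pcm_s24le"},
]})
```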

~

Carrier No vocabularies expressed.

natashafairbairn commented 1 year ago

Hi Paul,

Have put a few points in reply in body of your email below.

Hope to see everyone later on at the workshop meeting today.

Best, Natasha

From: Paul Duchesne; Sent: 14 February 2023 19:47; To: FIAF/modelling-workshops; Cc: Natasha Fairbairn; Subject: [FIAF/modelling-workshops] Vocabularies (Issue #9)

~

Agent The manual (1.4.1) lists a vocabulary of "Person", "Corporate Body", "Family" and "Person Group", but I wonder if this could be simplified to simply "Person" and "Organisation" as I have seen elsewhere? Although "Person Group" does address an ongoing problem where persons are credited in combination (eg "Coen Brothers", "The Wachowskis") and are not really a "person" nor an "organisation".

Person Group is also needed for music groups, which may be linked to Works as either cast or credits, or where a Music Works database also exists in an institution's systems with links to Moving Image Works. We have “Person”, “Organisation” and “Group” designations on records in our names database (also Animal – e.g. the name record for Scruffy, a dog featured in and credited as playing the role “Patsy the dog” in the 1937 film Storm in a Teacup – but let’s not complicate things).

Event The manual (D.4) lists "Publication", "Award/Nomination", "Production", "Rights/Copyright/Registration", "Preservation", "Decision", "Manufacture", "Inspection" and "Acquisition". Each of these contains its own subclasses - my question here would be how these have been adopted by institutions, and whether we should support just the top level or the whole tree.

BFI uses Award/Nomination, Production (which is where location information and production dates can be found) and Publication (e.g. screenings at BFI Southbank cinemas, and the slots in which a Work has been transmitted on television; UK television sometimes has themed day or week programmes under what it calls slots or seasons, e.g. a Sci-Fi Weekend in which various TV programmes and films with a science fiction theme are shown, often with a brief introductory programme beforehand). Technically speaking, and being purist, screenings would be associated with a Manifestation, but for us it was both easier to migrate legacy data by linking to the Work and also better for users (both internal and external), who can find all the information together in one place rather than scattered across multiple Manifestations of a Work.

With Manifestations we have also linked to BBFC classification (which I presume comes under “Registration”).

The other archive-related events such as Preservation, Inspection, Acquisition are all fields within Items rather than separate Events. We also have a separate Accessions database that Items can be linked to.

That is just the BFI though – other archives may have developed and used Events more.

Genre No obvious resources come to mind, would be interesting to dedicate an upcoming talk to compare vocabularies between our institutions and see if there is convergence (or not!).

There is no one standard or list; most institutions follow their own, or ones based on others (e.g. adapted Library of Congress ones). There will be some crossover/commonality in the terms used, but I suspect there are subtle differences in definitions for some. A lot will depend on collections and user need, and in some institutions Form and Genre can sit together in one list/field.

Subject As a "subject" can be (to my understanding) virtually anything, hard to imagine a comprehensive vocabulary. Maybe the property can be kept open for any resource (eg Wikidata, LoC, etc).

Some archives may use other library standards for Subjects as well, e.g. Dewey, UDC, etc. (For its Subject thesaurus construction and architecture the BFI utilises UDC terms and numbers, adapted to our needs and requirements rather than straightforwardly replicated, so it uses a hybrid UDC/in-house basis.)

Variant Type The manual (D.2) contains the following vocabulary: "Censored", "Dubbed", "Subtitled", "Abridged/Condensed", "Augmented", "Preservation/Restoration", "Different sound track", "Sonorized", "Colourized", "Black and white copy of work originally issued in colour".

Re. all the different terms/values featured in section D of the FIAF Moving Image Cataloguing Manual – it is important to remember that these are purely examples and not full lists. As stated at the beginning of that section: “The value lists provided in this appendix are usually limited to a minimum of five examples if more comprehensive lists are available. If no pre-existing and authoritative lists are available, a non-exhaustive but more comprehensive set of terms is provided.”

There is also the added footnote “It is recognised that vocabulary lists often require frequent updates, additions or amendments. For this reason, should resources permit, it would be ideal to separate value lists from the rules and locate them in a central, online repository, like metadataregistry.org. RDF-based repositories like this can supply up-to-date vocabularies on demand and have additional advantages over traditional value lists such as those found in this Appendix”. There were hopes to tie in with the FIAF glossaries of terms where possible.

Stream This is the biggest diversion from the manual, and I would propose that a stream can be of type "Audio Stream", "Video Stream", "Subtitle Stream" (or "track"?). Additional properties are "Has Codec", which has vocabularies under D.7.10 in the manual which clearly need expansion. Also "Has Bit Depth", with a vocabulary present at D.7.17 of the manual.

The manual is weak on digital-related vocabularies and sections, so it would be good to get some vocabularies for those.


paulduchesne commented 1 year ago

Thank you for these responses @natashafairbairn, I have tried to format them below and will respond shortly.

Agent: The manual (1.4.1) lists a vocabulary of "Person", "Corporate Body", "Family" and "Person Group", but I wonder if this could be simplified to simply "Person" and "Organisation" as I have seen elsewhere? Although "Person Group" does address an ongoing problem where persons are credited in combination (eg "Coen Brothers", "The Wachowskis") and are not really a "person" nor an "organisation".

Person Group is also needed for music groups, which may be linked to Works as either cast or credits, or where a Music Works database also exists in an institution's systems with links to Moving Image Works. We have “Person”, “Organisation” and “Group” designations on records in our names database (also Animal – e.g. the name record for Scruffy, a dog featured in and credited as playing the role “Patsy the dog” in the 1937 film Storm in a Teacup – but let’s not complicate things).

Event: The manual (D.4) lists "Publication", "Award/Nomination", "Production", "Rights/Copyright/Registration", "Preservation", "Decision", "Manufacture", "Inspection" and "Acquisition". Each of these contains its own subclasses - my question here would be how these have been adopted by institutions, and whether we should support just the top level or the whole tree.

BFI uses Award/Nomination, Production (which is where location information and production dates can be found) and Publication (e.g. screenings at BFI Southbank cinemas*, and the slots in which a Work has been transmitted on television; UK television sometimes has themed day or week programmes under what it calls slots or seasons, e.g. a Sci-Fi Weekend in which various TV programmes and films with a science fiction theme are shown, often with a brief introductory programme beforehand).

*technically speaking, and being purist, screenings would be associated with a Manifestation, but for us it was both easier to migrate legacy data to link to the Work and also better for users (both internal and external) who would be able to find all the information together in one place rather than scattered across multiple Manifestations of a Work.

With Manifestations we have also linked to BBFC classification (which I presume comes under “Registration”).

The other archive-related events such as Preservation, Inspection, Acquisition are all fields within Items rather than separate Events. We also have a separate Accessions database that Items can be linked to.

That is just the BFI though – other archives may have developed and used Events more.

Genre: No obvious resources come to mind, would be interesting to dedicate an upcoming talk to compare vocabularies between our institutions and see if there is convergence (or not!).

There is no one standard or list; most institutions follow their own, or ones based on others (e.g. adapted Library of Congress ones). There will be some crossover/commonality in the terms used, but I suspect there are subtle differences in definitions for some. A lot will depend on collections and user need, and in some institutions Form and Genre can sit together in one list/field.

Subject: As a "subject" can be (to my understanding) virtually anything, hard to imagine a comprehensive vocabulary. Maybe the property can be kept open for any resource (eg Wikidata, LoC, etc).

Some archives may use other library standards for Subjects as well, e.g. Dewey, UDC, etc. (For its Subject thesaurus construction and architecture the BFI utilises UDC terms and numbers, adapted to our needs and requirements rather than straightforwardly replicated, so it uses a hybrid UDC/in-house basis.)

Variant Type: The manual (D.2) contains the following vocabulary: "Censored", "Dubbed", "Subtitled", "Abridged/Condensed", "Augmented", "Preservation/Restoration", "Different sound track", "Sonorized", "Colourized", "Black and white copy of work originally issued in colour".

Re. all the different terms/values featured in section D of the FIAF Moving Image Cataloguing Manual – it is important to remember that these are purely examples and not full lists. As stated at the beginning of that section: “The value lists provided in this appendix are usually limited to a minimum of five examples if more comprehensive lists are available. If no pre-existing and authoritative lists are available, a non-exhaustive but more comprehensive set of terms is provided.”

There is also the added footnote “It is recognised that vocabulary lists often require frequent updates, additions or amendments. For this reason, should resources permit, it would be ideal to separate value lists from the rules and locate them in a central, online repository, like metadataregistry.org. RDF-based repositories like this can supply up-to-date vocabularies on demand and have additional advantages over traditional value lists such as those found in this Appendix”. There were hopes to tie in with the FIAF glossaries of terms where possible.
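To give a feel for what "separating value lists from the rules" could look like in RDF, the sketch below writes one of the manual's example lists out as SKOS in Turtle syntax using plain string formatting. The base IRI `https://example.org/vocab/` and the function names are placeholders, not an agreed namespace or tooling choice; the SKOS terms themselves (`skos:ConceptScheme`, `skos:Concept`, `skos:prefLabel`, `skos:inScheme`) are standard.

```python
import re
from typing import Iterable


def slug(label: str) -> str:
    """Lower-case a label and replace runs of non-alphanumerics with '-'."""
    return re.sub(r"[^a-z0-9]+", "-", label.lower()).strip("-")


def quote(text: str) -> str:
    """Escape backslashes and double quotes for a Turtle string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def to_skos_turtle(scheme: str, labels: Iterable[str],
                   base: str = "https://example.org/vocab/") -> str:
    """Serialise a flat value list as a SKOS concept scheme in Turtle."""
    scheme_iri = f"<{base}{slug(scheme)}>"
    lines = [
        "@prefix skos: <http://www.w3.org/2004/02/skos/core#> .",
        "",
        f"{scheme_iri} a skos:ConceptScheme ;",
        f'    skos:prefLabel "{quote(scheme)}"@en .',
    ]
    for label in labels:
        lines += [
            "",
            f"<{base}{slug(scheme)}/{slug(label)}> a skos:Concept ;",
            f'    skos:prefLabel "{quote(label)}"@en ;',
            f"    skos:inScheme {scheme_iri} .",
        ]
    return "\n".join(lines) + "\n"


# Example list taken from the manual (3.1.5.10).
turtle = to_skos_turtle("Broadcast Standard", ["NTSC", "PAL", "SECAM"])
```

Because each term gets its own IRI, additions or amendments to a list do not require touching the cataloguing rules themselves.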

paulduchesne commented 1 year ago

Person Group is also needed for music groups, which may be linked to Works as either cast or credits, or where a Music Works database also exists in an institution's systems with links to Moving Image Works. We have “Person”, “Organisation” and “Group” designations on records in our names database (also Animal – e.g. the name record for Scruffy, a dog featured in and credited as playing the role “Patsy the dog” in the 1937 film Storm in a Teacup – but let’s not complicate things).

Thank you for the clarification. Can I ask about "family", is this meant as a literal family? Another concern was that "Corporate Body" would not include informal, non-corporate organisations (eg artist collectives, musical groups) so it makes sense that "Person Group" can cover those entities.

BFI uses Award/Nomination, Production (which is where locations information and production dates can be found), Publication (e.g. screenings at BFI Southbank cinemas*...

Could I ask if you also use the event vocabularies (for example the "D.11 Production Event Types")? As mentioned I realise there is a bit of overlap between the Manifestation type (going with our current definition, often linked closely to a discrete event) and the event itself.

Re. all the different terms/values featured in section D of the FIAF Moving Image Cataloguing – it is important to remember that these are purely examples and not full lists.

I see this as being a key strategic question for moving forward. My initial thought was that the first iteration should be without vocabularies, but I am now thinking that maybe a good way forward is to use these example vocabularies as a starting point, take them out into the world and inevitably iterate, and developments could also maybe be fed back into the glossaries/manual.

natashafairbairn commented 1 year ago

Hi Paul,

Answers to your further questions within your email below.

Best, Natasha

From: Paul Duchesne. Sent: 16 February 2023 10:38. Subject: Re: [FIAF/modelling-workshops] Vocabularies (Issue #9)

Person Group also needed for instances of music groups, which may be linked to Works as either cast or credits, or if a Music Works database also exists in an institution’s systems with links to Moving Image Works. We have “Person”, “Organisation” and “Group” designations on records in our names database (also Animal – e.g. name record for Scruffy – a dog featuring in and credited as playing the role “Patsy the dog” in the 1937 film Storm in a Teacup – but let’s not complicate things).

Thank you for the clarification. Can I ask about "family", is this meant as a literal family?

Yes – I think its definition will be as in FRAD (Functional Requirements for Authority Data), see section 4.2 (p.18): https://www.ifla.org/wp-content/uploads/2019/05/assets/cataloguing/frad/frad_2013.pdf

Another concern was that "Corporate Body" would not include informal, non-corporate organisations (eg artist collectives, musical groups) so it makes sense that "Person Group" can cover those entities.

I strongly suspect these were all straight from FRBR/FRAD terms and concepts.

We could consider “Collective Agent”, which would cover corporate and non-corporate bodies or organisations, and is what LRM (Library Reference Model) uses. The attributes of a Corporate Body in FRAD do actually include all organisations and music groups, but to me the term is misleading and makes you think of corporations; “Collective Agent” is better and makes more sense. You can then further define by type of collective agent within an actual record for the name (e.g. we will attach types such as production company, distributor, broadcast company, educational body, club, charity, studio, etc. to our Organisation name records). Our third category of Group is only ever used for music groups and orchestras.

See bottom of section 4.3 (p.20) here https://www.ifla.org/wp-content/uploads/2019/05/assets/cataloguing/frad/frad_2013.pdf

Also, on “Collective Agent”: FRBR originally defined a corporate body as “an organization or group of individuals and/or organizations acting as a unit,” as shown in table 7. The second subclass of “Agent” in LRM is “Collective Agent.” LRM’s definition of “Collective Agent” is like the FRBR definition for Corporate Body, but it further defines the entity as “a gathering or organization of persons bearing a particular name,” emphasizing the group acting as a unit is a named group. What distinguishes a “Collective Agent” from a gathering of people is that the name “must be a specific name and not just a generic description for the gathering.” While families and corporate bodies are no longer LRM entities, the scope notes for “Collective Agent” explain that they are specific types that “may be relevant in a particular bibliographic application.” This explanation is followed by RDA in its use of “Collective Agent.” Its definition in the RDA Beta toolkit is similar to the LRM definition, but also defines the entity as an entity super-type with two entity sub-types, Family and Corporate Body. The RDA definition for Corporate Body requires the group of persons or organizations to be identified by a name, just as LRM does for “Collective Agent.” The other sub-type, Family, matches the original FRAD definition.

https://journals.ala.org/index.php/lrts/article/view/7345/10100
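To make the LRM/RDA arrangement described above a little more concrete, here is a minimal sketch of an agent record with "Person" and "Collective Agent" as the top-level types, and "Family" and "Corporate Body" as optional sub-types of Collective Agent. The class and field names are assumptions made for illustration, not something defined by LRM, RDA or this ontology.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class AgentType(Enum):
    PERSON = "Person"
    COLLECTIVE_AGENT = "Collective Agent"


class CollectiveAgentType(Enum):
    # The two sub-types named in the RDA description quoted above.
    FAMILY = "Family"
    CORPORATE_BODY = "Corporate Body"


@dataclass(frozen=True)
class Agent:
    name: str
    agent_type: AgentType
    # Optional refinement; only meaningful for collective agents.
    collective_type: Optional[CollectiveAgentType] = None

    def __post_init__(self):
        # A collective agent must bear a specific name, so reject empty names.
        if not self.name.strip():
            raise ValueError("an agent needs a specific name")
        if self.agent_type is AgentType.PERSON and self.collective_type is not None:
            raise ValueError("a Person cannot carry a collective sub-type")


# Persons credited in combination fit as a collective agent without a sub-type.
coens = Agent("Coen Brothers", AgentType.COLLECTIVE_AGENT)
print(coens.agent_type.value)
```

Leaving the sub-type optional lets informal groups (artist collectives, music groups) be recorded as collective agents without forcing them into "Corporate Body".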

BFI uses Award/Nomination, Production (which is where locations information and production dates can be found), Publication (e.g. screenings at BFI Southbank cinemas*...

Could I ask if you also use the event vocabularies (for example the "D.11 Production Event Types")? As mentioned I realise there is a bit of overlap between the Manifestation type (going with our current definition, often linked closely to a discrete event) and the event itself.

No we don’t use any of those, largely because we are highly unlikely to have the information in most cases.

For Production Event records we have basic Locations, status, start and end dates, and notes fields.

But Events is an area that we haven’t really developed – it contains fields that matched migrating data from older systems.

All our Events records have a General Event Type and a Specific Event Type field, e.g. general event = Publication, specific event = NFT screening. I can forward over what features in these terms lists if useful.
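A two-level arrangement like this could be sketched as below. The general types are the top-level values the manual lists under D.4, and the only specific type included is the "NFT screening" example given above; the dictionary layout and the `validate_event` helper are hypothetical and do not describe how the BFI system is actually built.

```python
# Top-level event types as listed in the manual (D.4).
GENERAL_EVENT_TYPES = {
    "Publication",
    "Award/Nomination",
    "Production",
    "Rights/Copyright/Registration",
    "Preservation",
    "Decision",
    "Manufacture",
    "Inspection",
    "Acquisition",
}

# Specific type -> the general type it belongs to.
# Only the pairing mentioned in this thread is included; real lists would be longer.
SPECIFIC_EVENT_TYPES = {
    "NFT screening": "Publication",
}


def validate_event(general, specific=None):
    """Return True if the general type is known and the specific type, if given, sits under it."""
    if general not in GENERAL_EVENT_TYPES:
        return False
    if specific is None:
        # Supporting only the top level is allowed.
        return True
    return SPECIFIC_EVENT_TYPES.get(specific) == general


print(validate_event("Publication", "NFT screening"))
```

Because the specific type is optional, the same check works whether an institution records only the top level or the whole tree.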

Re. all the different terms/values featured in section D of the FIAF Moving Image Cataloguing – it is important to remember that these are purely examples and not full lists.

I see this as being a key strategic question for moving forward. My initial thought was that the first iteration should be without vocabularies, but I am now thinking that maybe a good way forward is to use these example vocabularies as a starting point, take them out into the world and inevitably iterate, and developments could also maybe be fed back into the glossaries/manual.

ladislav-nfa commented 1 year ago

This is our proposal for controlled vocabulary lists for our new system (just for some elements). The Item Element typology is perhaps the only elaborated CV in the table. Lists often employ hierarchical ordering (such as negative / original negative / original picture negative) - see the Parent column. Any comments on the Item Element typology or other parts will of course be highly appreciated.
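For readers unfamiliar with this kind of layout, a Parent column can be resolved into a full hierarchy path roughly as follows. The three terms are the example given above; the dictionary representation and the function name `path_to_root` are assumptions for illustration and are not taken from the attached table.

```python
# Term -> parent term, mirroring a "Parent" column; None marks a top-level term.
PARENT = {
    "negative": None,
    "original negative": "negative",
    "original picture negative": "original negative",
}


def path_to_root(term, parent=PARENT):
    """Return the hierarchy path from the top-level term down to the given term."""
    path = []
    seen = set()
    while term is not None:
        if term in seen:
            # Guards against a list where a term ends up as its own ancestor.
            raise ValueError(f"cycle detected at {term!r}")
        if term not in parent:
            raise KeyError(f"unknown term {term!r}")
        seen.add(term)
        path.append(term)
        term = parent[term]
    return list(reversed(path))


print(" / ".join(path_to_root("original picture negative")))
# prints: negative / original negative / original picture negative
```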

natashafairbairn commented 1 year ago

Re. Carrier: "No vocabularies expressed."

The ones we currently have are largely in the fields "Executor", "Authoriser" and "Contact", to do with location movement and quality. These hold the names of certain internal BFI staff who work with the physical Items, and the names sit in a Thesaurus. We are also envisioning a controlled list in a thesaurus for "movement method". However, we haven't really developed our Carrier records fully and there is no pre-existing data to migrate into the few fields mentioned above that would contain controlled vocabularies.

ladislav-nfa commented 1 year ago

And there are controlled vocabularies in our current old cataloguing system here (slightly inconsistent, as you can see, due to 30 years of extending it in very old COBOL software).

paulduchesne commented 1 year ago

All cataloguing manual (and ISO) vocabularies have now been added, so I will close this ticket.