TEIC / TEI

The Text Encoding Initiative Guidelines
https://www.tei-c.org

msContents/msItem should be replaced in tei:object with something non-MS specific #1851

Open jamescummings opened 5 years ago

jamescummings commented 5 years ago

Although the elements msContents, msItem, and msItemStruct have been updated with regard to their use in the object element to say 'or any object', they perhaps should be replaced with something more generalised for object description.

I don't have a clear proposal at the moment for what this should be, but I'm putting in this issue as a reminder to develop one.

jamescummings commented 5 years ago

Thinking about this, what we need is a way to describe conceptually the intellectual contents of something, one that works for a variety of cases: a table of contents, a list of msItems, nested descriptions of the mosaics on the sides of a building (cf. the UNAM Central Library), and the artistic contents of a triptych (all panels, all sides).

jamescummings commented 4 years ago

On 2019-07-25 I sent the following email to TEI-L to encourage further comments from the community.

===


The creation of the <object> element (and <listObject>, <objectName>, <objectIdentifier>) for describing physical objects was based very much on the <msDesc> element (this being a very specific type of object) in an earlier release. At the time of that release (3.5.0 in January 2019) some elements related to manuscript description were changed to add the phrase 'or other object' into their descriptions to enable their use inside the description of a non-manuscript (or even non-text-bearing) object. (Though, to be honest, TEI has long encouraged the use of manuscript description for other text-bearing objects of sufficient concern to have a detailed description, regardless of whether they are actually manuscripts.)

While some elements like <msIdentifier> were replaced by <objectIdentifier>, the <msContents> and <msItem> elements were kept (though with the 'or other object' in their descriptions) so that we could gather more opinions about what an <msContents> for physical objects should include. In many ways <msContents> acts as a table of contents for the intellectual contents of a manuscript, so a similar element for objects would need to encode the same kind of information.

Some examples used in the Guidelines of objects included the Mask of Tutankhamun, the Alfred Jewel, Excalibur, etc. And one that I've been using to think about another issue is a building, the mosaic-covered Central Library at UNAM https://en.wikipedia.org/wiki/Central_Library_(UNAM). 

Whatever this element might be called (<contents>? <objectContents>?) maybe only needs a list of <item>s inside it? Maybe there are better ways to describe the individual portions/sections/items of an object and their intellectual contents? In raising this issue https://github.com/TEIC/TEI/issues/1851 I wanted to involve others interested in the documentation of objects to comment (here or even better on the github issue) to get a wider range of viewpoints. So if you have any thoughts on describing the intellectual contents of objects, I'd be interested to hear them.

Many thanks,
James 

--
Dr James Cummings, James.Cummings@newcastle.ac.uk
Senior Lecturer in Late-Medieval Literature and Digital Humanities
School of English, Newcastle University

===
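
A minimal sketch of the simplest option floated in the email, a contents element that holds nothing but a plain list. Note that <objectContents> is only a candidate name from the email and is not an existing TEI element, the bracketed text is placeholder content, and the choice of one item per façade of the UNAM Central Library is an assumption made purely for illustration.

```xml
<!-- Hypothetical: <objectContents> is a proposed name, not a current TEI element -->
<objectContents>
  <list>
    <!-- one item per façade; the @n values are illustrative labels only -->
    <item n="north">[intellectual contents of the north façade mosaic]</item>
    <item n="south">[intellectual contents of the south façade mosaic]</item>
    <item n="east">[intellectual contents of the east façade mosaic]</item>
    <item n="west">[intellectual contents of the west façade mosaic]</item>
  </list>
</objectContents>
```

Whether a flat list like this is enough, or whether each portion needs its own structured description, is exactly the open question the email raises.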

richardofsussex commented 4 years ago

I put this to the MCG list. One respondent (Paul Vetch) said:

There’s already been a high level agreement to harmonise CIDOC-CRM and TEI - I’d have thought the solution here would lie in that direction.

Another (Rupert Shepherd) said:

Is it worth pointing our TEI colleagues towards how this is conceptualised in Spectrum, or object cataloguing standards like CCO?

For example, Spectrum lists the following kinds of content that might be described for an object:

· Content – activity
· Content – concept
· Content – date
· Content – description
· Content – event name
  o Content – event name type
· Content – language
· Content – note
· Content – object
  o Content – object type
· Content – organisation (Org)
· Content – other
  o Content – other type
· Content – people (Peo)
· Content – person (Per)
· Content – place (Pla)
· Content – position
· Content – script

Presumably, some of these map to existing TEI elements, and could be added to the list of possible children of an element. But, as noted in the email below, there’s also a question about whether you might want to sub-divide the objects into different aspects/parts/components, each of which has a different content …

Comment added by Richard Light

jamescummings commented 4 years ago

Hi Richard, that is very helpful, thank you. I think most of those map to TEI elements fairly cleanly. (With 'concept' maybe being less clear here, remembering these are physical objects, not conceptual ones.) High-level parity with CIDOC CRM and Spectrum seems desirable to me. I am less familiar with Spectrum, so I will try to investigate it some more.

jamescummings commented 3 years ago

I've been starting to think again about how to create an <objectContents> to replace msContents/msItem. msContents is basically the 'table of contents' of the manuscript being described. So it lists the works inside it bibliographically.

1) Replacing msItem: I have been trying to see if there is a way to do without a specialised <objectItem> element or something like that. <msItem> should be seen as a specialisation of whatever is available under similar structures in <objectContents>, so I've started by looking at whether it could be replaced merely by a list of items. The content model of msItem is currently:

<content>
 <sequence>
  <alternate minOccurs="0" maxOccurs="unbounded">
   <elementRef key="locus"/>
   <elementRef key="locusGrp"/>
  </alternate>
  <alternate>
   <classRef key="model.pLike" minOccurs="1" maxOccurs="unbounded"/>
   <alternate minOccurs="1" maxOccurs="unbounded">
    <classRef key="model.titlepagePart"/>
    <classRef key="model.msItemPart"/>
    <classRef key="model.global"/>
   </alternate>
  </alternate>
 </sequence>
</content>

Generally this is an optional locus (or locusGrp) at the start, followed by either paragraphs or structured content. I was hoping that all of this would be available in <item> already, and that'd be job done, except this isn't the case. If you compare the 'may contain' sections for item and msItem, you'll see that item allows more elements and includes everything msItem does except the elements coming from model.titlepagePart and model.msItemPart.

Now, those provided by the core and header modules are all available inside item/bibl, and because those are specifically bibliographic, one could just recommend that, where necessary, the bibliographic metadata is wrapped together in a bibl for each item.
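
A minimal sketch of that recommendation, using only existing elements (list, item, bibl, author, title). The bracketed values are placeholders and the @n values are illustrative; nothing here is taken from a real description.

```xml
<list>
  <item n="1">
    <!-- bibl groups the bibliographic metadata for this one item -->
    <bibl>
      <author>[author of the first work]</author>
      <title>[title of the first work]</title>
    </bibl>
  </item>
  <item n="2">
    <bibl>
      <author>[author of the second work]</author>
      <title>[title of the second work]</title>
    </bibl>
  </item>
</list>
```

This covers the bibliographic case, but it gives no access to elements such as incipit or explicit, which is the gap discussed next.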

However, what this doesn't allow is the titlepagePart or msItemPart elements. While there are some nonsensical roundabout ways of getting to those elements inside an item, I don't think we'd want to be recommending them. Also, it doesn't provide for a locus at the start in the way msItem does. I don't know if that is necessary for other objects, but the idea is that the msItem starts by saying which bit of the object it applies to. So it might say which side of a building or wall, or which face of an object. I'd be interested in ways of solving this, but am loath to suggest making the members of titlepagePart and msItemPart available in the content model of item. The tension here is between sensible constraint and chaotic entropy.

If there was a new element then <objectItem> seems an obvious name and presumably it would have a similar content model to msItem but include list and listBibl (but not model.listLike since that gets all the other lists). This would enable ad hoc lists of 'contents' of an object of any form without preconceptions as to its textual nature. objectItem might have a content model something like:

<content>
 <sequence>
  <alternate minOccurs="0" maxOccurs="unbounded">
   <elementRef key="locus"/>
   <elementRef key="locusGrp"/>
  </alternate>
  <alternate>
   <classRef key="model.pLike" minOccurs="1" maxOccurs="unbounded"/>
   <alternate minOccurs="1" maxOccurs="unbounded">
    <classRef key="model.titlepagePart"/>
    <classRef key="model.msItemPart"/>
    <classRef key="model.global"/>
    <elementRef key="objectItem"/> <!-- nesting is required -->
    <!-- list and listBibl are added directly rather than via model.listLike, to exclude all the other lists -->
    <elementRef key="list"/>
    <elementRef key="listBibl"/>
   </alternate>
  </alternate>
 </sequence>
</content>

While I wanted to avoid introducing a new element, I think corrupting item or bibl further would just add to the chaotic entropy.
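
To make the proposal concrete, here is a hypothetical instance of the objectItem model sketched above. <objectItem> does not exist in TEI; the nesting, the use of locus to name a part of the object, and all bracketed values are assumptions for illustration only.

```xml
<!-- Hypothetical: <objectItem> is a proposed element, not part of current TEI -->
<objectItem n="1">
  <!-- locus says which part of the object this item applies to -->
  <locus>[north façade]</locus>
  <title>[label for this part of the object]</title>
  <!-- nested objectItem elements describe sub-sections of that part -->
  <objectItem n="1.1">
    <locus>[upper register]</locus>
    <p>[description of the intellectual contents of the upper register]</p>
  </objectItem>
  <objectItem n="1.2">
    <locus>[lower register]</locus>
    <p>[description of the intellectual contents of the lower register]</p>
  </objectItem>
</objectItem>
```

The outer item uses the structured branch of the model (title plus nested items), while the inner items use the paragraph branch; the model as drafted does not allow the two to be mixed within one objectItem.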

2) Replacing msContents: This seems more straightforward. The current content model of msContents is:

<content>
 <alternate>
  <classRef key="model.pLike" minOccurs="1" maxOccurs="unbounded"/>
  <sequence>
   <elementRef key="summary" minOccurs="0"/>
   <elementRef key="textLang" minOccurs="0"/>
   <elementRef key="titlePage" minOccurs="0"/>
   <alternate minOccurs="0" maxOccurs="unbounded">
    <elementRef key="msItem"/>
    <elementRef key="msItemStruct"/>
   </alternate>
  </sequence>
 </alternate>
</content>

This basically means that you have a choice between one or more paragraphs and more structured information, which starts with an optional summary, textLang, and titlePage before any msItems. Now, as we've seen above, msItem might be replaced by objectItem (msItem and msItemStruct would not be available here; use msContents in those cases). I'd also be tempted to allow listBibl here as well.

<content>
 <alternate>
  <classRef key="model.pLike" minOccurs="1" maxOccurs="unbounded"/>
  <sequence>
   <elementRef key="summary" minOccurs="0"/>
   <elementRef key="textLang" minOccurs="0"/>
   <elementRef key="titlePage" minOccurs="0"/>
   <alternate minOccurs="0" maxOccurs="unbounded">
    <elementRef key="objectItem"/>
    <elementRef key="list"/>
    <elementRef key="listBibl"/>
   </alternate>
  </sequence>
 </alternate>
</content>
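
A hypothetical instance of this objectContents model, using a triptych as the object. Neither <objectContents> nor <objectItem> exists in TEI at present, and the panel labels and bracketed values are placeholders chosen only to show the shape.

```xml
<!-- Hypothetical: <objectContents> and <objectItem> are proposed elements -->
<objectContents>
  <summary>[one-sentence overview of what the triptych depicts]</summary>
  <objectItem n="1">
    <locus>[left panel, front]</locus>
    <p>[artistic contents of the left panel]</p>
  </objectItem>
  <objectItem n="2">
    <locus>[centre panel, front]</locus>
    <p>[artistic contents of the centre panel]</p>
  </objectItem>
  <objectItem n="3">
    <locus>[right panel, front]</locus>
    <p>[artistic contents of the right panel]</p>
  </objectItem>
  <listBibl>
    <bibl>[published description of the triptych]</bibl>
  </listBibl>
</objectContents>
```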

JanelleJenstad commented 3 years ago

Discussed at SVF2F of full council. Next step is to work up some examples, specifically where one would want an intellectual table of contents. Also examples of objects containing objects.

JanelleJenstad commented 3 years ago

This is "go" for developing examples to help decide on whether we need new elements. Do not merge pull requests yet.

jamescummings commented 2 years ago

In order to progress this, should we solicit examples from TEI-L of objects needing an <objectContents> with nested <objectItem>s?

JanelleJenstad commented 2 years ago

Yes we should.

schassan commented 2 years ago

Jumping in after some discussions we had yesterday at the conference, and some within the project group of the Handschriftenportal: I would add that the type of content needs to be distinguished. I would think that, even beyond the content of a manuscript, at least textual contents, artistic contents (illuminations, decoration, etc.), and musical contents could be distinguished. There may be more, but these are the ones we came across in the scope of manuscript descriptions.

Within the Handschriftenportal we are thinking about a structure that enables cataloguers (i.e. anyone who describes any kind of object) to group these aspects, because they may appear simultaneously. Thus, we are about to implement the following structure:

element msItem { attributes here, ( ( note[@type='text'] | note[@type='music'] | decoNote[@type='content'] )+ | decoNote[@type='form']* ) }

Three elements will be used to describe the above-mentioned types of content, but, especially within the description of an illuminated manuscript, the cataloguer should be able to describe aspects that belong to the physical appearance alongside the content-related aspects. In "normal" descriptions these would have had their place within physDesc/decoDesc.

I will bring forward these ideas in the ms-sig but a more abstract definition of how to describe any object probably also should take these thoughts into account.
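
As a rough illustration of the first branch of that structure (the three content-related types), an msItem might look like the sketch below. It uses only existing TEI elements; the @type values come from the pattern above, while the locus values and bracketed text are placeholders, not data from the Handschriftenportal.

```xml
<msItem n="1">
  <!-- placeholder folio range -->
  <locus from="1r" to="12v">ff. 1r-12v</locus>
  <!-- textual, musical and artistic content described side by side -->
  <note type="text">[description of the textual content]</note>
  <note type="music">[description of the musical content]</note>
  <decoNote type="content">[subject matter of the illumination]</decoNote>
</msItem>
```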

larkvi commented 1 year ago

The discussion at the msDesc SIG meeting at TEI2022 agreed that there were a variety of issues with identifying the ontological characteristics of parts of a manuscript that may have been produced or circulated separately but were then bound together, specifically bindings and endleaves. While @schassan suggested that these might be modelled as msParts, there was no general agreement on the specific way to model them. It seems to me that this is an extension of this issue and that bindings, including current and former bindings, might reasonably be modelled as objects.

Similarly, for the Charters Encoding Initiative/Monasterium/DiDip (@gvogeler), we would like to see the ontological relationship of seals to their associated diploma/manuscript reflect their status as separate items which might be fully described in and of themselves in the TEI, then associated through binding to the object as part of a singular charter which is being catalogued as one item. Again, object seems like the appropriate terminology to me, but we would like to use whatever level of item is being used for bindings and other Sammelbände items, to represent their similar ontological characteristics.

A specific example of the utility of seals being separate objects within the markup of a charter is the case when valid seals have been cut off of their original documents and attached to a forgery. A famous case of this is the Privilegium maius, establishing the rank of "Arch-Duke" for the Hapsburg Dukes of Austria.
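
One possible shape for this, offered only as a sketch: a seal encoded as an object nested inside the object for its charter. This assumes that the current content model of object permits nested object children after the identifier, which should be checked against the Guidelines; the xml:id and type values are hypothetical, and the bracketed text is placeholder content.

```xml
<listObject>
  <!-- hypothetical identifiers and type values, for illustration only -->
  <object xml:id="charter-example" type="charter">
    <objectIdentifier>
      <objectName>[name or shelfmark of the charter]</objectName>
    </objectIdentifier>
    <!-- the seal is a fully describable object in its own right -->
    <object xml:id="seal-example" type="seal">
      <objectIdentifier>
        <objectName>[name of the seal]</objectName>
      </objectIdentifier>
      <p>[description of the seal, including whether it was originally attached to a different document]</p>
    </object>
  </object>
</listObject>
```

Nesting expresses the current physical association only. A seal that was cut from one document and attached to another would also need some way to record the earlier association, which this sketch does not attempt.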