TEIC / TEI

The Text Encoding Initiative Guidelines
https://www.tei-c.org

Update definition of <graphic> #1439

Closed Martin-de-la-Iglesia closed 8 years ago

Martin-de-la-Iglesia commented 8 years ago

The current definition of <graphic> is: "indicates the location of an inline graphic, illustration, or figure", but even in the Guidelines there are examples of <graphic> being used for facsimile pages. Therefore, the word "inline" in the definition no longer applies. The definition should be changed to something like: "indicates the location of an inline or other graphic, illustration, or figure."

See also the discussion on TEI-L.

lb42 commented 8 years ago

All such cases -- where the specification in one part of the Guidelines seems to contradict what is recommended or demonstrated in another part -- should be regarded as errors and reported as issues for discussion. Thanks for doing so here: clearly the description for <graphic> was not updated when the recommendation for using it within <facsimile> was made, nor when this was subsequently extended to <sourceDoc>. The meaning of a <graphic> appearing as a child of these elements is subtly different from that of one appearing as a child of <text>: only in the latter case can the graphic be considered as providing a constituent component of the document, rather than a version of some part of it. Probably we should add a comment to that effect somewhere.

lb42 commented 8 years ago

I changed the desc to the following: "indicates the location of a graphic or illustration, either forming part of a text, or providing an image of it." and added a cross reference to the section of PH where <facsimile> and friends are discussed. I also added the following note: "Within the body of a text, a <graphic> element indicates the presence of a graphic component in the source itself. Within a <facsimile> or <sourceDoc> element, however, a <graphic> element provides an additional digital representation of some part of the source being encoded."
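The distinction drawn in this note can be sketched in a few lines of Python. This is only an illustration, not part of the Guidelines: the fragment omits the TEI namespace for brevity, the file names are invented, and the role labels `image-of-source` and `part-of-text` are hypothetical names chosen here.

```python
import xml.etree.ElementTree as ET

# Namespace-less fragment for brevity; real TEI files use the TEI namespace.
# The url values are invented for this sketch.
DOC = """
<TEI>
  <facsimile>
    <surface>
      <graphic url="page1.png"/>
    </surface>
  </facsimile>
  <text>
    <body>
      <figure>
        <graphic url="woodcut.png"/>
      </figure>
    </body>
  </text>
</TEI>
"""


def graphic_roles(root):
    """Map each graphic's url to a (hypothetical) role label."""
    roles = {}

    def walk(elem, context):
        # Track the nearest enclosing <facsimile>, <sourceDoc> or <text>.
        if elem.tag in ("facsimile", "sourceDoc"):
            context = "image-of-source"
        elif elem.tag == "text":
            context = "part-of-text"
        if elem.tag == "graphic":
            roles[elem.get("url")] = context
        for child in elem:
            walk(child, context)

    walk(root, None)
    return roles


roles = graphic_roles(ET.fromstring(DOC))
print(roles)
```

The sketch treats "within" as "anywhere below", so a <graphic> nested inside <surface> still counts as being within <facsimile>.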

Martin-de-la-Iglesia commented 8 years ago

What about <graphic> within <surface> (i.e. without <facsimile>)?

lb42 commented 8 years ago

<surface> is permitted only within either <facsimile> or <sourceDoc> so this cannot happen.

Martin-de-la-Iglesia commented 8 years ago

I find it a bit unclear because "within a <facsimile> or <sourceDoc> element" can be read as covering either child nodes only or all descendants.
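The two readings give different answers for the usual <facsimile>/<surface>/<graphic> nesting. A small sketch, assuming a namespace-less fragment and an invented file name:

```python
import xml.etree.ElementTree as ET

# Invented, namespace-less fragment: the graphic is a grandchild of facsimile.
facsimile = ET.fromstring(
    '<facsimile><surface><graphic url="page1.png"/></surface></facsimile>'
)

# Reading 1: "within" means direct children of <facsimile> only.
as_children = facsimile.findall("graphic")

# Reading 2: "within" means any descendant of <facsimile>.
as_descendants = facsimile.findall(".//graphic")

print(len(as_children), len(as_descendants))  # prints: 0 1
```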

lb42 commented 8 years ago

But you don't find "within the body of a text" similarly ambiguous? Well, it's easy enough to change. How about "Within the context of a <facsimile> ..." ?

Martin-de-la-Iglesia commented 8 years ago

Or "within elements from the module for Transcription of Primary Sources"?

lb42 commented 8 years ago

Err, no. That seems much less precise! Have you seen how many elements that module provides?

Martin-de-la-Iglesia commented 5 years ago

SIG Text & Graphics have identified another problem though: multiple <graphic> elements, i.e. a structure like <surface><graphic/><graphic/></surface>, are ambiguous. It can mean either 2 representations of the same surface, or a composite surface consisting of 2 parts. The current Guidelines example

<surface>
   <graphic url="page2-highRes.png"/>
   <graphic url="page2-lowRes.png"/>
</surface>

is not helpful in this regard, because the semantics ("these are 2 images of the same surface, 1 in high resolution and 1 in low resolution") are in the file names only. We suggest extending this example by adding @exclude:

<surface>
   <graphic xml:id="a" exclude="#b" url="page2-highRes.png"/>
   <graphic xml:id="b" exclude="#a" url="page2-lowRes.png"/>
</surface>

and adding another example of a composite image in which the <graphic>s are linked with @next and @prev:

<surface>
   <graphic xml:id="a" next="#b" url="page2-top.png"/>
   <graphic xml:id="b" prev="#a" url="page2-bottom.png"/>
</surface>

(Admittedly, this still leaves open the question of how to deal with composite images in which the parts are not arranged in a clear sequence.)
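To show how a processor might act on the two proposed encodings, here is a rough Python sketch. It assumes the attribute usage suggested above (@exclude for alternates, @next/@prev for parts of a composite), which is a proposal in this thread rather than established practice. The function name `describe` and its return labels are hypothetical.

```python
import xml.etree.ElementTree as ET

# ElementTree expands the predeclared xml: prefix to this namespace.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

ALTERNATES = """
<surface>
  <graphic xml:id="a" exclude="#b" url="page2-highRes.png"/>
  <graphic xml:id="b" exclude="#a" url="page2-lowRes.png"/>
</surface>
"""

COMPOSITE = """
<surface>
  <graphic xml:id="a" next="#b" url="page2-top.png"/>
  <graphic xml:id="b" prev="#a" url="page2-bottom.png"/>
</surface>
"""


def describe(surface):
    """Return (label, urls) for the graphics directly inside a surface."""
    graphics = surface.findall("graphic")
    by_id = {g.get(XML_ID): g for g in graphics}

    # Proposed reading 1: @exclude marks mutually exclusive alternates.
    if any(g.get("exclude") for g in graphics):
        return ("alternates", [g.get("url") for g in graphics])

    # Proposed reading 2: @next/@prev chain the parts of a composite image.
    if any(g.get("next") or g.get("prev") for g in graphics):
        current = next((g for g in graphics if g.get("prev") is None), None)
        ordered, seen = [], set()
        while current is not None and id(current) not in seen:
            seen.add(id(current))  # guard against circular chains
            ordered.append(current.get("url"))
            target = current.get("next")
            current = by_id.get(target.lstrip("#")) if target else None
        return ("composite", ordered)

    # No linking attributes: the markup alone does not say which is meant.
    return ("unspecified", [g.get("url") for g in graphics])


alternates = describe(ET.fromstring(ALTERNATES))
composite = describe(ET.fromstring(COMPOSITE))
print(alternates)
print(composite)
```

Without the linking attributes the function can only report `unspecified`, which is the ambiguity described above.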

vbigot-juloux commented 5 years ago

As for cuneiform, a <surface> often has several <graphic> elements, especially when each glyph is examined individually. That level of paleographic detail is essential for understanding the meaning of a group of “words”, because, especially in Akkadian/Sumerian (i.e. with logograms), a glyph can be read in different ways. So we analyze each glyph (= several <graphic> elements) in a <surface>.

"(Admittedly, this still leaves open the question of how to deal with composite images in which the parts are not arranged in a clear sequence.)"

We also need to consider broken or missing glyphs.

jamescummings commented 5 years ago

@Martin-de-la-Iglesia Wouldn't a more straightforward solution be to have <graphic> claim membership in the att.typed class? This would give it type and subtype attributes. Our rule of thumb for att.typed is that if an element is a) repeatable and b) classifiable, then it should probably get membership if someone asks for it.

That a surface has high-res or low-res images doesn't necessarily mean that they are in exclusion to each other... that really depends on how you are using the XML file or resources generated from it.

I like @lb42's update of the desc well enough, and would also suggest we give it membership of att.typed.
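As a sketch of what att.typed membership would allow, the following Python snippet selects a graphic by its type attribute. The values `highRes` and `lowRes` are invented for illustration, `pick` is a hypothetical helper, and the fragment omits the TEI namespace.

```python
import xml.etree.ElementTree as ET

# The type values below are illustrative only; no vocabulary is proposed here.
surface = ET.fromstring("""
<surface>
  <graphic type="highRes" url="page2-highRes.png"/>
  <graphic type="lowRes" url="page2-lowRes.png"/>
</surface>
""")


def pick(surface, wanted_type):
    """Return the url of the first graphic with the given type, or None."""
    for graphic in surface.findall("graphic"):
        if graphic.get("type") == wanted_type:
            return graphic.get("url")
    return None


thumbnail = pick(surface, "lowRes")
print(thumbnail)
```

Nothing here says the two images exclude each other; the consuming application decides which one to use.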

Martin-de-la-Iglesia commented 5 years ago

I don't oppose adding <graphic> to att.typed, but isn't it the relation amongst the <graphic> elements and/or between <graphic> and <surface> that needs to be specified, rather than the individual <graphic>s?

I also still think that @exclude would be appropriate in this case, but I must say all of those att.global.linking attributes are somewhat vague and elusive to me.

lb42 commented 5 years ago

The intention behind permitting multiple <graphic>s within a single <surface> was that they would necessarily always be alternate representations of the (whole of the) surface. If they represent different parts of the surface, they should be wrapped in <zone> elements to say which part it is. So I don't think the example is ambiguous, though clearly the documentation could be improved.

Martin-de-la-Iglesia commented 5 years ago

How curious. It seems counterintuitive to me that

<surface>
   <graphic/>
   <graphic/>
</surface>

should necessarily mean something different than

<surface>
   <zone>
      <graphic/>
   </zone>
   <zone>
      <graphic/>
   </zone>
</surface>

lb42 commented 5 years ago

Well, I can only repeat: that's what the intention was. A nested <graphic> is always a representation of the 2d space defined by its parent (the same consideration applies to <zone>). Maybe we should have made it an attribute instead!
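Read this way, what a <graphic> depicts can be derived from its parent element alone. A minimal Python sketch, assuming a namespace-less fragment with invented xml:id values and file names:

```python
import xml.etree.ElementTree as ET

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Invented example: one image of the whole surface, plus one image per zone.
surface = ET.fromstring("""
<surface xml:id="s1">
  <graphic url="whole-surface.png"/>
  <zone xml:id="z1">
    <graphic url="upper-half.png"/>
  </zone>
  <zone xml:id="z2">
    <graphic url="lower-half.png"/>
  </zone>
</surface>
""")

# ElementTree keeps no parent pointers, so build a child -> parent map.
parent_of = {child: parent for parent in surface.iter() for child in parent}

# Each graphic represents the 2D space defined by its parent element.
represents = {
    g.get("url"): (parent_of[g].tag, parent_of[g].get(XML_ID))
    for g in surface.iter("graphic")
}
print(represents)
```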

vbigot-juloux commented 5 years ago

Based on your explanation, would it then be correct to encode a broken cuneiform tablet as follows:

<surfaceGrp> <!-- one tablet -->
 <surface> <!-- 1st fragment -->
    <zone>
       <graphic/> <!-- glyph1 -->
    </zone>
    <zone>
       <graphic/>  <!-- glyph2 -->
    </zone>
 </surface>
 <surface> <!-- 2nd fragment -->
    <zone>
       <graphic/>  <!-- glyph3 -->
    </zone>
 </surface>
</surfaceGrp>

lb42 commented 5 years ago

Yes. And you can indicate whereabouts on each tablet the various glyphs are too.
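One way to record where each glyph sits is with the coordinate attributes ulx, uly, lrx and lry on <zone>. The Python sketch below reads them back. The coordinate values, file names and the helper `glyph_boxes` are invented for illustration, and the fragment omits the TEI namespace.

```python
import xml.etree.ElementTree as ET

# Invented coordinates and file names; each surface is one fragment.
tablet = ET.fromstring("""
<surfaceGrp>
  <surface ulx="0" uly="0" lrx="400" lry="300">
    <zone ulx="10" uly="20" lrx="60" lry="70">
      <graphic url="glyph1.png"/>
    </zone>
    <zone ulx="70" uly="20" lrx="120" lry="70">
      <graphic url="glyph2.png"/>
    </zone>
  </surface>
  <surface ulx="0" uly="0" lrx="200" lry="300">
    <zone ulx="15" uly="25" lrx="65" lry="75">
      <graphic url="glyph3.png"/>
    </zone>
  </surface>
</surfaceGrp>
""")


def glyph_boxes(surface_grp):
    """List (fragment number, url, bounding box) for every glyph image."""
    boxes = []
    for index, surface in enumerate(surface_grp.findall("surface"), start=1):
        for zone in surface.findall("zone"):
            # Box is (upper-left x, upper-left y, lower-right x, lower-right y).
            box = tuple(int(zone.get(k)) for k in ("ulx", "uly", "lrx", "lry"))
            for graphic in zone.findall("graphic"):
                boxes.append((index, graphic.get("url"), box))
    return boxes


boxes = glyph_boxes(tablet)
for entry in boxes:
    print(entry)
```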