TEIC / TEI

The Text Encoding Initiative Guidelines
https://www.tei-c.org

msIdentifier constraint should test node instead of its name; and does it make sense? #2258

Open sydb opened 2 years ago

sydb commented 2 years ago

The "msId_minimal" test in Source/Specs/msIdentifier.xml reads

not(parent::tei:msPart) and (local-name(*[1])='idno' or local-name(*[1])='altIdentifier' or normalize-space(.)='')

That is a bit confusing, and the use of local-name() is generally frowned upon when the node itself can just be tested:

not(parent::tei:msPart) and (child::*[1][self::tei:idno|self::tei:altIdentifier] or normalize-space(.) eq '')

But, more importantly, the relationship between that test and the content of the report it appears on seems a bit off. The report content is “An msIdentifier must contain either a repository or location”. So if the 1st child is an <idno>, that means there are no <placeName>, <bloc>, <country>, <region>, <settlement>, <district>, <geogName>, <institution>, <repository>, or <collection> elements, so it makes sense to complain that there is no repository or location. But why is this message not issued when the first child is a <collection>? A collection is neither a location nor a repository, is it? For that matter, this message is not issued if the 1st and only child element is an <msName> or <objectName>, as long as it has content. And what is this business about there being no location if there is no textual content? The following <msIdentifier> is, IMHO, problematic, not because it lacks a repository or location, but rather because that’s all it has:

   <msIdentifier>
     <placeName ref="#CambridgeMA"/>
     <repository ref="#Houghton"/>
   </msIdentifier>

And the fact that mss identifiers inside <msPart> are exempted does not seem to be explained anywhere. (I, for one, don’t get it, either. But I am trying to work on #2188 now, so have not really looked into this; but since I do not study manuscripts, I might not understand the explanation if I found it.)

lb42 commented 2 years ago

The work group was quite keen on the idea that a minimal msIdentifier might only tell you vaguely where to find the thing, citing cases where a collection (or institution) has only one or two mss.

sydb commented 2 years ago

OK, thanks @lb42. That explains why

   <msIdentifier>
     <placeName ref="#SmallvilleKS"/>
     <repository ref="#LangLib"/>
   </msIdentifier>

might be a perfectly acceptable minimal manuscript identifier. But it does not explain why the error “An msIdentifier must contain either a repository or location” should be issued for it, but not for

   <msIdentifier>
     <msName>Notes for Laura Lang’s graduation speech, May 1977</msName>
   </msIdentifier>

which does not actually contain a repository or location. Note that this message is appropriately not issued for

   <msIdentifier>
     <placeName>Smallville, KS</placeName>
     <repository>Laura Lang Memorial Library</repository>
   </msIdentifier>
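
To make the comparison above easy to re-check, here is a minimal sketch that emulates the "msId_minimal" test in plain Python (standard library only). It is not the TEI Schematron itself: the function name msid_minimal_fires and the constant names are made up for illustration, and the parent element is passed in by hand because ElementTree keeps no parent pointers. A return value of True means the report message would be issued.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"


def local_name(elem):
    """Return the element name without its namespace, like XPath local-name()."""
    return elem.tag.rsplit("}", 1)[-1]


def msid_minimal_fires(ms_identifier, parent=None):
    """Emulate the msId_minimal test (hypothetical helper, not TEI code)."""
    # not(parent::tei:msPart)
    if parent is not None and parent.tag == f"{{{TEI_NS}}}msPart":
        return False
    children = list(ms_identifier)
    # local-name(*[1])='idno' or local-name(*[1])='altIdentifier'
    first_is_id = bool(children) and local_name(children[0]) in ("idno", "altIdentifier")
    # normalize-space(.)='' : no non-whitespace text anywhere inside the element
    is_blank = "".join(ms_identifier.itertext()).strip(" \t\r\n") == ""
    return first_is_id or is_blank


# The three examples from this comment, with the TEI namespace declared.
REFS_ONLY = f"""<msIdentifier xmlns="{TEI_NS}">
  <placeName ref="#SmallvilleKS"/>
  <repository ref="#LangLib"/>
</msIdentifier>"""

NAME_ONLY = f"""<msIdentifier xmlns="{TEI_NS}">
  <msName>Notes for Laura Lang’s graduation speech, May 1977</msName>
</msIdentifier>"""

WITH_TEXT = f"""<msIdentifier xmlns="{TEI_NS}">
  <placeName>Smallville, KS</placeName>
  <repository>Laura Lang Memorial Library</repository>
</msIdentifier>"""

print(msid_minimal_fires(ET.fromstring(REFS_ONLY)))  # True: message issued
print(msid_minimal_fires(ET.fromstring(NAME_ONLY)))  # False: no message
print(msid_minimal_fires(ET.fromstring(WITH_TEXT)))  # False: no message
```

Under these assumptions the message is issued only for the first example, the one that does name a place and a repository (via @ref), which is the mismatch described above.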

martinascholger commented 2 years ago

Council at F2F:

1) We should improve the prose of the Schematron message. For example, this message (“An msIdentifier must contain either a repository or location.”) fires if we introduce an empty <collection>, when just adding content to the element would make it valid.
2) The test shouldn’t require the elements to be non-empty (as SB showed, examples with @key are perfectly valid).
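
One possible reading of point 2, sketched in plain Python (standard library only) purely for discussion: replace the "no textual content" condition with "no child elements at all", and leave the rest of the test alone. This is an assumption about what a relaxed test could look like, not a Council decision and not the eventual Schematron; the function name relaxed_fires is hypothetical, and the parent is passed in by hand because ElementTree keeps no parent pointers.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"


def relaxed_fires(ms_identifier, parent=None):
    """Hypothetical relaxed check; True means a message would be issued."""
    # msIdentifier inside msPart stays exempt, as in the current test.
    if parent is not None and parent.tag == f"{{{TEI_NS}}}msPart":
        return False
    children = list(ms_identifier)
    # Assumed replacement for normalize-space(.)='': no child elements at all.
    if not children:
        return True
    # Unchanged part of the test: the first child is an idno or altIdentifier.
    first = children[0].tag.rsplit("}", 1)[-1]
    return first in ("idno", "altIdentifier")


empty_collection = ET.fromstring(
    f'<msIdentifier xmlns="{TEI_NS}"><collection/></msIdentifier>'
)
no_children = ET.fromstring(f'<msIdentifier xmlns="{TEI_NS}"/>')

print(relaxed_fires(empty_collection))  # False: an empty element no longer triggers the message
print(relaxed_fires(no_children))       # True: nothing identifies the manuscript
```

With this variant, empty elements that carry only attributes such as @key or @ref would no longer trigger the message; whether the first-child condition should also change is left open here.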

@HelenaSabel and @sydb will review the schema of this element.

jamescummings commented 1 year ago

What if the Schematron message just said something vague like "msIdentifier needs to have identifying information"? I think it isn't about specific elements, but about something being there. msPart is a special case, presumably, because it is a part of an msDesc that is required to have an identifier.