Distinguishing manuscript shelfmarks from digital object identifiers

eu-genia commented 6 months ago

We should develop a way to distinguish, and map differently, IDs of manuscripts and IDs of image sets, and possibly introduce type and relation to describe the situation.

E.g. we have cases of one manuscript photographed several times, in which case it is fully justified to have one shelfmark for the manuscript and a distinct ID for each set of images. But these IDs and their types must be clearly distinguished from the cases where the same set of images has multiple IDs and URIs (as with EAP and EMDA, or the Gunda Gunde mss at Toronto and at HMML, or EMIP, etc.).

For manuscripts still in Ethiopia, in the records I have been polishing I have been doing e.g. the following:

<msIdentifier>
   <repository ref="INS0005Gund"/>
   <idno>C3-IV-156</idno>
   <altIdentifier>
      <repository ref="INS0004HMML"/>
      <idno>GG 00006</idno>
   </altIdentifier>
   <altIdentifier>
      <idno>GG-006</idno>
   </altIdentifier>
   <altIdentifier>
      <idno>Schneider, no. 14</idno>
   </altIdentifier>
   <altIdentifier>
      <idno>no. B12</idno>
   </altIdentifier>
   <altIdentifier>
      <repository>Toronto Digital Library</repository>
      <idno>61220/utsc35354</idno>
   </altIdentifier>
</msIdentifier>

and then also

<facsimile>
   <graphic url="https://w3id.org/vhmml/readingRoom/view/623103"/>
</facsimile>
<facsimile>
   <graphic url="https://ark.digital.utsc.utoronto.ca/ark:61220/utsc35354"/>
</facsimile>

that is, as visible from this case:
(1) the local shelfmark, where available
(2) the institutional shelfmark (if available; in this case it is also a digital identifier)
(3) any known catalogue/handlist etc. numbers
(4) other shelfmarks attached to the MS
(5) other digital identifiers, if they exist
(6) links to all image sets available online

While this does collect all identifiers in one place, it does not distinguish between shelfmarks of a manuscript and IDs of images. Where there are multiple image sets, it also does not distinguish between the same set of images and different sets of images.

In an exchange with Ted Erho, he suggested:

PS - but as for the main identifiers, I would consistently use the following schema throughout BM
1) Any official library shelfmark is to be preferred, but this will really be limited to institutional collections in and outside the Horn of Africa (so, e.g. IES MS XXX, and not IES XXXXX)
2) Any 1980s inventory number with the three part identifier with -IV- in the middle if available
3) The first or best known image set identifier of a manuscript

which is indeed more or less what I have been doing, but we need to develop a way (adjust the schema) to distinguish the idno types and relations. (As of now we have not used MS as part of the shelfmark unless imposed by the library.)

Any ideas and suggestions are welcome.

@abausi @thea-m @DenisNosnitsin1970 @CarstenHoffmannMarburg

eu-genia commented 6 months ago

We also have not developed a way to provide metadata specific to images when multiple (different) image sets are available for the same manuscript.

DenisNosnitsin1970 commented 6 months ago

"2) Any 1980s inventory number with the three part identifier with -IV- in the middle if available" - we should not overestimate the usefulness of this. These are shelfmarks that were assigned irregularly, usually to only a few items out of the collection, with a purpose that is not fully clear to me. The items of the collection should receive standard shelfmarks and be more or less easily recognizable. It will be difficult to handle if all items have shelfmarks XXZZ and two start with IV-..... (you can see that this is convenient due to the fact that no modern scholar uses anything except GG-... shelfmarks as the main identifier). Older shelflists from the 80s/90s are welcome, but they cannot be used as the main document if the collection has been fully documented at a later point. However, apart from the main standard shelfmark, other identifiers can of course be used.

eu-genia commented 6 months ago

The main point of this issue is that we have identifiers of manuscripts and identifiers of image sets, which may or may not coincide. The same ms may have several surrogate identifiers, and these may refer to genuinely different surrogates (taken at different times, with different media, e.g. microfilm, digital images by different persons and at different times, scans of microfilm, photocopies, etc., in which case the different IDs are justified but must still be explained somehow) or to exactly the same one (as with the identical sets of images at EAP and at vHMML, in which case they are not really justified, but they exist and must be taken into account, with some marking of which is the standard way of referring and which is an "error"/doublet).

DenisNosnitsin1970 commented 6 months ago

Sorry, from what you posted it is difficult to understand where exactly the problem is that might concern/affect my personal work (?). But above, why <altIdentifier><repository>Toronto Digital Library</repository><idno>61220/utsc35354</idno></altIdentifier>? Is it of the same value as GG-006? B. 12 - I do not know what it is.

DenisNosnitsin1970 commented 6 months ago

Ok, I guess I see what is meant; but it is somehow clear to me that the IDs for the sets of images and the shelfmarks are not the same, and the former should not be mistaken for the latter.

eu-genia commented 6 months ago

(1) There is a manuscript still preserved in Gunda Gunde, with the shelfmark C3-IV-156 on the board and other identifiers inserted locally (such as B12 in this case).
(2) It was listed by Schneider as no. 14.
(3) There is a digital copy of the manuscript, which is not the same as the manuscript, produced by MG and EBW and called at that time GG-006.
(3a) These images are stored at HMML and catalogued by Ted Erho at vHMML, https://w3id.org/vhmml/readingRoom/view/623103, as GG 00006.
(3b) These very same images are also stored in Toronto at https://ark.digital.utsc.utoronto.ca/ark:61220/utsc35354; the digital identifier of this set (which is actually the same as 3a but stored on a different server) is listed in the metadata as 61220/utsc35354; the cataloguing metadata on the Toronto side has been produced by Witold Witakowski.
(4) In this case we have this one set of images twice, but there are also cases (Tanasee, IES, Abba Garima) where we have distinct surrogates for the same mss.

For your personal work it may become important if you get to work with manuscripts where several sets of images are available, or where the same set of images is available in different places (e.g. Romanat, which is ES but also EAP). We must decide whether, and if so how, we account for the existence of multiple surrogates, and how we distinguish whether the surrogates are different or identical.
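To make the distinction concrete, here is a minimal sketch of the three levels involved: the manuscript, an image set (one act of photographing or filming), and the hosted copies of that image set. All class and field names are hypothetical and only illustrate the relationships; they are not part of the schema or the application code.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class HostedCopy:
    """One place where an image set is stored, with the ID it carries there."""
    host: str
    identifier: str
    url: Optional[str] = None


@dataclass
class ImageSet:
    """One surrogate (a single campaign of photographing or filming)."""
    description: str
    copies: List[HostedCopy] = field(default_factory=list)


@dataclass
class Manuscript:
    """The physical object, identified by shelfmarks and catalogue numbers."""
    shelfmark: str
    other_ids: List[str] = field(default_factory=list)
    image_sets: List[ImageSet] = field(default_factory=list)

    def has_distinct_surrogates(self) -> bool:
        # More than one image set means genuinely different surrogates.
        return len(self.image_sets) > 1

    def duplicated_sets(self) -> List[ImageSet]:
        # One image set with several hosted copies: same images, several IDs.
        return [s for s in self.image_sets if len(s.copies) > 1]


# The Gunda Gunde example from this thread (hypothetical modelling).
gg = Manuscript(
    shelfmark="C3-IV-156",
    other_ids=["Schneider, no. 14", "no. B12"],
    image_sets=[
        ImageSet(
            description="GG-006, digital images",
            copies=[
                HostedCopy("HMML", "GG 00006",
                           "https://w3id.org/vhmml/readingRoom/view/623103"),
                HostedCopy("Toronto", "61220/utsc35354",
                           "https://ark.digital.utsc.utoronto.ca/ark:61220/utsc35354"),
            ],
        )
    ],
)
```

In this modelling the Gunda Gunde manuscript has one image set with two hosted copies, so it counts as a doublet rather than as a manuscript with distinct surrogates.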

eu-genia commented 6 months ago

part of this has been discussed here https://github.com/BetaMasaheft/Documentation/issues/1349

we probably should put the identifiers of images under surrogates (after adminInfo) and not under msIdentifier

eu-genia commented 6 months ago

(Issue 1349 was closed by adding examples to the GL but not by updating the schema or the code)

eu-genia commented 6 months ago

The easiest way, which does not require any effort in changing the code, is to put all the information on the surrogates in the surrogates section and possibly remove from msIdentifier the IDs that refer to the digital objects and not to the manuscript itself.

so we would have

<msIdentifier>
   <repository ref="INS0005Gund"/>
   <idno>C3-IV-156</idno>
   <altIdentifier>
      <repository ref="INS0004HMML"/>
      <idno>GG 006</idno>
   </altIdentifier>
   <altIdentifier>
      <idno>Schneider, no. 14</idno>
   </altIdentifier>
   <altIdentifier>
      <idno>no. B12</idno>
   </altIdentifier>
</msIdentifier>

(or GG-006 can go first) and then after cataloguing information

<surrogates>
   <list>
      <item>digital images by Ewa Balicka-Witakowska and Michael Gervers <date when="2006">October 2006</date>
         <list>
            <item><idno>GG 00006 (HMML)</idno></item>
            <item><idno>623103 (vHMML digital identifier)</idno>
               <ref type="mss" target="https://w3id.org/vhmml/readingRoom/view/623103"/></item>
            <item><idno>61220/utsc35354 (Toronto digital identifier)</idno>
               <ref type="mss" target="https://ark.digital.utsc.utoronto.ca/ark:61220/utsc35354"/></item>
         </list>
      </item>
   </list>
</surrogates>

this way we can have as many items for different IDs as we need and as many items for different sets of images as we need.
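As a rough illustration of how this nested structure could be read back, the sketch below uses Python's standard xml.etree.ElementTree to group the IDs by image set: each outer item is taken as one image set, each inner item as one identifier of it. The function name image_sets is hypothetical, and the snippet omits the TEI namespace that real records would carry.

```python
import xml.etree.ElementTree as ET

# The surrogates example from above, without a namespace (an assumption
# made to keep the sketch short).
SURROGATES = """<surrogates>
  <list>
    <item>digital images by Ewa Balicka-Witakowska and Michael Gervers <date when="2006">October 2006</date>
      <list>
        <item><idno>GG 00006 (HMML)</idno></item>
        <item><idno>623103 (vHMML digital identifier)</idno>
          <ref type="mss" target="https://w3id.org/vhmml/readingRoom/view/623103"/></item>
        <item><idno>61220/utsc35354 (Toronto digital identifier)</idno>
          <ref type="mss" target="https://ark.digital.utsc.utoronto.ca/ark:61220/utsc35354"/></item>
      </list>
    </item>
  </list>
</surrogates>"""


def image_sets(surrogates_xml):
    """Return one dict per outer item, with the IDs listed inside it."""
    root = ET.fromstring(surrogates_xml)
    sets = []
    for outer in root.findall("./list/item"):
        ids = []
        for inner in outer.findall("./list/item"):
            idno = inner.find("idno")
            ref = inner.find("ref")
            ids.append({
                "idno": idno.text.strip() if idno is not None and idno.text else None,
                "target": ref.get("target") if ref is not None else None,
            })
        date = outer.find("date")
        sets.append({
            "when": date.get("when") if date is not None else None,
            "ids": ids,
        })
    return sets


for image_set in image_sets(SURROGATES):
    print(image_set["when"], [i["idno"] for i in image_set["ids"]])
```

A second, different set of images of the same manuscript would simply be a second outer item, and would come back as a second entry in the returned list.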

What was discussed in #1349 was never implemented; for the time being the element bibl cannot be used.

eu-genia commented 6 months ago

If all agree I will add this to the Guidelines.

This also means that from now on: (1) if there are surrogates of a MS, we should always mention this in the surrogates section; (2) if there is an identifier of the surrogates, it must be mentioned in surrogates; (3) the information on who created the surrogates and when (microfilms, photocopies, digital images) should go into the surrogates section (not into a change element, which should be reserved for the cataloguing record).

As always, this change would apply to all the new descriptions, the old ones can be revised whenever a chance comes up (i.e. if the record must be anyway updated for some reason)

Eventually we must ensure that the shelfmark search also looks at the IDs in surrogates.
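A minimal sketch of what such a search could look like, assuming a record where the image IDs have already been moved into surrogates. This is not the application's actual search code: the function find_identifier and the cut-down record are hypothetical, and the TEI namespace is again omitted.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily shortened record for illustration only.
RECORD = """<msDesc>
  <msIdentifier>
    <repository ref="INS0005Gund"/>
    <idno>C3-IV-156</idno>
    <altIdentifier><idno>Schneider, no. 14</idno></altIdentifier>
  </msIdentifier>
  <additional>
    <surrogates>
      <list>
        <item>digital images
          <list>
            <item><idno>GG 00006 (HMML)</idno></item>
            <item><idno>61220/utsc35354 (Toronto digital identifier)</idno></item>
          </list>
        </item>
      </list>
    </surrogates>
  </additional>
</msDesc>"""


def find_identifier(record_xml, query):
    """Return (section, idno text) pairs whose text contains the query."""
    root = ET.fromstring(record_xml)
    needle = query.casefold()
    hits = []
    # Look in both places, and report where each hit came from, so that
    # a manuscript shelfmark is never confused with an image set ID.
    for section in ("msIdentifier", "surrogates"):
        for container in root.iter(section):
            for idno in container.iter("idno"):
                text = (idno.text or "").strip()
                if needle in text.casefold():
                    hits.append((section, text))
    return hits


print(find_identifier(RECORD, "utsc35354"))
print(find_identifier(RECORD, "c3-iv-156"))
```

Returning the section along with the match keeps the distinction discussed in this issue visible in the search results.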

BetaMasaheft / Documentation

Distinguishing manuscript shelfmarks from digital object identifiers #2514