erc-dharma / project-documentation

DHARMA Project Documentation
Creative Commons Attribution 4.0 International

Cheatsheet: <gap/> #19

Closed ajaniak closed 5 months ago

ajaniak commented 4 years ago

Dear all,

Could you tell me whether you have already made decisions regarding <gap/>?

Thanks, Best

danbalogh commented 4 years ago

This thread seems to have slowed down again. I need final opinions on the issues summarised in my above comment, mainly from @arlogriffiths , before I can write up the final verdict.

arlogriffiths commented 4 years ago

Dear @danbalogh : sorry, I am spending these last two weeks of the French summer vacation with my son and mother in the Auvergne, and have only very limited time to work. Now I have a few minutes, so here goes.

  1. Some scholars (e.g. Buddhologists influenced by the tradition of scholarship on Gilgit and Central Asian mss.), among them our team member Vincent Tournier, are very attached to being able to distinguish between akṣaras that are "lost" and akṣaras that are "illegible". As much as I would otherwise like to join the consensus reached by Manu and Annette, I feel we should at least go through a phase of testing a display that more faithfully corresponds to the way we encode, before possibly deciding to merge various values of @reason in display. If @ajaniak in her message of 12/08 meant that the display currently put in place indeed makes those distinctions for @reason, then that agrees with my proposal to test this approach (and I am confident that neither Annette nor Manu would disagree with at least trying out this kind of display).

  2. In that scenario, we thus move to Daniel's next step A: "fusion can be implemented between all things displayed in square brackets with a little trickery concerning spaces" and "full fusion (i.e. śā[rdūl. 1× ca. 2+ 3× ... brā]hmaṇasya)" — although I think that last example given by Daniel in his message of 12/08 should become "full fusion (i.e. śā[rdūl. 1× ?2+ 3× ... brā]hmaṇasya)", with "?" instead of "ca.".

  3. For <supplied reason="subaudible"> used for editorial avagrahas and editorial punctuation, I suggest they could be displayed with a striking color, although display in the same way as <supplied reason="omitted"> would also be fine with me. If I am right to suspect that the former solution (with color) would remove the need to apply any merging and if this would simplify Axelle's work a lot, then perhaps that would be an argument in favor of use of color.

  4. I support Daniel's request to Axelle to put in place an error message when the XML file contains two <gap> elements with the same @reason after each other and there is no @certainty in either one.

@danbalogh : please let me know if you need any more input from my side.

AnneSchmiedchen commented 4 years ago

Although I have not been directly addressed now: I fully agree that we should go through a phase of testing for the distinction in display between "lost" and "illegible".

arlogriffiths commented 4 years ago

Thanks. And I think we can infer from Manu’s previous messages that he, too, is not opposed to such a testing phase.

So, Dan and Axelle, please do the needful.

On 21 August 2020 at 13:38, AnneSchmiedchen wrote:

Although I have not been directly addressed now: I fully agree that we should go through a phase of testing for the distinction in display between "lost" and "illegible".


danbalogh commented 4 years ago

Thanks for all the input on this; Arlo, I appreciate your taking the time. Here goes:

  1. <gap extent="unknown" unit="character"/> with any value of @reason --> [...]

  2. <gap> with @quantity and @unit="character", when not enclosed in <seg> with @met --> show a ? if @precision="low" is present, then display the value of @quantity as a number, followed by a +/×/* mark for @reason "lost"/"illegible"/"undefined" respectively, viz.

    • <gap reason="lost" quantity="5" unit="character"/> --> [5+]
    • <gap reason="lost" quantity="5" unit="character" precision="low"/> --> [?5+]
    • <gap reason="illegible" quantity="5" unit="character"/> --> [5×]
    • <gap reason="illegible" quantity="5" unit="character" precision="low"/> --> [?5×]
    • <gap reason="undefined" quantity="5" unit="character"/> --> [5*]
    • <gap reason="undefined" quantity="5" unit="character" precision="low"/> --> [?5*]
  3. <gap> with @unit="character" when enclosed in <seg> with @met --> instead of 2 above, display the value of @met converted to conventional prosodic notation, in square brackets (no distinction by @reason in this case)

    • e.g. <seg met="+-+"><gap reason="lost" quantity="3" unit="character" /></seg> displayed as [–⏑–]
  4. <gap> with @unit="line" --> display as text in square brackets: start with "ca. " if @precision="low" is present, then show the value of @quantity as a number or show "unknown number of" if @extent="unknown" is present instead of @quantity, followed by "line" or "lines" (depending on the value of @quantity), followed by "possibly" if the <gap> contains a <certainty/> element, followed by "lost"/"illegible"/"lost or illegible" for @reason "lost"/"illegible"/"undefined", viz. e.g.

    • <gap reason="illegible" quantity="1" unit="line" precision="low"/> --> [ca. 1 line illegible]
    • <gap reason="undefined" quantity="2" unit="line" precision="low"/> --> [ca. 2 lines lost or illegible]
    • <gap reason="lost" extent="unknown" unit="line"/> --> [unknown number of lines lost]
    • <gap reason="lost" quantity="2" unit="line" precision="low"><certainty/></gap> --> [ca. 2 lines possibly lost]
  5. <gap> with @unit="component" (nb, always within <seg type="component">): always display as [.] regardless of @reason and any other factors

    • the notation [.] will be understood to mean "one vowel, consonant or conjunct consonant lost or illegible"
    • the enclosing <seg> may or may not have @met, and unlike 3 above, the presence of @met shall not affect the display of this <gap>
  6. brackets for the display of lacunae and their restorations shall always be fused; specifically, fuse into a single set of brackets (while retaining a space between the elements in a single set of brackets) the display of each of the following:

    • <supplied> with @reason "lost" and "illegible" (but NOT @reason "omitted" or "subaudible")
    • <gap> with @unit "character" or "component" with any value of @reason (but not @unit="line", which should be kept separate)
    • thus, for the extreme example śā<supplied reason="illegible">rdū</supplied><supplied reason="lost">l</supplied><seg type="component" subtype="vowel"><gap reason="lost" quantity="1" unit="component"/></seg><gap reason="illegible" quantity="1" unit="character"/><gap reason="lost" quantity="2" unit="character" precision="low"/><gap reason="illegible" quantity="3" unit="character"/><gap reason="lost" extent="unknown" unit="character"/><supplied reason="lost">brā</supplied>hmaṇasya --> display śā[rdūl. 1× ?2+ 3× ... brā]hmaṇasya
  7. generate error messages in the following conditions

    • if a <gap> has no @reason (this is already mandatory in the EpiDoc schema)
    • if a <gap> has no @unit (in the EpiDoc schema, @unit is not mandatory if @extent is present, but in our practice it is always mandatory)
    • if a <gap> has @quantity (any value) and @unit="line" but neither has @precision nor contains <certainty/> (because by our EGD, if the number of lines lost is known precisely, then the encoding is with iterated <lb/>; however, an exact number of lines lost may in principle be encoded as possibly lost with <certainty> inside, e.g. "3 lines possibly lost")
    • if a <gap> with @unit="component" occurs anywhere except inside <seg type="component">
    • if two <gap>s with the same @reason and the same @unit follow one another with nothing or only white space in between, unless one (and only one) of the two contains <certainty>
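
To make rules 1, 2, 4 and 5 above concrete, here is a minimal sketch of how a single <gap> could be turned into its display string. It is only an illustration: it assumes Python's standard xml.etree.ElementTree and elements without the TEI namespace, and render_gap is a hypothetical helper, not the project's actual display code. The <seg> with @met case (rule 3), bracket fusion (rule 6) and the checks on neighbouring elements are left out; only the first three error conditions of rule 7 are covered.

```python
import xml.etree.ElementTree as ET

# Marks for character gaps (rule 2) and wording for line gaps (rule 4).
REASON_MARK = {"lost": "+", "illegible": "×", "undefined": "*"}
REASON_TEXT = {"lost": "lost", "illegible": "illegible",
               "undefined": "lost or illegible"}


def render_gap(gap):
    """Return the bracketed display string for one <gap> (hypothetical helper)."""
    reason, unit = gap.get("reason"), gap.get("unit")
    if reason not in REASON_MARK:
        raise ValueError("<gap> needs @reason lost, illegible or undefined")
    if unit is None:
        raise ValueError("<gap> needs @unit")
    if unit == "component":  # rule 5: always [.]
        return "[.]"

    low = gap.get("precision") == "low"
    quantity = gap.get("quantity")
    unknown = gap.get("extent") == "unknown"
    if quantity is None and not unknown:
        raise ValueError("<gap> needs @quantity or @extent='unknown'")

    if unit == "character":
        if unknown:  # rule 1: any @reason
            return "[...]"
        # rule 2: optional ?, the number, then the mark for @reason
        return "[" + ("?" if low else "") + quantity + REASON_MARK[reason] + "]"

    if unit == "line":  # rule 4
        possibly = gap.find("certainty") is not None
        # rule 7, third bullet: an exact line count needs @precision or <certainty/>
        if quantity is not None and gap.get("precision") is None and not possibly:
            raise ValueError("exact line counts should be encoded with <lb/>")
        words = ["ca."] if low else []
        words.append("unknown number of" if unknown else quantity)
        words.append("line" if quantity == "1" else "lines")
        if possibly:
            words.append("possibly")
        words.append(REASON_TEXT[reason])
        return "[" + " ".join(words) + "]"

    raise ValueError("unsupported @unit: " + unit)


examples = [
    '<gap reason="lost" quantity="5" unit="character" precision="low"/>',
    '<gap reason="illegible" quantity="1" unit="line" precision="low"/>',
    '<gap reason="lost" extent="unknown" unit="line"/>',
]
for xml in examples:
    print(render_gap(ET.fromstring(xml)))
# prints [?5+], [ca. 1 line illegible] and [unknown number of lines lost]
```

Real EpiDoc files carry the TEI namespace, so the lookup for <certainty> would need the namespaced tag name there.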

Corollary: <supplied> with @reason="subaudible" should NOT be displayed using square brackets and should NOT be fused with any of the above items; instead, display it in angle brackets just like <supplied> with @reason="omitted", and use a distinctive font colour on it (@arlogriffiths : I'm not sure this is what you meant. If you meant that you would prefer no brackets around these, only colour, then that is fine by me too. I am certainly very, very strongly against displaying them in the same way as lacunae and restored lacunae, especially because it can sometimes happen that you restore a putative original avagraha or punctuation mark. But this has no direct bearing on the present issue, so we can discuss it at leisure anytime later)

Note to Arlo on colleagues attached to the distinction between lost and illegible: sure, I appreciate that and since EpiDoc gives us the means, I'm happy to do it. But keep in mind that we've already discarded that distinction in display for 1) restored text and 2) lacunae affecting only part of an akṣara, the justification being that if anyone is interested, they can look at the XML.