Closed ajaniak closed 3 months ago
Not as far as I know. Here are some proposals. Others may please add the scenarios that I am forgetting, and propose other manners for displaying gaps.
<gap reason="lost" extent="unknown" unit="character"/>
---> [? akṣ. lost]
<gap reason="lost" quantity="8" unit="character"/>
---> [8 akṣ. lost]
<gap reason="lost" quantity="8" unit="character" precision="low"/>
---> [ca. 8 akṣ. lost]
<gap reason="illegible" quantity="8" unit="character"/>
---> [8 akṣ. illegible]
<gap reason="undefined"/>
---> [lacuna]
<gap reason="undefined" quantity="8" unit="character"/>
---> [lacuna of 8 akṣ.]
<gap reason="undefined" extent="unknown" unit="character"/>
---> [lacuna of ? akṣ.]
Decisions not yet made, indeed.
My suggestions:
For <gap unit="character">
When the quantity of characters is known:
When the quantity of characters is unknown: [...]
For <gap unit="line">
superscript, bold, between parentheses
with ? when precision is low.
(1 line lost) (1 line lost?) (line/lines lost) (line/lines lost?)
In the system Manu suggests, we could use + and ? to differentiate @reason="lost" and @reason="illegible" — I believe this is what is done in EIAD display (inspired by Buddhologists' conventions).
I would prefer if we always showed <gap> in square brackets, restored text likewise, and the metre of a gap likewise. This would be consistent internally, and would also match the Leiden conventions, where square brackets always mean a lacuna. My specific suggestions:
<gap reason="lost" extent="unknown" unit="character"/>
---> [***]
<gap reason="lost" quantity="8" unit="character"/>
---> [*8]
<gap reason="lost" quantity="8" unit="character" precision="low"/>
---> [*8?]
<gap reason="illegible" extent="unknown" unit="character"/>
---> [###]
<gap reason="illegible" quantity="8" unit="character"/>
---> [# 8] <--- ignore the space, added in hopes of avoiding an auto-mention of that issue
<gap reason="illegible" quantity="8" unit="character" precision="low"/>
---> [# 8?]
<gap reason="undefined"/>
---> should not occur as per the EG: gap must always have unit and quantity or extent
<gap reason="undefined" extent="unknown" unit="character"/>
---> [...]
<gap reason="undefined" quantity="8" unit="character"/>
---> [.8]
<gap reason="undefined" quantity="8" unit="character" precision="low"/>
---> [.8?]
<lb/> followed by an inline lacuna
<seg type="component">
if the <seg> has @met, then display as ⏑ / – depending on the @met value [which can only be + or -]
if the <seg> has NO @met, and the <gap> has @subtype="vowel", display as ⏓
if a <gap> (with a @unit other than "component") is within <seg> with @met, then instead of the above, display the value of @met converted to prosodic notation, in square brackets
<seg met="+++-++"><gap reason="lost" quantity="6" unit="character" /></seg>
displayed as [–––⏑––]
I am fine with square brackets throughout.
I am fine (provisionally, for the purpose of proofreading our encodings) with Dan's proposal of using *, #, . depending on the nature of the gap. But for me, too much display makes editions somewhat difficult to read. The nature of the gap is not a concern for me when first reading an inscription (and if I need the information, I will go to the XML file).
I prefer [ ] to [*8] as it visually gives an idea of the extent of the gap.
I find ⏓ for lost "vowel" misleading. For me, it refers to a short or long lost syllable.
I agree with Manu's views, notably on finding ⏓ for lost "vowel" misleading.
I'd be happy if we could make some gesture to the conventions of EIAD (and Vincent T., who is especially attached to them) by using + and ? rather than * and #.
Dan: could you attempt a new proposal taking Manu's and my reactions into account?
I am happy to make a gesture and vote for + and ? rather than * and #.
I can't harmonise the above suggestions with the ones made by me, because the changes you want break other parts of the scheme. I don't insist on * and # (though I personally am attached to them), and I also think Manu may have a point that we don't necessarily have to display the distinction between illegible and lost (and undefined). A tooltip can be added for that, so the user doesn't have to look up the XML. I'll list the specific problems and suggest some possible solutions. Please comment away, let's come closer to an agreement, and then I'll write up a full list of code/display pairs again.
Thanks a lot for these considerations. The problem re. "?" is indeed a significant obstacle. I have spoken with Vincent T. and he gives us carte blanche to come up with a coherent system, ideally one that has the greatest chance of being adopted by the greatest number of colleagues in our field(s).
I am inclined then to accept the core of what Dan has proposed, with the modification requested by Manu (the number of signs +/ corresponds to the value of @n), and possibly with × instead of .
I don't have an alternative to offer for ⏓ and the other prosodic symbols for lost vowels, so I suggest we retain them at least provisionally.
I'll wait a little to see if Manu and Annette want to offer more thoughts, then create a list.
Meanwhile, while working on the EG, I've realised that we'll also have <gap> in the translation div, and those will need to be displayed differently. According to the Guide (and based on Arlo's suggestion), all lacunae in translations will be displayed as text in square brackets, e.g. [3 characters illegible], [3 characters lost], [3 lines lost]. Thus, in the translation div, a gap without attributes should be displayed as [...], and one with attributes should be composed on the basis of attribute values. If possible, line/lines and character/characters should be used depending on the @quantity.
Though the EG doesn't say so, perhaps some people will also put @precision in such gaps, in which case we need to add ca., e.g. [ca. 3 characters lost].
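To make the translation-div rule concrete, here is a minimal Python sketch of how the bracketed text could be composed from the attribute values. The function name and its parameters are hypothetical (the project's actual transformation is not shown in this thread), and it assumes that a gap with attributes always carries @reason, @quantity and @unit together.

```python
def translation_gap_text(reason=None, quantity=None, unit=None, precision=None):
    """Compose the display text for a <gap> in the translation div.

    Hypothetical helper: names and signature are assumptions for illustration.
    """
    # A gap without attributes is shown as a bare ellipsis in brackets.
    if reason is None and quantity is None and unit is None:
        return "[...]"
    # Assumption: a gap with attributes has all three of them.
    if None in (reason, quantity, unit):
        raise ValueError("expected @reason, @quantity and @unit together")
    # Use the singular for exactly one unit, the plural otherwise.
    noun = unit if quantity == 1 else unit + "s"
    # @precision="low" adds "ca." in front of the number.
    prefix = "ca. " if precision == "low" else ""
    return f"[{prefix}{quantity} {noun} {reason}]"


print(translation_gap_text("illegible", 3, "character"))
print(translation_gap_text("lost", 1, "line"))
print(translation_gap_text("lost", 3, "character", precision="low"))
```

The same composition could of course be written in whatever language the transformation ends up using; the point is only that singular/plural and "ca." can be derived mechanically from @quantity and @precision.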
Dear colleagues, I do not have any specific thoughts to offer. Just two very minor questions: In the 3rd entry from top, Manu had written: [when the number of missing characters is known] "preceded and followed by space (even if we know the gap is inside a single word)". Has this been decided? I would prefer not to put a space if the gap is inside a single word. Regarding Dániel's suggestion, a lost syllable of unknown length is [⏓], while a lost vowel of unknown length is ⏓ without brackets: would it not make more sense the other way round?
Annette, I assume it was not your intention to close this issue, so I'm reopening it.
Good point about spacing around gap. I did not spot that in Manu's comments. As per the present EG, encoders are specifically instructed (§8.1/Editorial spaces and markup) to explicitly add spaces around <gap> elements except where they meet a partially preserved word, in which case no space should be used. If the encoders can manage that, then indeed, the display of the <gap> should not create any spaces, just preserve any space present in the XML around the element. However, if any of you think encoders can't be expected to be mindful of such things, it may simplify matters if we automatically displayed spaces around gap display - in that case there would be no way of indicating whether the adjacent word is complete or partial, but we can perhaps live with that. So please add your votes. My preference is to keep things as they are and not add space in gap display.
As to ⏓ with or without brackets, I think it makes good sense to display all "proper" lacunae (i.e. those affecting at least a full akṣara) in square brackets. A single-syllable lacuna encoded with @met to show that its prosodic length is determined by the verse, but which happens to be an anceps, will be very rare, perhaps as rare as a lost vowel of unknown length attached to a preserved consonant. Most of the time, ⏓ in square brackets will occur as part of a longer sequence, e.g. [––⏑––⏑⏓]. Are you suggesting that lacunae with known metre should be shown without square brackets, e.g. ––⏑––⏑⏓? We could go that way, but I find that inconsistent; to my mind the following form a coherent class of display and should be handled similarly, in square brackets:
[+++++++] seven lost characters, no information about them (note, I'm not explicitly endorsing the + sign here; my point is about the brackets)
[––⏑––⏑⏓] seven lost characters to a given prosodic pattern
[śārdūlavikrīḍitam] text lost and restored
Incidentally, my listing of bracketed stuff above has given rise to another display issue we should consider.
When <gap>
and/or śā<supplied reason="illegible">rdū</supplied><supplied reason="lost">la</supplied><gap reason="illegible" quantity="1" unit="character"/><gap reason="lost" quantity="2" unit="character"/>
display: śā[rdū][la][+][××]
or śā[rdūla+××]
?
Yes, definitely, they need to be collapsed into a single set of brackets. From previous collaborations with Tom E. and Emmanuelle M., I have the impression that there are some complications, but it can be done.
I am fine with:
I also agree to the points summarised by Manu.
I'll try now to summarise all points.
<gap reason="lost" extent="unknown" unit="character"/>
---> [...]
<gap reason="lost" quantity="5" unit="character"/>
---> [+++++]
<gap reason="lost" quantity="5" unit="character" precision="low"/>
---> [+++++?] or put the ? at the beginning of the string?
<gap reason="illegible" extent="unknown" unit="character"/>
---> [...] note this is the same as unknown number of characters illegible. Nobody has suggested a way of distinguishing these, so I assume the distinction is not considered important.
<gap reason="illegible" quantity="5" unit="character"/>
---> [×××××]
<gap reason="illegible" quantity="5" unit="character" precision="low"/>
---> [×××××?] or put the ? at the beginning of the string?
<gap reason="undefined"/>
---> should not occur as per the EG: gap must always have unit and quantity or extent
<gap reason="undefined" extent="unknown" unit="character"/>
---> [...] note this is the same as unknown number of characters illegible. Nobody has suggested a way of distinguishing these, so I assume the distinction is not considered important.
<gap reason="undefined" quantity="5" unit="character"/>
---> We have not discussed this; I assume you want to show in square brackets as many of some dedicated character as the @quantity. The dedicated character must be something other than .×+?, since the scheme uses those for other purposes
<gap reason="undefined" quantity="5" unit="character" precision="low"/>
---> the above with a question mark
<lb/>
followed by an inline lacuna
if a <gap> (with a @unit other than "component") is within <seg> with @met, then instead of the above, display the value of @met converted to prosodic notation, in square brackets
<seg met="+++-++"><gap reason="lost" quantity="6" unit="character" /></seg>
displayed as [–––⏑––]
<supplied> element, join them in a single pair of square brackets
<seg type="component">
if the <seg> has @met, then display as ⏑ / – depending on the @met value [which can only be + or -]
if the <seg> has NO @met, and the <gap> has @subtype="vowel", display as ⏓
Points that still await a decision are:
With @precision="low", do we put a ? at the beginning or end of the string of lacuna markers?
OR, with apologies for being obstinate, we could go back to what I suggested above, changing * to + and # to ×. That way, we would have to relinquish showing as many lacuna markers as the number of characters lost, but everything else could be displayed in a consistent manner (except for the @reason of loss in sub-akṣara lacunae, which is hardly an important point). Displaying the size of lacuna with a numeral (instead of iterated signs) conforms to the Leiden conventions.
Dear Dan, Thanks for the summary.
On pending issues:
<gap reason="lost" quantity="5" unit="character" precision="low"/>
Dan's proposal ---> [+++++?] or put the ? at the beginning of the string?
I suggest [ca. +++++]
<gap reason="illegible" extent="unknown" unit="character"/>
---> [...]
Dan: note this is the same as unknown number of characters illegible.
You meant "same as unknown number of lost characters", I guess
Dan: Nobody has suggested a way of distinguishing these, so I assume the distinction is not considered important.
I am fine with no distinction.
<gap reason="illegible" quantity="5" unit="character" precision="low"/>
Dan: ---> [×××××?] or put the ? at the beginning of the string?
I suggest [ca. ×××××]
<gap reason="undefined" extent="unknown" unit="character"/>
Dan: ---> [...] note this is the same as unknown number of characters illegible. Nobody has suggested a way of distinguishing these, so I assume the distinction is not considered important.
I am fine with no distinction.
<gap reason="undefined" quantity="5" unit="character"/>
--->
Dan: We have not discussed this; I assume you want to show in square brackets as many of some dedicated character as the @quantity. The dedicated character must be something other than .×+?, since the scheme uses those for other purposes
I suggest [*****] or [#####]
<gap reason="undefined" quantity="5" unit="character" precision="low"/>
Dan: ---> the above with a question mark
I suggest [ca. *****] or [ca. #####]
when @unit="line", display as text in square brackets:
precision="low".
I approve of all of Manu’s responses and favor the * over the # (because, at least for English speakers, the sign # is intimately associated with the meaning ‘number’, which is not relevant in our context).
Arlo
OK. Comments on Manu's response:
And so, to paraphrase you and summarise the answers to the issues I noted above as still open:
I'm essentially fine with that. I don't really like the "ca. +++++" style of display, but let's go ahead with that, and maybe we'll have a better idea by the time we come to website display. Let me also point out that by the present scheme, the @reason of a lacuna is distinguished in display when the length of the lacuna is known precisely or approximately, but it is not distinguished when the size is unknown, nor when it is smaller than one akṣara. That again is something I'm essentially fine with.
Sometime soon (probably tomorrow) I'll distil all this into a final summary, unless someone vetoes it in the meantime.
Righty-ho, onward to the <seg cert="low">last</seg> summary.
<gap reason="lost" extent="unknown" unit="character"/>
---> [...]
<gap reason="lost" quantity="5" unit="character"/>
---> [+++++]
<gap reason="lost" quantity="5" unit="character" precision="low"/>
---> [ca. +++++]
<gap reason="illegible" extent="unknown" unit="character"/>
---> [...]
<gap reason="illegible" quantity="5" unit="character"/>
---> [×××××]
<gap reason="illegible" quantity="5" unit="character" precision="low"/>
---> [ca. ×××××]
<gap reason="undefined" extent="unknown" unit="character"/>
---> [...]
<gap reason="undefined" quantity="5" unit="character"/>
---> [*****]
<gap reason="undefined" quantity="5" unit="character" precision="low"/>
---> [ca. *****]
<gap reason="lost" quantity="1" unit="line" precision="low"/>
---> [ca. 1 line lost]
<gap reason="lost" quantity="2" unit="line" precision="low"/>
---> [ca. 2 lines lost]
<gap reason="lost" extent="unknown" unit="line"/>
---> [unknown number of lines lost]
if the <gap> is not empty and contains a <certainty/> element, add "possibly" to the text, e.g. [ca. 2 lines possibly lost]
<lb/> followed by an inline lacuna; however, an exact number of lines lost may be encoded with <precision> inside. It may be a good idea to generate an error message if a <gap> has @quantity (any value) and @unit="line" but neither has @precision nor contains <certainty/>
if a <gap> (with a @unit other than "component") is within <seg> with @met, then instead of the above, display the value of @met converted to prosodic notation, in square brackets
<seg met="+++-++"><gap reason="lost" quantity="6" unit="character" /></seg>
displayed as [–––⏑––]
<supplied> element, join them in a single pair of square brackets
<seg type="component">
if the <seg> has @met, then display as ⏑ if @met="-" and display as – if @met="+" [no other value of @met is permitted for a <seg type="component">]
if the <seg> has NO @met, and the <gap> has @subtype="vowel", display as ⏓
What's new in the above:
Sorry, I hadn’t had time to write yet: I am tempted to revert to a system (like that originally proposed by Dan, I think) of using a number (the value of @quantity) plus a symbol for the type of gap, rather than repeating a symbol once per missing character. The advantages of the latter, as requested by Manu, seem to me rather unimportant, esp. compared to the disadvantage of having enormous strings of + or * in the case of long gaps.
OK, I'll wait then until this is settled. If I have a vote, mine is on displaying numbers instead of strings of characters. However, one thing we could do - perhaps not right now, but later on - is to include a parameter in the transformation that would let us toggle these alternatives.
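As a rough illustration of such a toggle, the following Python sketch renders a character-unit gap in either style. Everything here is an assumption for discussion purposes: the function name, the `style` parameter and its two values are hypothetical, the symbols follow the summary above (+ lost, × illegible, * undefined), and the numeral layout (number followed by symbol) is just one possible arrangement.

```python
# Symbol per @reason, as in the summary above.
SYMBOLS = {"lost": "+", "illegible": "×", "undefined": "*"}


def render_character_gap(reason, quantity=None, precision=None, style="numeral"):
    """Render a <gap unit="character"> for display.

    Hypothetical sketch: 'style' toggles between iterated signs and a numeral.
    """
    # No @quantity stands for extent="unknown": same display for every @reason.
    if quantity is None:
        return "[...]"
    symbol = SYMBOLS[reason]
    if style == "iterated":
        body = symbol * quantity          # e.g. +++++
    elif style == "numeral":
        body = f"{quantity}{symbol}"      # e.g. 5+ (layout is an assumption)
    else:
        raise ValueError(f"unknown style: {style!r}")
    # @precision="low" is marked with "ca." as in the summary above.
    prefix = "ca. " if precision == "low" else ""
    return f"[{prefix}{body}]"


print(render_character_gap("lost", 5, style="iterated"))
print(render_character_gap("illegible", 5, precision="low", style="iterated"))
print(render_character_gap("lost", 5, style="numeral"))
```

Keeping the choice in a single parameter means the decision between the two displays can be postponed without touching the rest of the logic.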
Dear All,
Have you been able to make a final decision for the display of <gap/> with symbols or numbers?
Thanks.
I am still in agreement with Daniel on displaying the number given in @quantity instead of strings of characters. If that is the last point of disagreement, and Manu & Annette can bring themselves to agree with Daniel and me, I would suggest we ask Daniel to formulate once again the results of this discussion.
I can live ... for the moment ... with displaying the number of @quantity instead of strings of characters ... but might put the issue forward again when the time comes to discuss the display on the website.
So, here is my attempt at a final recap. Actually, now that I think about it I'm afraid I can't distill a definitive consensus from the above, since all we've agreed on is that we incorporate some of my last summary into something along the lines of my first summary, using numbers instead of iterations of signs. But the level of incorporation is not certain, as for example in the "ca." suggested above by Arlo. So in the list below, I'll try to suggest alternatives. In all cases, bold face marks my preference, and the word QUESTION highlights items where we still need some discussion.
Can I have votes from the PIs for one of the options under each numbered item, and opinions on the questions?
<gap reason="lost" extent="unknown" unit="character"/>
<gap reason="lost" quantity="5" unit="character"/>
<gap reason="lost" quantity="5" unit="character" precision="low"/>
<gap reason="illegible" extent="unknown" unit="character"/>
<gap reason="illegible" quantity="5" unit="character"/>
<gap reason="illegible" quantity="5" unit="character" precision="low"/>
<gap reason="undefined" extent="unknown" unit="character"/>
<gap reason="undefined" quantity="5" unit="character"/>
<gap reason="undefined" quantity="5" unit="character" precision="low"/>
when @unit="line", display as text in square brackets (I think we have consensus here, so no alternatives for this item)
<gap reason="lost" quantity="1" unit="line" precision="low"/>
---> [ca. 1 line lost]
<gap reason="lost" quantity="2" unit="line" precision="low"/>
---> [ca. 2 lines lost]
<gap reason="lost" extent="unknown" unit="line"/>
---> [unknown number of lines lost]
@reason="illegible"
@reason="undefined"
I suggest the text "lost or illegible" added after the number of lines
if the <gap> is not empty and contains a <certainty/> element, add "possibly" to the text, e.g. [ca. 2 lines possibly lost]
<lb/> followed by an inline lacuna; however, an exact number of lines lost may in principle be encoded with <certainty> inside (e.g. "3 lines possibly lost"). It may be a good idea to generate an error message if a <gap> has @quantity (any value) and @unit="line" but neither has @precision nor contains <certainty/>
if a <gap> (with a @unit other than "component") is within <seg> with @met, then instead of the above, display the value of @met converted to prosodic notation, in square brackets
<seg met="+-+"><gap reason="lost" quantity="3" unit="character" /></seg>
displayed as [–⏑–]
@reason here? We could use e.g. [+ –⏑–], [× –⏑–] and [* –⏑–] or [–⏑– +], [–⏑– ×] and [–⏑– *], but I'm afraid the +×* signs would create confusion next to the prosodic notation, so perhaps best not to.
<seg> with @met, but not for <supplied>
<gap> with @unit="component": display as follows, without square brackets (I think we have consensus here, so no alternatives for this item, but see the questions below)
<seg type="component"> (generate error message if it occurs anywhere else?)
if the <seg> has @met, then display as ⏑ if @met="-" and display as – if @met="+" [no other value of @met is permitted for a <seg type="component">]
if the <seg> has NO @met, and the <gap> has @subtype="vowel", display as ⏓
@reason="undefined" and to make it clear that this display is for a lost segment
@reason in this case
<gap> with @unit="component" as [.], since some of us are averse to prosodic notation in this case
Dear Dan, thanks for this recap. Let me ponder a little more before I answer and vote.
So here is my vote, based on the following considerations:
Thus:
<gap reason="lost" extent="unknown" unit="character"/>
[...]
<gap reason="lost" quantity="5" unit="character"/>
[5+] (my favourite is still [+++++])
<gap reason="lost" quantity="5" unit="character" precision="low"/>
[ca. 5+]
<gap reason="illegible" extent="unknown" unit="character"/>
[...]
<gap reason="illegible" quantity="5" unit="character"/>
[5×]
<gap reason="illegible" quantity="5" unit="character" precision="low"/>
[ca. 5×]
<gap reason="undefined" extent="unknown" unit="character"/>
[...]
<gap reason="undefined" quantity="5" unit="character"/>
[5*]
<gap reason="undefined" quantity="5" unit="character" precision="low"/>
[ca. 5*]
when @unit="line"
Your proposition is fine with me.
One remark @ "if the <gap> is not empty and contains a <certainty/> element, add "possibly" to the text, e.g. [ca. 2 lines possibly lost]".
What does the certainty bear on? (1) The number of lines lost or (2) the existence of the lacuna? The display "[ca. 2 lines possibly lost]" seems to correspond to (2).
when <gap> (with a @unit other than "component") is within <seg> with @met
Fine with me.
And OK for not reintroducing @reason
fusing sets of brackets. I would say option D (fuse brackets always). But I must admit that I do not see clearly what B and C imply.
<gap> with @unit="component"
I would like to have square brackets here to
Thus: [⏑], [-], [⏓], [ ] OR [.]
Notes:
[⏓] might be confused with short or long syllable.
I like [.] instead of [ ], and would thus generalise it when there is no @met.
Thus: [.] (whatever the component concerned; the reader will understand if it is a vowel part or a consonant and will refer to the XML for more details) except when there is @met, in which case [⏑] or [-].
Thanks, Manu. Some responses/clarification:
<certainty> bears on (2). For an estimated number of lines you use @precision on the <gap>. I believe hardly anyone will ever use <certainty>, but the facility is described in the EpiDoc guidelines and back last summer, Arlo thought it a good idea to include it in our guide just in case.
Thanks, Dan!
@ 10. <certainty>. Thanks for the clarification. 10 as you propose is fine with me.
@ 12. Noted. Let us see what Arlo had in mind.
@ 13. Noted. Thus I vote for [.] for all lost segments, regardless of whether it is a short vowel, a long one, a vowel of unknown length, or a consonant.
Further considerations:
<gap> with @unit="component"
I still vote for [.] for all lost segments, but like also Daniel's proposition: "[C] for lost consonant and [V], [V̄] and [V̆]". In any case I am in favour of using square brackets.
It seems that [.] will be the best for lost segments, if Annette is set against C and V with markers. They will be rarely encoded, anyway, and . is traditional for them. I don't see a problem with [ca. 5+ ...] but I have no objection to separating brackets for extent="unknown". Manu, would you then also want to separate brackets for the segment notation [.]? But the main question about fusing is whether supplied text should be in the same set of brackets as lacunae, or separate from them.
OK, let us go with:
[.] for <gap> with @unit="component" (lost segments)
[...] when @extent="unknown"
These should never merge with any other similar brackets close by.
As for the main question about fusing: "whether supplied text should be in the same set of brackets as lacunae, or separate from them". I would say separate.
That is fine by me. Arlo, could you speak up if this corresponds to your ideas? Annette, I assume that apart from the C/V notation, which we are now inclined to reject, you are OK with the system described above?
I will try to answer in the course of the day. A.
Arlo, no particular hurry about this one. Meanwhile, Annette says:
As I am facing a Github problem right now, I am answering directly to you: Yes, I am OK with the system as it has been described now.
Sorry for having let this slip for so long.
In re-reading the thread diagonally, I appreciated Manu's comment "When extent="unknown", the display (whatever the reason) = “[...]”. The info for the reason of gap is in the XML. It is too much detail for me to display. The interested reader will check the XML." I think this also applies to other situations where we may want to avoid overkill in the use of differentiated symbols and modes of display. I am thinking of #54, where @ajaniak has asked us to settle the issues concerning <gap>.
I am too much out of it to have any opinion at this stage, and am certain I can live with the upshot of the exchange under the present issue mainly led by Dan and Manu from 19/05/2020 onward. @danbalogh: could you recap one last time, so @ajaniak can get to work?
OK, here's take n. It's actually pretty close to finished. Let us have a final yea or nay from all the PIs if possible. If any of you object to any of the solutions below, I would prefer that you A) suggested alternatives present in my previous list above, not brand new ones; and B) made sure that if you suggest a different solution for one of these, you check that the same modification of the method can be implemented in all related cases whilst not interfering with unrelated cases, so that the system as a whole remains coherent.
<gap extent="unknown" unit="character"/> with any value of @reason
<gap reason="lost" quantity="5" unit="character"/>
<gap reason="lost" quantity="5" unit="character" precision="low"/>
<gap reason="illegible" quantity="5" unit="character"/>
<gap reason="illegible" quantity="5" unit="character" precision="low"/>
<gap reason="undefined" quantity="5" unit="character"/>
<gap reason="undefined" quantity="5" unit="character" precision="low"/>
when @unit="line", display as text in square brackets:
<gap reason="lost" quantity="1" unit="line" precision="low"/>
---> [ca. 1 line lost]
<gap reason="lost" quantity="2" unit="line" precision="low"/>
---> [ca. 2 lines lost]
<gap reason="lost" extent="unknown" unit="line"/>
---> [unknown number of lines lost]
@reason="illegible"
@reason="undefined"
I suggest the text "lost or illegible" added after the number of lines
if the <gap> is not empty and contains a <certainty/> element, add "possibly" to the text, e.g. [ca. 2 lines possibly lost]
<lb/> followed by an inline lacuna; however, an exact number of lines lost may in principle be encoded with <certainty> inside (e.g. "3 lines possibly lost"). It may be a good idea to generate an error message if a <gap> has @quantity (any value) and @unit="line" but neither has @precision nor contains <certainty/>
if a <gap> (with a @unit other than "component") is enclosed in <seg> with @met, then instead of the above, display the value of @met converted to prosodic notation, in square brackets
<seg met="+-+"><gap reason="lost" quantity="3" unit="character" /></seg>
displayed as [–⏑–]
@reason in this case
<gap> with @unit="component": always display as [.] regardless of @reason and any other factors
<seg type="component"> (generate error message if it occurs anywhere else?)
the <seg> may or may not have @met, and unlike 9 above, the presence of @met shall not affect the display of this <gap>
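The @met rule in this recap (a <gap> enclosed in a <seg> with @met is shown as prosodic notation in square brackets) amounts to a character-by-character mapping. A minimal Python sketch follows; the function name is hypothetical, and only the two @met values documented in this thread (+ for long, - for short) are handled, since the encoding of an anceps in @met is not spelled out here.

```python
# Mapping of @met characters to prosodic signs, as given in this thread:
# "+" is displayed as – (long) and "-" as ⏑ (short).
PROSODY = {"+": "–", "-": "⏑"}


def met_to_prosody(met):
    """Convert a @met value such as '+-+' to bracketed prosodic notation.

    Hypothetical helper; other @met characters are rejected rather than guessed.
    """
    signs = []
    for char in met:
        if char not in PROSODY:
            raise ValueError(f"unsupported @met character: {char!r}")
        signs.append(PROSODY[char])
    return "[" + "".join(signs) + "]"


print(met_to_prosody("+-+"))
print(met_to_prosody("+++-++"))
```

Rejecting unknown characters, instead of passing them through, makes encoding slips in @met visible during proofreading.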
I'm following up with another comment on the fusing of brackets, which is still a tangle.
So: fusing sets of brackets - we need to think about this. Initially we had consensus. Arlo (27 March) said "yes, definitely, they need to be collapsed into a single set of brackets." Manu (27 March) said "I am fine" [with this] and Annette (28 March) said "I also agree to the points summarised by Manu". Then it seems I stirred up the soup by claiming (21 May) that "Arlo mentioned that he would not want to include supplied text in the same set of brackets". Frankly, I see no such statement by Arlo in this thread; he may have said that over Skype to me, or it may be a figment of my imagination, for which I apologise. In any case, hearing this, Manu (3 June) retracted his earlier opinion and added that [...] (for a gap of unknown length) should never be fused, then slightly later added that [.] (for sub-akṣara sized gaps) should also not be fused to anything else. So it seems that the idea of not fusing supplied to illegible may have been instigated by me, but the question of whether we want to fuse [...] and [.] to anything else is still a question. Thus, to illustrate with a hypothetical case ad absurdum:
śā<supplied reason="illegible">rdū</supplied><supplied reason="lost">l</supplied><seg type="component" subtype="vowel"><gap reason="lost" quantity="1" unit="component"/></seg><gap reason="illegible" quantity="1" unit="character"/><gap reason="lost" quantity="2" unit="character" precision="low"/><gap reason="illegible" quantity="3" unit="character"/><gap reason="lost" extent="unknown" unit="character"/><supplied reason="lost">brā</supplied>hmaṇasya
What display do we prefer?
@reasons of <supplied> ("rdūl") but nothing else
@reasons of <supplied> and all kinds of <gap>, but not these two to each other
My preference definitely seems to be for 4, i.e. to fuse everything (provided that we don't make a distinction in the display of <supplied> depending on @reason), except that the notation with "ca." looks bad when displayed like this. So how about reverting, after all, to my earlier suggestion of [?5+] instead of [ca. 5+] (and likewise for illegible and undefined)? That would give us the display śā[rdūl. 1× ?2+ 3× ... brā]hmaṇasya, which I think is as clear as we can get. Notice that in the fused display I'm using spaces in place of the brackets, which should definitely be done to keep the items separate. However, if we do fuse [.] with other stuff in brackets, then this should not be spaced when it is next to a <supplied> element, only when it is next to <gap> (to avoid the display śā[rdūl . 1× ?2+ 3× ... brā]hmaṇasya).
If you are dead set against the ?, then perhaps the "ca." could be shown without spacing to keep the items together: śā[rdūl. 1× ca.2+ 3× ... brā], but this doesn't look very good to me.
And of course, the entire bracket fusion issue rests on whether Axelle can do the wizardry required to implement it. It may be best, for the time being, to forget about fusion altogether and just stick to śā[rdū][l][.][1×][ca. 2+][3×][...][brā]hmaṇasya in our transformations, and to come back to it when we come to website display.
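To make the two candidate displays concrete, here is a minimal Python sketch of the rules as stated in this thread. It is not the project's actual transformation: the function names (`gap_label`, `items`, `render`) and the wrapper element `<ab>` are assumptions made for illustration, and the fragment is parsed without the TEI namespace. It uses + for lost, × for illegible and * for undefined, "ca." in the unfused display and "?" in the fused one.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample from the discussion above (no TEI namespace assumed).
SAMPLE = (
    'śā<supplied reason="illegible">rdū</supplied><supplied reason="lost">l</supplied>'
    '<seg type="component" subtype="vowel"><gap reason="lost" quantity="1" unit="component"/></seg>'
    '<gap reason="illegible" quantity="1" unit="character"/>'
    '<gap reason="lost" quantity="2" unit="character" precision="low"/>'
    '<gap reason="illegible" quantity="3" unit="character"/>'
    '<gap reason="lost" extent="unknown" unit="character"/>'
    '<supplied reason="lost">brā</supplied>hmaṇasya'
)

SIGN = {"lost": "+", "illegible": "×", "undefined": "*"}


def gap_label(gap, low):
    """Return the bracket content for one <gap>; `low` marks precision="low"."""
    if gap.get("unit") == "component":
        return "."          # sub-akṣara gap, regardless of @reason
    if gap.get("extent") == "unknown":
        return "..."        # gap of unknown length
    prefix = low if gap.get("precision") == "low" else ""
    return f"{prefix}{gap.get('quantity')}{SIGN[gap.get('reason')]}"


def items(root, low):
    """List (kind, text) pairs; kind is 'text', 'supplied', 'dot' or 'gap'."""
    out = []
    if root.text:
        out.append(("text", root.text))
    for child in root:
        if child.tag == "supplied":
            out.append(("supplied", child.text or ""))
        else:
            # Either a bare <gap> or a <seg> wrapping one.
            gap = child if child.tag == "gap" else child.find("gap")
            label = gap_label(gap, low)
            out.append(("dot" if label == "." else "gap", label))
        if child.tail:
            out.append(("text", child.tail))
    return out


def render(xml, fuse):
    """Render a fragment with one bracket pair per item, or with fused brackets."""
    root = ET.fromstring(f"<ab>{xml}</ab>")
    out, prev = [], "text"
    for kind, text in items(root, "?" if fuse else "ca. "):
        if kind == "text":
            out.append(("]" if prev != "text" else "") + text)
        elif prev == "text":
            out.append("[" + text)
        elif not fuse:
            out.append("][" + text)
        elif "supplied" in (prev, kind) and {prev, kind} <= {"supplied", "dot"}:
            out.append(text)        # supplied next to supplied or to [.]: no space
        else:
            out.append(" " + text)  # a space stands in for the dropped brackets
        prev = kind
    return "".join(out) + ("]" if prev != "text" else "")


print(render(SAMPLE, fuse=False))  # śā[rdū][l][.][1×][ca. 2+][3×][...][brā]hmaṇasya
print(render(SAMPLE, fuse=True))   # śā[rdūl. 1× ?2+ 3× ... brā]hmaṇasya
```

The only subtle part is the spacing rule: adjacent items are run together when both are restored text, or when one is restored text and the other is [.], and are separated by a space in every other case.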
Thanks Dan. I am in agreement with all your proposals in these last two messages, including preference for number 4 (fusing all brackets) and reverting to "?" to reflect @precision="low".
Thanks, that is great. I understand Manu is on holiday; can we expect him to give a final OK nonetheless or should we assume that he'll accept whatever we agree on? @AnneSchmiedchen - please confirm if you agree. I'll be happy to write one more recap to include all these things, but I'd like to wait until we have consensus before I do that.
I agree. And many thanks for all this.
I agree (provisionally ;-) with Dan's recap of <gap>. (I am not in favour of the distinction ×, +, * and prefer to let the user check the XML rather than to obscure the edition; I might put this on the table again later.)
As for the fusion of brackets, I also agree provisionally. Thus option 4, śā[rdūl. 1× ca. 2+ 3× ... brā]hmaṇasya, and using ? instead of ca., is OK for me for the moment. Again, I might put this on the table again later.
So, Axelle, could you, please, implement the final recap that Dan will prepare?
Just one more thing. Given that we seem to be going for simplifying the display in other areas as well (e.g. space), I'm perfectly OK with using just a + sign instead of × and * for gaps of all reasons. We've already discarded distinction by reason in supplied text, so why not? If @arlogriffiths and @AnneSchmiedchen agree to that, then Manu will not need to put this on the table later and we can just go ahead with the way he prefers.
Oops, slight problem with this. What do we do if two different kinds of gap are next to each other, as in my example above? For things like <gap reason="illegible" quantity="1" unit="character"/><gap reason="lost" quantity="2" unit="character" precision="low"/><gap reason="illegible" quantity="3" unit="character"/> do we want to display [1+ ?2+ 3+ ...]? If not, then what, given that one of the items is imprecise but the other two are not? But even then, if we don't have the imprecision, as in <gap reason="illegible" quantity="1" unit="character"/><gap reason="lost" quantity="2" unit="character"/><gap reason="illegible" quantity="3" unit="character"/>, we would then probably want [6+] instead of [1+ 2+ 3+]. Perhaps best to stick to what I have above and indeed, put this on the table later if at all.
I agree with Manu's (repeated) intervention to use just a + sign for gaps of all reasons. And I am still for the fusion of brackets and for the use of a ? sign instead of "ca." If we do not have any imprecision, I would also prefer [6+] instead of [1+ 2+ 3+].
For now, the above has been applied.
Please note that I had to delete any HTML elements used to structure the gap in order to allow the fusion (remember that it will work with <supplied> only with @reason="lost" and @reason="subaudible". As with the fusion of <unclear>, the grantha rendering will mostly mess it up).
I also had to delete the provision made by EpiDoc that if two <gap> elements with the same @reason follow each other, only the first one is displayed.
I will await your final decisions on the other matters under discussion.
To sum up the issues that remain, before I write a final recap:
Do we keep the distinction by @reason in gaps of known/estimated size or not? Manu and Annette are for not doing so, I'm OK with it if we can solve the issue of merging display of side-by-side gaps (e.g. 3 characters lost and 3 characters illegible should not display as [3 3] but as [6]); @arlogriffiths has not yet expressed an opinion.
Depending on the outcome of that,
A. if we keep the distinction by @reason, then fusion can be implemented between all things displayed in square brackets with a little trickery concerning spaces, which I'll summarise again if we come to this decision. Everybody has said they prefer full fusion (i.e. śā[rdūl. 1× ca. 2+ 3× ... brā]hmaṇasya), so we have consensus here IF the distinction by reason is kept.
B. if we discard distinction by @reason, we need to know if it is technically possible to add the @quantity numbers of successive gaps for display, and we need to come to a solution for cases where some of those successive gaps have @precision="low": do we then add the quantities and put a ? before the sum, or do we keep the numbers separate, with ? only in front of the imprecise ones?
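The two alternatives under B can be compared with a small Python sketch. It is only an illustration of the arithmetic, not a claim about how the transformation is or will be written: the function name `merged_gap_label` and the `sum_imprecise` switch are hypothetical, and each gap is reduced to a pair of its @quantity and whether it has @precision="low".

```python
def merged_gap_label(gaps, sum_imprecise):
    """Label for a run of adjacent gaps of known or estimated size.

    gaps: list of (quantity, precision_low) pairs, in document order.
    sum_imprecise: if True, always add the quantities and put ? before the
    sum when any gap is imprecise; if False, keep the numbers separate
    whenever at least one of them is imprecise.
    """
    any_low = any(low for _, low in gaps)
    if sum_imprecise or not any_low:
        total = sum(quantity for quantity, _ in gaps)
        return f"[{'?' if any_low else ''}{total}+]"
    parts = [f"{'?' if low else ''}{quantity}+" for quantity, low in gaps]
    return "[" + " ".join(parts) + "]"


precise = [(1, False), (2, False), (3, False)]
mixed = [(1, False), (2, True), (3, False)]

print(merged_gap_label(precise, sum_imprecise=True))   # [6+]
print(merged_gap_label(mixed, sum_imprecise=True))     # [?6+]
print(merged_gap_label(mixed, sum_imprecise=False))    # [1+ ?2+ 3+]
```

Either way, a run with no imprecise member collapses to a single sum, so the choice only matters when at least one gap has @precision="low".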
And one more issue that @ajaniak has just reminded me of by the above post. None of the above details mention <supplied reason="subaudible">, which we now use for two things: editorial avagrahas and editorial punctuation. I think it would be best to display these differently from the restoration of lacunae, so I suggest that they should not be in square brackets. Instead, they could be displayed in the same way as <supplied reason="omitted">, and perhaps also merged into the same set of brackets as any adjacent restored omissions.
Finally, a note to @ajaniak : none of our files should have two <gap> elements with the same @reason after each other. If in such a case the standard EpiDoc transformation silently displays only the first one, it's OK if you override that, but it would be best if an error message could also be generated in such a case. The only exception I can imagine is if one of these contains <certainty>, in a hypothetical situation where the rest of an inscription is broken off before the end of a stanza. In that case you would know that the inscription quite certainly contained as many characters as needed to finish the stanza, but you may be unsure whether or not it contained any other text after that, so you would mark up a lacuna of known size as lost, followed by a lacuna of unknown size as possibly lost.
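The check requested here could look roughly like the following Python sketch. It is an assumption-laden illustration, not the existing pipeline: the function name `adjacent_same_reason_gaps` and the `<ab>` wrapper are hypothetical, the fragment is parsed without the TEI namespace, and "adjacent" is taken to mean sibling <gap> elements with nothing but whitespace between them.

```python
import xml.etree.ElementTree as ET


def adjacent_same_reason_gaps(xml):
    """Return the @reason of every pair of adjacent <gap> siblings that
    share a @reason, skipping pairs where either gap contains <certainty>."""
    root = ET.fromstring(f"<ab>{xml}</ab>")
    problems = []
    for parent in root.iter():
        children = list(parent)
        for first, second in zip(children, children[1:]):
            if first.tag != "gap" or second.tag != "gap":
                continue
            if (first.tail or "").strip():
                continue  # text stands between the two gaps
            if first.get("reason") != second.get("reason"):
                continue
            if first.find("certainty") is not None or second.find("certainty") is not None:
                continue  # the one exception described above
            problems.append(first.get("reason"))
    return problems


for reason in adjacent_same_reason_gaps(
    '<gap reason="lost" quantity="4" unit="character"/>'
    '<gap reason="lost" extent="unknown" unit="character"/>'
):
    print(f'error: two adjacent <gap> elements with reason="{reason}"')
```

A real implementation would more likely live in the XSLT or in a Schematron rule; the sketch only pins down which pairs count as errors.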
Dear all,
Could you tell if you have already made decisions regarding ?
Thanks, Best