DCLP / dclpxsltbox

Sandbox for development, testing, and review of XSLT for DCLP
http://dclp.github.io/dclpxsltbox/

bad hyphen placement on word broken by line when there is added text at right #151

Open andreabernini opened 9 years ago

andreabernini commented 9 years ago

When the ‘right’ tag occurs in the middle of a word that continues on the following line, the hyphen (-) is displayed in the wrong place.

P.Oxy. 75.5043 (TM 128970) Fr. 5,9-10 https://github.com/DCLP/idp.data/blob/master/DCLP/129/128970.xml

Leiden+:

9. [Φθίαι· Νεοπτόλεμος δ’ ἀ] ||right:.3 vac.2-3 .1[.?]||
10.- [πείρωι διαπρυσίαι,]

Displayed:

9 Φθίαι· Νεοπτόλεμος δ’ ἀ] (added at right:   ̣  ̣  ̣ vac.2-3   ̣[ -ca.?- ])-
10 [πείρωι διαπρυσίαι,]

Expected:

9 Φθίαι· Νεοπτόλεμος δ’ ἀ-] (added at right:   ̣  ̣  ̣ vac.2-3   ̣[ -ca.?- ])
10 [πείρωι διαπρυσίαι,]

The tag ‘left’ seems to be ok: in the same papyrus, Fr. 18 col. ii,11-12 https://github.com/DCLP/idp.data/blob/master/DCLP/129/128970.xml#l302

Leiden+:

11. ἐγγενὲς  ἔ(´)μμ[εν ἀεθλη]
12.- ||left:.1|| τ[αῖς ἀγα]θοῖσι[ν· ἐπεὶ]

Displayed:

11 ἐγγενὲς ἔ(*)μμ[εν ἀεθλη-]
12 (added at left:   ̣) τ[αῖς ἀγα]θοῖσι[ν· ἐπεὶ]

wsalesky commented 7 years ago

@paregorios There is a difference in how the XML is generated for the two examples listed above.

The correctly rendering XML (https://github.com/DCLP/idp.data/blob/master/DCLP/129/128970.xml#l302) has the <add place="left"> after the <lb>. In the problematic file (https://github.com/DCLP/idp.data/blob/master/DCLP/129/128970.xml#L302) the <add place="right"> comes before the <lb>. If you change the location of the <add> element in the second instance, it renders as expected.

We can, of course, adjust the XSLT to handle this use case; I just want to make sure this is an intended difference in the data format, not a bug.
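
The ordering difference described above can be checked mechanically. Here is a minimal sketch in Python (not the project's XSLT). It assumes the fragment is wrapped in a hypothetical <ab> root and carries no TEI namespace. The function name adds_before_word_break is made up for illustration, and the contents of <add> are shortened to a single <gap>:

```python
import xml.etree.ElementTree as ET

# Fr. 5, lines 9-10, simplified: the <add> holds one <gap> only, and the
# <ab> wrapper is an assumption so the fragment parses on its own.
SNIPPET = """<ab>
<lb n="9"/><supplied reason="lost">Φθίαι· Νεοπτόλεμος δ’ ἀ</supplied> <add place="right"><gap reason="illegible" quantity="3" unit="character"/></add>
<lb n="10" break="no"/><supplied reason="lost">πείρωι διαπρυσίαι,</supplied>
</ab>"""


def adds_before_word_break(parent):
    """Return (place, line number) for each <add> that sits directly
    before an <lb break="no"/>, with only whitespace in between."""
    hits = []
    children = list(parent)
    for current, following in zip(children, children[1:]):
        is_add = current.tag == "add"
        is_joining_lb = following.tag == "lb" and following.get("break") == "no"
        only_whitespace_between = not (current.tail or "").strip()
        if is_add and is_joining_lb and only_whitespace_between:
            hits.append((current.get("place"), following.get("n")))
    return hits


print(adds_before_word_break(ET.fromstring(SNIPPET)))
```

For this snippet the function reports the add at right sitting before line 10. An <add place="left"> that follows its <lb> would not be reported. Real files use the TEI namespace, so the tag comparisons would need to be adjusted accordingly.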

rla2118 commented 7 years ago

Sorry, I don't understand this question. @paregorios, can you clarify?

paregorios commented 7 years ago

@wsalesky thanks ... I'll undertake to look more closely at this one.

paregorios commented 7 years ago

So, there's a text encoding issue here. And I'd like opinions from @jcowey, @gabrielbodard, and @hcayless if possible.

Where in a physical line should the markup come for text added at left or right when a word is broken across two lines in the same place, but the addition is marginal and should not be read as part of the main text flow?

text added at right next to a wrapped word

P.Oxy. 75 5043, fragment 5, lines 9-10. The split word is ἀ|πείρωι, and some illegible characters (with whitespace between some of them) have been added to the right of the first line, such that one would expect to see something like the following in HTML output (note hyphen placement after Φθίαι· Νεοπτόλεμος δ’ ἀ):

[Φθίαι· Νεοπτόλεμος δ’ ἀ-] (added at right: ... vac. 2-3 . [ - ca. ? - ])
[πείρωι διαπρυσίαι,]

We've currently got the following XML, which the XSLT doesn't like and which I think is problematic:

<lb n="9"/><supplied reason="lost">Φθίαι· Νεοπτόλεμος δ’ ἀ</supplied> <add place="right"><gap reason="illegible" quantity="3" unit="character"/> <space atLeast="2" atMost="3" unit="character"/> <gap reason="illegible" quantity="1" unit="character"/><gap reason="lost" extent="unknown" unit="character"/></add>
<lb n="10" break="no"/><supplied reason="lost">πείρωι διαπρυσίαι,</supplied>

Here's what the XSLT gives us (note hyphen placement at end of first line), which I think is sorta sensible given the markup, but is not what the editor intends:

[Φθίαι· Νεοπτόλεμος δ’ ἀ] (added at right:   ... vac. 2-3 . [ -ca.?- ])-
[πείρωι διαπρυσίαι,]

If the editor wanted us to read the illegible, added characters at right as being the start of a word that wrapped to the next line, I think the above encoding would be right. But since it's a marginal annotation of some kind, what should we do?
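
To make the two hyphen placements concrete, here is a rough sketch in Python (not XSLT, and not taken from the dclpxsltbox stylesheets). It does not answer the encoding question. It only shows one possible processing-side rule: when a line ends in an <add>, attach the hyphen to the chunk before it. The <ab> wrapper, the function names, the "..." placeholder for the contents of <add>, and the decision to ignore text between elements are all simplifying assumptions:

```python
import xml.etree.ElementTree as ET

# Lines 9-10 from above, with the <add> shortened to one <gap> and a
# hypothetical <ab> root added so the fragment parses on its own.
SNIPPET = (
    '<ab><lb n="9"/><supplied reason="lost">Φθίαι· Νεοπτόλεμος δ’ ἀ</supplied> '
    '<add place="right"><gap reason="illegible" quantity="3" unit="character"/></add>\n'
    '<lb n="10" break="no"/><supplied reason="lost">πείρωι διαπρυσίαι,</supplied></ab>'
)


def render_chunk(el):
    """Render one element as a (kind, text) pair; deliberately simplified."""
    if el.tag == "supplied":
        return ("supplied", el.text or "")
    if el.tag == "add":
        # Placeholder for the real rendering of the gaps inside <add>.
        return ("add", "(added at %s: ...)" % el.get("place"))
    return ("text", el.text or "")


def render_lines(parent, hyphen_before_trailing_add):
    lines = []  # list of (line number, list of (kind, text) chunks)
    for el in parent:
        if el.tag == "lb":
            if el.get("break") == "no" and lines and lines[-1][1]:
                chunks = lines[-1][1]
                target = len(chunks) - 1
                # Optionally skip back over an addition at the line end.
                if hyphen_before_trailing_add and target > 0 and chunks[target][0] == "add":
                    target -= 1
                kind, text = chunks[target]
                chunks[target] = (kind, text + "-")
            lines.append((el.get("n"), []))
        elif lines:
            lines[-1][1].append(render_chunk(el))
    out = []
    for n, chunks in lines:
        parts = ["[%s]" % t if k == "supplied" else t for k, t in chunks]
        out.append("%s %s" % (n, " ".join(parts)))
    return out


root = ET.fromstring(SNIPPET)
print("\n".join(render_lines(root, hyphen_before_trailing_add=False)))
print("\n".join(render_lines(root, hyphen_before_trailing_add=True)))
```

With the flag off, the hyphen lands after the closing parenthesis of the addition, as in the current output. With the flag on, it lands inside the bracket after ἀ, as the editor expects. Such a rule would also move the hyphen when the added characters really are the start of the wrapped word, which is why the encoding question still matters.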

paregorios commented 7 years ago

I note that the EpiDoc Guidelines don't consider this sort of edge case: http://www.stoa.org/epidoc/gl/latest/trans-addition.html

paregorios commented 7 years ago

So, I think this might be an edge case of a larger issue about marginalia: #121

paregorios commented 7 years ago

Blocked for resolution of #121