Open yellwork opened 7 years ago
Whereas
<lb n="161744"/>admiration of Rossini's <emph>Stabat Mater</emph>, a work simply abounding in
is just an instance of <title> and not also <foreign xml:lang="la">?
I like the embedded <quote> and <foreign> above, with "All'erta". I agree with you about "Stabat Mater," too--I think just <title> is fine with that one.
Is it worth indicating that the bare words of such titles as encountered in the episode are also in Latin &c.? I’m looking through ‘Proteus’ here. On an earlier pass for <emph> disambiguation, you rendered a few likely instances of <title> as <foreign>:
<lb n="030167"/> […] But he must send me <foreign xml:lang="fr">La Vie de Jésus</foreign> by M. Léo Taxil.
<lb n="030196"/>[…] Rich booty you brought back; <foreign xml:lang="fr">Le
<lb n="030197"/>Tutu</foreign>, five tattered numbers of <foreign xml:lang="fr">Pantalon Blanc et Culotte Rouge</foreign>;
Gotcha moment aside (!), this is valuable information that we don’t want clipped in the shift to <title>. How about something like the following?
<lb n="030167"/> […] But he must send me <title type="book" xml:lang="fr">La Vie de Jésus</title> by M. Léo Taxil.
I note, in passing, that there’s also a case to be made for marking up the remainder of the sentence as by <foreign xml:lang="fr" rend="none">M.</foreign> Léo Taxil. (Drawing on our discussion in #2.)
I like that syntax of embedding the language in the tag. Let's do it.
And thanks for catching those mistakes! I've just corrected them, using your suggested syntax.
I was just finishing the <said> tagging for “Lestrygonians” when I spotted something in the earlier encoding that gave me pause:
<p><lb n="081039"/>He hummed, prolonging in solemn echo the closes of the bars:
<lb n="081040"/><said who="Leopold Bloom">―<foreign xml:lang="it">Don Giovanni, a cenar teco
<lb n="081041"/>M'invitasti.</foreign></said></p>
[...]
<lb n="081051"/><said who="Leopold Bloom">―<foreign xml:lang="it">A cenar teco.</foreign></said></p>
<p><lb n="081052"/>What does that <foreign xml:lang="it">teco</foreign> mean? Tonight perhaps.
<lb n="081053"/><said who="Leopold Bloom">―<emph>Don Giovanni, thou hast me invited
<lb n="081054"/>To come to supper tonight,
<lb n="081055"/>The rum the rumdum.</emph></said></p>
<p><lb n="081056"/>Doesn't go properly.</p>
Really, these instances of <foreign> should all be <quote xml:lang="it">, shouldn’t they? I proposed a double encoding – <quote><foreign> – at the head of this issue, but I’m starting to think <quote xml:lang="it"> (like <title xml:lang="fr"> above) would be neater. What’s anyone else’s sense? This would probably require us to rework a lot of the Latin in the book, <foreign xml:lang="la">, as quotation too: <quote xml:lang="la">. See the first line of dialogue, for example. For:
<lb n="010005"/><said who="Buck Mulligan">―<foreign xml:lang="la">Introibo ad altare Dei.</foreign></said></p>
read
<lb n="010005"/><said who="Buck Mulligan">―<quote xml:lang="la">Introibo ad altare Dei.</quote></said></p>
I’m happy to make these changes, but I wanted to run the proposal by the group first. I’m sure if we make our encoding decisions clear in the README, tools like your foreign-language analysis can be tailored to catch non-English quotations, right, Jonathan?
This sounds great. I think <quote> isn't rendered as italicized by default, though, so if we merge contiguous <quote> and <foreign>, we should probably add @rend, like <quote xml:lang="la" rend="italics">, to preserve the rendering as italicized.
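For anyone who wants to try the merge mechanically before reviewing by hand, here is a rough sketch of one possible pass. It assumes a simple heuristic that is not an agreed rule: a <foreign> carrying @xml:lang that sits directly inside <said> is treated as a quotation, retagged as <quote>, and given rend="italics". The function name `foreign_to_quote` and the sample fragment (no TEI namespace) are hypothetical, for illustration only.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical fragment modelled on the "Lestrygonians" lines quoted above.
SAMPLE = """<div>
<p><said who="Leopold Bloom">―<foreign xml:lang="it">A cenar teco.</foreign></said></p>
<p>What does that <foreign xml:lang="it">teco</foreign> mean? Tonight perhaps.</p>
</div>"""


def foreign_to_quote(xml_text):
    """Retag <foreign xml:lang> directly under <said> as <quote rend="italics">.

    Assumption: only direct children of <said> count as quotations;
    <foreign> elsewhere (e.g. in narration) is left untouched.
    """
    root = ET.fromstring(xml_text)
    for said in root.iter("said"):
        for child in said:
            if child.tag == "foreign" and child.get(XML_LANG):
                child.tag = "quote"            # keep xml:lang, change the element
                child.set("rend", "italics")   # preserve the italic rendering
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(foreign_to_quote(SAMPLE))
```

The narrated <foreign xml:lang="it">teco</foreign> stays as it is, which is the distinction under discussion. Real files would also carry the TEI namespace, so the tag names would need qualifying, and every change would still want an editorial look.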
And yep, this won't make too much of a difference in analyses, since we can just look for @xml:lang instead of <foreign>.
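As a minimal sketch of what "look for @xml:lang" could mean in practice, the snippet below walks a fragment and collects every element that carries a non-English @xml:lang, whatever its tag. The helper name `non_english_spans` and the sample fragment are assumptions made for illustration; this is not the project's actual analysis code, and the sample omits the TEI namespace that real files would have.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical fragment mixing the three encodings discussed in this thread.
SAMPLE = """<p>
<said who="Buck Mulligan">―<quote xml:lang="la" rend="italics">Introibo ad altare Dei.</quote></said>
But he must send me <title type="book" xml:lang="fr">La Vie de Jésus</title> by M. Léo Taxil.
What does that <foreign xml:lang="it">teco</foreign> mean?
</p>"""


def non_english_spans(xml_text):
    """Return (tag, language, text) for each element with a non-English xml:lang."""
    root = ET.fromstring(xml_text)
    spans = []
    for el in root.iter():  # document order, any tag name
        lang = el.get(XML_LANG)
        if lang is not None and lang != "en":
            # Join nested text and collapse whitespace left by line breaks.
            text = " ".join("".join(el.itertext()).split())
            spans.append((el.tag, lang, text))
    return spans


if __name__ == "__main__":
    for tag, lang, text in non_english_spans(SAMPLE):
        print(tag, lang, text)
```

Because the lookup keys on the attribute and not the element, <quote>, <title>, and <foreign> are all caught by the same pass.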
Here is an interesting example of typographic distinction opening up into multiple possibilities for tagging:
@JonathanReeve switched the inherited <emph> tagging for a <foreign xml:lang="it">. But the italics also render a quotation (not that every quotation is so distinguished!). Gifford has: Is this then
Are there other examples in this vein?