faustedition / faust-xml

XML and other source data of the Faustedition

sp without actual speech act → stage #607

Open thvitt opened 6 years ago

thvitt commented 6 years ago

In a situation like this:

<sp>
    <speaker rend="centered bold">Manto.</speaker>
    <l>Den lieb’ ich, der Unmögliches begehrt.</l>
</sp>
<sp>
    <speaker rend="centered bold">Chiron</speaker>
</sp>
<stage rend="centered small">(iſt ſchon weit weg).</stage>

validation complains about the <sp> that contains only <speaker>Chiron</speaker>:

14382:34 element "sp" incomplete

expected element "ab", "addSpan", "alt", "altGrp", "anchor", "app", "camera", "caption", "cb", "certainty", "cit", "damageSpan", "delSpan", "figure", "floatingText", "fw", "gap", "gb", "index", "interp", "interpGrp", "join", "joinGrp", "l", "lb", "lg", "link", "linkGrp", "list", "listApp", "listEvent", "listNym", "listOrg", "listPerson", "listPlace", "listRelation", "listTranspose", "listWit", "metamark", "milestone", "move", "notatedMusic", "note", "p", "pb", "precision", "q", "quote", "respons", "said", "sound", "space", "span", "spanGrp", "stage", "substJoin", "table", "tech", "timeline", "view" or "witDetail"

This should instead be converted to this:

<sp>
    <speaker rend="centered bold">Manto.</speaker>
    <l>Den lieb’ ich, der Unmögliches begehrt.</l>
</sp>
<stage rend="centered bold"><hi>Chiron</hi></stage>
<stage rend="centered small">(iſt ſchon weit weg).</stage>

i.e., an <sp> that does not contain actual speech should become a <stage>, with the <speaker> inside transformed to <hi>.
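The rule above could be sketched roughly as follows. This is only an illustration, not the project's actual transformation: it uses Python's standard `xml.etree.ElementTree`, ignores the TEI namespace (real files would need namespace-qualified tag names), and assumes that "no actual speech" means "the <sp> has only <speaker> and <stage> children". The names `NON_SPEECH`, `speaker_to_stage` and `unwrap_speechless_sp` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumption: an <sp> whose children are all of these kinds contains no speech.
NON_SPEECH = {"speaker", "stage"}


def speaker_to_stage(speaker):
    """Build <stage ...><hi>name</hi></stage>, copying the speaker's attributes as-is."""
    stage = ET.Element("stage", dict(speaker.attrib))
    hi = ET.SubElement(stage, "hi")
    hi.text = speaker.text
    hi.extend(list(speaker))  # keep any markup nested inside <speaker>
    return stage


def unwrap_speechless_sp(root):
    """Replace each speechless <sp> by its children, turning <speaker> into <stage><hi>."""
    for parent in list(root.iter()):
        for sp in [child for child in parent if child.tag == "sp"]:
            children = list(sp)
            if not children or any(c.tag not in NON_SPEECH for c in children):
                continue  # real speech, or an empty <sp>: leave untouched
            replacements = [
                speaker_to_stage(c) if c.tag == "speaker" else c for c in children
            ]
            position = list(parent).index(sp)
            parent.remove(sp)
            for offset, element in enumerate(replacements):
                element.tail = sp.tail  # reuse the whitespace that followed the <sp>
                parent.insert(position + offset, element)
    return root
```

Attributes such as @rend and @n are copied verbatim. The sketch does not try to infer anything that is not already in the source.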

Am I right, @gerritbruening?

thvitt commented 6 years ago

Is this transformation correct? Old:

</sp>
<sp>
    <speaker n="after_4756" rend="centered letter-spaced">Mephiſtopheles</speaker>
    <stage n="after_4756" rend="centered small">(ſteigt hinauf und ſtellt ſich zur
    Linken).</stage>
</sp>
<sp>

New:

</sp>

       <stage n="after_4756" rend="centered letter-spaced"><hi>Mephiſtopheles</hi></stage>

   <stage n="after_4756" rend="centered small">(ſteigt hinauf und ſtellt ſich zur
           Linken).</stage>
<sp>

gerritbruening commented 6 years ago

Is that Q or something? The desired output would look like this:

<stage n="after_4756" rend="centered small">
    <hi rend="normal letter-spaced">Mephiſtopheles</hi>
    <lb/>(ſteigt hinauf und ſtellt ſich zur Linken).</stage>

(I know ...)

thvitt commented 6 years ago

Where’s the <stage> (or <stages>)? How on earth to infer <lb/> or the different values for @rend? There is not enough information to get to that desired encoding. Please keep in mind that the transformation is intended to run on all source files, and it is producing the new source files, so I would like to avoid overly complex (and thus error-prone) transformations …

gerritbruening commented 6 years ago

Where’s the <stage> (or <stages>)?

Sorry, see edited snippet above.

How on earth to infer <lb/> or the different values for @rend? There is not enough information to get to that desired encoding.

(I know ..., and I understand if you prefer not to make these changes automatically.) Again, where do you find these cases? Only in C.41 and Q?

thvitt commented 6 years ago

When I remove the sp-without-l-or-p handling altogether, I get incomplete <sp> elements here:

element "sp" incomplete

expected element "ab", "addSpan", "alt", "altGrp", "anchor", "app", "camera", "caption", "cb", "certainty", "cit", "damageSpan", "delSpan", "figure", "floatingText", "fw", "gap", "gb", "index", "interp", "interpGrp", "join", "joinGrp", "l", "lb", "lg", "link", "linkGrp", "list", "listApp", "listEvent", "listNym", "listOrg", "listPerson", "listPlace", "listRelation", "listTranspose", "listWit", "metamark", "milestone", "move", "notatedMusic", "note", "p", "pb", "precision", "q", "quote", "respons", "said", "sound", "space", "span", "spanGrp", "stage", "substJoin", "table", "tech", "timeline", "view" or "witDetail"

    print/J_XIIA149-1833.xml: 1×: 1133:38
    print/Q_IIIB31-3.xml: 52×: 920:34 1665:34 3027:34 3336:34 3916:34 4612:34 4712:34 4805:34 5303:34 5375:34 6101:34 6166:34 6277:34 6777:34 8720:34 9491:34 10196:34 10205:34 10354:34 11015:34 12064:34 12098:34 12336:34 12475:34 13761:34 13888:34 13953:34 14382:34 15679:34 15828:34 15916:34 16086:34 16505:34 16995:34 17439:34 18341:34 19076:34 19135:30 19139:30 19143:30 19147:30 19760:34 19765:34 20512:34 20516:34 20644:34 20718:34 20955:34 21027:34 21258:34 22243:34 22542:34
    print/C(3)41_IIIB27.xml: 35×: 372:14 1101:14 1108:14 1254:14 1920:14 2994:13 3024:13 3268:13 3412:13 4738:13 4868:13 5373:13 6675:14 6912:13 7088:13 7502:13 7995:13 8436:13 9340:13 10066:13 10123:13 10127:13 10131:13 10135:13 10775:14 10781:14 11347:14 11546:13 11551:13 11676:13 11752:13 11992:13 12304:13 13473:13 13625:13
    transcript/bb_cologny/G-30_08/G-30_08.xml: 1×: 375:26
    transcript/location_unknown/cohen_catalog97-99/cohen_catalog97-99.xml: 1×: 332:18
    transcript/fdh_frankfurt/Hs-29527/Hs-29527.xml: 1×: 385:30
    transcript/gsa/GSA_27-26/GSA_27-26.xml: 1×: 310:18
    transcript/gsa/391098/391098.xml: 1×: 5393:27
    transcript/gsa/390507/390507.xml: 1×: 448:22
    transcript/gsa/391437/391437.xml: 1×: 533:18
    transcript/gsa/390437/390437.xml: 1×: 390:18
    transcript/gsa/391247/391247.xml: 2×: 494:30 3105:34
    transcript/gsa/391087/391087.xml: 3×: 456:18 1425:18 2057:18
    transcript/gsa/390275/390275.xml: 1×: 461:18
    transcript/gsa/390438/390438.xml: 1×: 437:22
    transcript/gsa/390395/390395.xml: 1×: 519:22
    transcript/gsa/391362/391362.xml: 1×: 494:18

expected element "ab", "addSpan", "alt", "altGrp", "anchor", "app", "camera", "caption", "cb", "certainty", "cit", "damageSpan", "delSpan", "figure", "floatingText", "fw", "gap", "gb", "index", "interp", "interpGrp", "join", "joinGrp", "l", "lb", "lg", "link", "linkGrp", "list", "listApp", "listEvent", "listNym", "listOrg", "listPerson", "listPlace", "listRelation", "listTranspose", "listWit", "metamark", "milestone", "move", "notatedMusic", "note", "p", "pb", "precision", "q", "quote", "respons", "said", "sound", "space", "span", "spanGrp", "speaker", "stage", "substJoin", "table", "tech", "timeline", "view" or "witDetail"

    transcript/gsa/391473/391473.xml: 2×: 2176:26 3081:26
    transcript/gsa/390825/390825.xml: 1×: 1772:26

gerritbruening commented 6 years ago

When I remove the sp-without-l-or-p handling altogether, I get incomplete <sp> elements here: ...

To be fixed manually, I guess?
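For manual fixing, the summary lines of the report above (`path: N×: line:col ...`) could be turned into a per-file worklist. The following is a minimal sketch that assumes exactly the line format shown in this thread. `REPORT_LINE` and `parse_report` are hypothetical names, not part of any existing tooling.

```python
import re

# Matches e.g. "print/J_XIIA149-1833.xml: 1×: 1133:38".
# "\u00d7" is the multiplication sign (×) used in the report.
REPORT_LINE = re.compile(
    r"^\s*(?P<path>\S.*?):\s*(?P<count>\d+)\u00d7:\s*(?P<positions>.+)$"
)


def parse_report(text):
    """Map each reported file to its list of (line, column) positions."""
    worklist = {}
    for line in text.splitlines():
        match = REPORT_LINE.match(line)
        if match is None:
            continue  # headings and "expected element ..." lines are skipped
        positions = [
            tuple(int(number) for number in position.split(":"))
            for position in match["positions"].split()
        ]
        if len(positions) != int(match["count"]):
            raise ValueError("count mismatch in line: " + line.strip())
        worklist.setdefault(match["path"], []).extend(positions)
    return worklist
```

Sorting the resulting dictionary by the number of positions per file would show at a glance that most cases are concentrated in the two prints Q and C(3)41, while the transcripts have only a few each.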

thvitt commented 6 years ago

Uh, what do I need to do here?

gerritbruening commented 6 years ago

Nothing, unless you do feel like, and have time for, the things mentioned in https://github.com/faustedition/faust-xml/issues/607#issuecomment-420828851.