ReadAlongs / Studio-Web

Suite of web packages for creating interactive ReadAlongs
https://readalong-studio.mothertongues.org/

Cannot download Elan, Praat, SRT or WebVTT from Editor #347

Closed joanise closed 1 month ago

joanise commented 1 month ago

If you create a read-along in the Studio, all six download formats work fine.

If you then reload any read-along in the Editor, html and zip downloads work, but the other four formats give a 422 with this error:

{
    "detail": "ReadAlong provided is not valid: <string>:1:0:ERROR:VALID:DTD_UNKNOWN_ATTRIBUTE: No declaration for attribute xmlns of element read-along"
}

as seen by going to F12 / Network / <pick the download event with the 422> / Response
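
For anyone debugging this from the browser, here is a minimal sketch of pulling the position and error type out of that `detail` string. The `ValidationError` type and `parseDetail` function are hypothetical names, and the `<string>:line:column:LEVEL:DOMAIN:TYPE: message` layout is only inferred from the errors quoted in this thread, not taken from the API's documentation.

```typescript
// Hypothetical type and helper; the field layout is inferred from the
// error strings quoted in this issue, not from any documented contract.
interface ValidationError {
  line: number;
  column: number;
  level: string;
  domain: string;
  type: string;
  message: string;
}

function parseDetail(detail: string): ValidationError | null {
  // Example input:
  // "...: <string>:1:0:ERROR:VALID:DTD_UNKNOWN_ATTRIBUTE: No declaration for ..."
  const m = detail.match(
    /<string>:(\d+):(\d+):([A-Z_]+):([A-Z_]+):([A-Z_]+): (.*)$/
  );
  if (!m) return null;
  return {
    line: Number(m[1]),
    column: Number(m[2]),
    level: m[3],
    domain: m[4],
    type: m[5],
    message: m[6],
  };
}

// The 422 response body reported above.
const errorBody = {
  detail:
    "ReadAlong provided is not valid: <string>:1:0:ERROR:VALID:DTD_UNKNOWN_ATTRIBUTE: No declaration for attribute xmlns of element read-along",
};
const parsedError = parseDetail(errorBody.detail);
```

Here `parsedError.type` identifies the kind of DTD violation and `parsedError.message` names the offending attribute and element.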

joanise commented 1 month ago

Possibly relevant, the payload that Studio sends to the /convert_alignment/eaf endpoint looks like this:

<read-along version="1.2">

    <meta name="generator" content="@readalongs/studio (cli) 1.1.0" id="m01"/><meta name="generator" content="@readalongs/studio-web 1.5.0"/><text xml:lang="und" fallback-langs="und" id="t0">
        <body id="t0b0">
            <div type="page" id="t0b0d0">
                <p id="t0b0d0p0">
                    <s id="t0b0d0p0s0"><w id="t0b0d0p0s0w0" ARPABET="AA S D F" time="0" dur="0.88">asdf</w></s>
                </p>
            </div>
        </body>
    </text>
</read-along>

but the payload the editor sends has a few differences:

<read-along xmlns="http://www.w3.org/1999/xhtml" version="1.2">

    <meta name="generator" content="@readalongs/studio (cli) 1.1.0" id="m01"><body id="t0b0"></body></meta><meta name="generator" content="@readalongs/studio-web 1.5.0" /><text xml:lang="und" fallback-langs="und" id="t0">

            <div type="page" id="t0b0d0">
                <p id="t0b0d0p0">
                    <s id="t0b0d0p0s0"><w id="t0b0d0p0s0w0" arpabet="AA S D F" time="0" dur="0.88">asdf</w></s>
                </p>
            </div>

    </text>
</read-along>

joanise commented 1 month ago

@deltork Maybe you know how to address this?

joanise commented 1 month ago

The actual error message seems to be due to xmlns on the read-along element, which our DTD 1.2 does not allow.

I expect there will also be an error from

<meta name="generator" content="@readalongs/studio (cli) 1.1.0" id="m01"><body id="t0b0"></body></meta>

the body tag is not supposed to be inside the meta tag.
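
The two differences described above can be checked for before a payload is sent. This is only a rough sketch: `findPayloadProblems` is a hypothetical helper, it works on the serialized string with regular expressions rather than a real XML parser, and it is not the validation the server performs. The sample payloads are shortened versions of the ones quoted earlier.

```typescript
// Hypothetical pre-flight check; string-based so it runs without a DOM.
// It only looks for the two problems discussed in this issue.
function findPayloadProblems(xml: string): string[] {
  const problems: string[] = [];
  // The 1.2 DTD declares no xmlns attribute on <read-along>.
  if (/<read-along\b[^>]*\sxmlns\s*=/.test(xml)) {
    problems.push("xmlns attribute on read-along");
  }
  // A <meta ...> that is not self-closed and is directly followed by <body>
  // means the body element ended up inside the meta element.
  if (/<meta\b[^>]*[^\/]>\s*<body\b/.test(xml)) {
    problems.push("body nested inside meta");
  }
  return problems;
}

// Shortened version of the payload Studio sends.
const studioPayload =
  '<read-along version="1.2">' +
  '<meta name="generator" content="@readalongs/studio-web 1.5.0"/>' +
  '<text id="t0"><body id="t0b0"></body></text></read-along>';

// Shortened version of the payload the Editor sends.
const editorPayload =
  '<read-along xmlns="http://www.w3.org/1999/xhtml" version="1.2">' +
  '<meta name="generator" content="@readalongs/studio (cli) 1.1.0" id="m01">' +
  '<body id="t0b0"></body></meta>' +
  '<text id="t0"></text></read-along>';

const studioProblems = findPayloadProblems(studioPayload);
const editorProblems = findPayloadProblems(editorPayload);
```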

deltork commented 1 month ago

This is caused by the `DOMParser.parseFromString` function. The meta tags returned by the studio cli are self-closing. When parsed in the editor, this leads to a weird structure because it attempts to close a tag that is already closed. The fix needs to happen in the Studio code, not here.

deltork commented 1 month ago

issue opened in Studio https://github.com/ReadAlongs/Studio/issues/245

deltork commented 1 month ago

This section of code https://github.com/ReadAlongs/Studio-Web/blob/main/packages/studio-web/src/app/editor/editor.component.ts#L262-L273 injects the body tag into the element. Since adding meta tags, it has been injecting the body tag into the meta tag. I will add constraints to this section.

deltork commented 1 month ago

I need to add an optional xmlns attribute to the read-along DTD.

deltork commented 1 month ago

Now getting this error:

"detail": "ReadAlong provided is not valid: <string>:6:0:ERROR:VALID:DTD_UNKNOWN_ATTRIBUTE: No declaration for attribute arpabet of element w"

This is the data being sent:

        <div type="page" id="t0b0d0">
            <p id="t0b0d0p0">
                <s id="t0b0d0p0s0"><w id="t0b0d0p0s0w0" arpabet="HH EY L L OW" time="0.66" dur="0.41">Hello</w></s>
            </p>
        </div>

</body></text>

Note that the ARPABET attribute is being converted to lowercase because the HTML parser lowercases all attribute names: our ARPABET becomes arpabet, which the DTD does not declare.

I added code to undo the change made by the parser.
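
As an illustration of what undoing the lowercasing can look like, here is a minimal sketch that works on the serialized string. `restoreArpabetCase` is a hypothetical name and this is not the code from the actual fix; it assumes ARPABET is the only attribute affected.

```typescript
// Hypothetical helper, not the code from the actual fix.
// Restores the upper-case ARPABET attribute name after an HTML parser
// has lowercased it.
function restoreArpabetCase(xml: string): string {
  // Match "arpabet" only where it looks like an attribute name:
  // preceded by whitespace and followed by "=".
  // Limitation: text content containing " arpabet=" would also be rewritten.
  return xml.replace(/(\s)arpabet(\s*=)/g, "$1ARPABET$2");
}

// The <w> element from the payload quoted above.
const sentWord =
  '<w id="t0b0d0p0s0w0" arpabet="HH EY L L OW" time="0.66" dur="0.41">Hello</w>';
const repairedWord = restoreArpabetCase(sentWord);
```

Renaming the attribute on the DOM element before serializing would avoid the text-content limitation; the string version is shown only because it is self-contained.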

joanise commented 1 month ago

We should make our DTD case insensitive, accepting each attribute in either upper or lower case. Is that possible?

joanise commented 1 month ago

Reverting main re-broke this. :(