metafacture / metafacture-core

Core package of the Metafacture tool suite for metadata processing.
https://metafacture.org
Apache License 2.0

Namespace-prefixes of elements and attributes in XML input files are deleted #377

Closed TobiasNx closed 3 years ago

TobiasNx commented 3 years ago

When decoding XML, it seems that the namespace prefixes in element and attribute names are deleted:

            <mods:mods ID="duepublico_mods_00074526">
                <mods:name type="personal" xlink:type="simple">
                    <mods:displayForm>Armbruster, André</mods:displayForm>
                    <mods:role>
                        <mods:roleTerm authority="marcrelator" type="code">aut</mods:roleTerm>
                        <mods:roleTerm authority="marcrelator" type="text">Author</mods:roleTerm>
                    </mods:role>
                    <mods:nameIdentifier type="gnd">1081830107</mods:nameIdentifier>
                    <mods:namePart type="family">Armbruster</mods:namePart>
                    <mods:namePart type="given">André</mods:namePart>
                </mods:name>

FLUX:

default infile = FLUX_DIR + "mods.xml";

infile
| open-file
| decode-xml
| handle-generic-xml
| encode-xml // (or: encode-json)
| write(FLUX_DIR + "result.xml")
;

encoded as XML

                <name>
                    <type>personal</type>
                    <type>simple</type>
                    <displayForm>
                        <value>Armbruster, André</value>
                    </displayForm>
                    <role>
                        <roleTerm>
                            <authority>marcrelator</authority>
                            <type>code</type>
                            <value>aut</value>
                        </roleTerm>
                        <roleTerm>
                            <authority>marcrelator</authority>
                            <type>text</type>
                            <value>Author</value>
                        </roleTerm>
                    </role>
                    <nameIdentifier>
                        <type>gnd</type>
                        <value>1081830107</value>
                    </nameIdentifier>
                    <namePart>
                        <type>family</type>
                        <value>Armbruster</value>
                    </namePart>
                    <namePart>
                        <type>given</type>
                        <value>André</value>
                    </namePart>
                </name>

or encoded as JSON

            "name": {
                "type": "personal",
                "type": "simple",
                "displayForm": {
                    "value": "Armbruster, André"
                },
                "role": {
                    "roleTerm": {
                        "authority": "marcrelator",
                        "type": "code",
                        "value": "aut"
                    },
                    "roleTerm": {
                        "authority": "marcrelator",
                        "type": "text",
                        "value": "Author"
                    }
                },
                "nameIdentifier": {
                    "type": "gnd",
                    "value": "1081830107"
                },
                "namePart": {
                    "type": "family",
                    "value": "Armbruster"
                },
                "namePart": {
                    "type": "given",
                    "value": "André"
                }
            },

TobiasNx commented 3 years ago

This behavior is also not documented.

Deleting the namespace prefix also creates the problem that you cannot distinguish between the two attributes in:

<mods:name type="personal" xlink:type="simple">

Both end up being named type.

dr0i commented 3 years ago

I set up a branch where your input should now result in an output preserving namespaces, like:

                                <mods:name>
                                        <type>personal</type>
                                        <xlink:type>simple</xlink:type>
                                        <mods:displayForm>
                                                <value>Armbruster, André</value>
                                        </mods:displayForm>
                                </mods:name>

dr0i commented 3 years ago

In the Flux script, you would add the parameter like this: handle-generic-xml(emitnamespace="true").
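
For reference, applying that parameter to the Flux script from the original report might look like the sketch below. It assumes the branch that provides the emitnamespace option is in use and that the input and output file names are unchanged from the report; nothing else in the pipeline is altered.

```flux
default infile = FLUX_DIR + "mods.xml";

infile
| open-file
| decode-xml
// Assumption: emitnamespace is only available on the branch mentioned above.
// With it set, prefixed names such as mods:name and xlink:type keep their prefix.
| handle-generic-xml(emitnamespace="true")
| encode-xml // (or: encode-json)
| write(FLUX_DIR + "result.xml")
;
```

With the prefix preserved, the two attributes of mods:name are emitted as type and xlink:type and no longer collide.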

TobiasNx commented 3 years ago

Seems to work: https://github.com/TobiasNx/notWorkingFlux/blob/main/modsAttributes/result.xml
