metafacture / metafacture-core

Core package of the Metafacture tool suite for metadata processing.
https://metafacture.org
Apache License 2.0

Strange behaviour with indentation #525

Closed: TobiasNx closed this issue 1 month ago

TobiasNx commented 2 months ago

When I run the following Flux workflow:

https://gitlab.com/oersi/oersi-marc/-/blob/2-createBasicTransformation/localTestWorkflow.flux?ref_type=heads

Every new record in a file is indented further than the one before. The first record (index 0) has normal indentation:

<marc:collection xmlns:marc="http://www.loc.gov/MARC21/slim" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.loc.gov/MARC21/slim http://www.loc.gov/standards/marcxml/schema/MARC21slim.xsd">
    <marc:record>
        <marc:leader>00000cam a2200000  4500</marc:leader>
        <marc:controlfield tag="005">20240412102249.3</marc:controlfield>
        <marc:controlfield tag="007">cr</marc:controlfield>
        <marc:datafield tag="264" ind1="#" ind2="0">
        ...
    </marc:record>

</marc:collection>

The second record is indented one level deeper:

<marc:collection xmlns:marc="http://www.loc.gov/MARC21/slim" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.loc.gov/MARC21/slim http://www.loc.gov/standards/marcxml/schema/MARC21slim.xsd">
        <marc:record>
            <marc:leader>00000cam a2200000  4500</marc:leader>
            <marc:controlfield tag="005">20240412102249.4</marc:controlfield>
            <marc:controlfield tag="007">cr</marc:controlfield>
            <marc:controlfield tag="008">240228t||||||||xx#|||||o|||||||||||fin||</marc:controlfield>
            ...
        </marc:record>

</marc:collection>

Record 14 (the 15th) is indented much further still:

<?xml version="1.0" encoding="UTF-8"?>
<marc:collection xmlns:marc="http://www.loc.gov/MARC21/slim" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.loc.gov/MARC21/slim http://www.loc.gov/standards/marcxml/schema/MARC21slim.xsd">
                                                            <marc:record>
                                                                <marc:leader>00000cam a2200000  4500</marc:leader>
                                                                <marc:controlfield tag="005">20240412102249.6</marc:controlfield>
                                                                <marc:controlfield tag="007">cr</marc:controlfield>
                                                                <marc:controlfield tag="008">240228t||||||||xx#|||||o|||||||||||fin||</marc:controlfield>
                                                                <marc:datafield tag="040" ind1="#" ind2="#">
                                                                    <marc:subfield code="b">fin</marc:subfield>
....
                                                            </marc:record>

</marc:collection>

With every record, additional indentation is added. I am not sure why.
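To put numbers on the drift, a small helper like the one below could report how far each `<marc:record>` opening tag is indented in the generated output. It is only a sketch: the function names `record_indents` and `is_drifting` are hypothetical and not part of Metafacture, and it assumes the output uses the `marc:` prefix shown above.

```python
import re

# Matches the opening tag of a record and captures its leading whitespace.
# Closing tags (</marc:record>) do not match because of the slash.
RECORD_OPEN = re.compile(r"^([ \t]*)<marc:record\b", re.MULTILINE)


def record_indents(xml_text: str) -> list[int]:
    """Return the leading-whitespace width of every <marc:record> opening tag."""
    # Tabs are expanded to 4 columns so mixed indentation is still comparable.
    return [len(m.group(1).expandtabs(4)) for m in RECORD_OPEN.finditer(xml_text)]


def is_drifting(indents: list[int]) -> bool:
    """True if any later record is indented deeper than the first one."""
    return any(width > indents[0] for width in indents[1:])


if __name__ == "__main__":
    # Two shortened documents modelled on the first and second record above.
    sample = (
        "<marc:collection>\n"
        "    <marc:record>\n"
        "    </marc:record>\n"
        "</marc:collection>\n"
        "<marc:collection>\n"
        "        <marc:record>\n"
        "        </marc:record>\n"
        "</marc:collection>\n"
    )
    indents = record_indents(sample)
    print(indents)               # [4, 8]
    print(is_drifting(indents))  # True
```

Running it over the concatenated output files should give a constant list if the indentation is correct, and a growing list if it shows the behaviour described here.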

TobiasNx commented 2 months ago

This example cannot be run in the playground. The error seems to be specific to the combination of batch-reset and encode-marcxml; with encode-yaml or encode-xml it does not occur. I have added a simple example for the runner that also produces the wrong indentation:

https://github.com/TobiasNx/notWorkingFlux/commit/9321111c22461af3f536cae485b6ad55d980148a
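One way such a symptom could arise, offered purely as a hypothesis and not as a diagnosis of metafacture-core, is an encoder that raises its indentation level when it writes the document header but never lowers it when it writes the footer on a reset. The toy model below uses a made-up `ToyEncoder` class (not Metafacture's encoder, and with simplified element names) to show that this kind of leaked state gives one extra level per batch.

```python
class ToyEncoder:
    """Hypothetical pretty-printing encoder; NOT Metafacture's actual code."""

    INDENT = "    "

    def __init__(self, restore_level_in_footer: bool):
        self.restore_level_in_footer = restore_level_in_footer
        self.level = 0
        self.lines: list[str] = []

    def _write(self, text: str) -> None:
        self.lines.append(self.INDENT * self.level + text)

    def write_header(self) -> None:
        self.lines.append("<collection>")  # header is written flush left
        self.level += 1                    # records sit one level deeper

    def write_footer(self) -> None:
        if self.restore_level_in_footer:
            self.level -= 1
        # Assumed bug: without the decrement, the level leaks into the next batch.
        self.lines.append("</collection>")

    def record(self, body: str) -> None:
        self._write("<record>")
        self.level += 1
        self._write(body)
        self.level -= 1
        self._write("</record>")

    def reset(self) -> None:
        """Simulates a reset between batches: close the document, open a new one."""
        self.write_footer()
        self.write_header()


def run(n_records: int, restore_level_in_footer: bool) -> list[int]:
    """Encode n_records with a reset between each; return each record's indent width."""
    enc = ToyEncoder(restore_level_in_footer)
    enc.write_header()
    for i in range(n_records):
        if i > 0:
            enc.reset()
        enc.record(f"<leader>{i}</leader>")
    enc.write_footer()
    return [
        len(line) - len(line.lstrip(" "))
        for line in enc.lines
        if line.lstrip().startswith("<record>")
    ]


if __name__ == "__main__":
    print(run(3, restore_level_in_footer=False))  # [4, 8, 12]
    print(run(3, restore_level_in_footer=True))   # [4, 4, 4]
```

If the real cause is of this kind, it would also fit the observation that encoders which keep no indentation state across a reset are unaffected, but that remains an assumption until the encoder code is checked.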

TobiasNx commented 2 months ago

@dr0i you wanted a playground example. As noted in the comment above, that is not possible, but I added a small example in a repo that shows the strange behaviour.