tuananh / camaro

camaro is a utility to transform XML to JSON, using Node.js bindings to the native XML parser pugixml, one of the fastest XML parsers around.
MIT License

[question] how to set default value if a path value is empty #78

Closed aaronjfox closed 5 years ago

aaronjfox commented 5 years ago

Is there a straightforward way of returning a specified value if a path defined in the template is empty? I am using the package in an AWS Lambda function to parse XML docs to JSON, then upload that into DynamoDB. The issue is that camaro returns a blank string when a path is empty, and DynamoDB doesn't allow an attribute value to contain an empty string. Currently I am looping through the object returned by transform to check for empty strings, but I am hoping there is a better way that I may be missing. I am admittedly not well versed in XPath, but since 'if a path is empty, return blank' is listed as custom syntax I figured it best to ask here first.

Thanks in advance and thanks for a great package.
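
The post-processing loop described in the question could be factored into a small recursive helper. This is only a minimal sketch in plain JavaScript and not part of camaro: the name `replaceEmptyStrings` and the placeholder `'N/A'` are assumptions chosen for illustration.

```javascript
// Hypothetical helper (not part of camaro): walks any JSON-like value and
// returns a copy in which every empty string is swapped for a placeholder.
function replaceEmptyStrings(value, placeholder) {
    if (value === '') return placeholder
    if (Array.isArray(value)) {
        return value.map((item) => replaceEmptyStrings(item, placeholder))
    }
    if (value !== null && typeof value === 'object') {
        const result = {}
        for (const [key, child] of Object.entries(value)) {
            result[key] = replaceEmptyStrings(child, placeholder)
        }
        return result
    }
    // Numbers, booleans, null and non-empty strings pass through unchanged.
    return value
}

// Sample data shaped like a transform result with one missing path.
const parsed = { parent: [{ sChild: 'Hello World', rChild: '' }] }
console.log(JSON.stringify(replaceEmptyStrings(parsed, 'N/A')))
```

Because the helper returns a copy instead of mutating its input, the original transform result stays available if the blank values are needed elsewhere.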

tuananh commented 5 years ago

you mean if the value is blank, return a default value?

aaronjfox commented 5 years ago

Yes. Either a default value for any empty element, or explicitly defining a default value for each element in the template if it does not return a value after parsing. Essentially:

xml =
<someParent>
    <someChild>Hello World</someChild>
    <anotherChild>Hello Again</anotherChild>
</someParent>

template = {
    parent: ['//someParent', {
        sChild: 'someChild',
        aChild: 'anotherChild',
        rChild: 'randomChild'
    }]
}

output = {
    "parent": [
        {
            "sChild": "Hello World",
            "aChild": "Hello Again",
            "rChild": ""   <------ Is there a way to keep this path from being an empty string if not found?
        }
    ]
}

Edit: Sorry for the formatting. Hopefully it makes sense, if not I can fix it later!

tuananh commented 5 years ago

There is a workaround for this using an if/else-style condition in XPath, but it's very ugly.

Something like this:

const { transform } = require('camaro')

const xml = `
    <items>
        <item>hello</item>
        <item />
        <item>world</item>
    </items>
`

const template = {
    items: ['//items/item', `
        concat(
            text(),
            substring(
                "default val", 
                1, 
                number(not(text())) * string-length("default val")
            )
        )
    `]
}

;(async function main() {
    const output = await transform(xml, template)
    console.log(JSON.stringify(output, null, 4))
})();

output

{
    "items": [
        "hello",
        "default val",
        "world"
    ]
}

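Explanation: take the text() value and concatenate the default value onto it only when the condition not(text()) is true. number() turns that condition into 1 or 0, so the length passed to substring() is either the full length of the default or 0.

If text() has a value => the condition is false => text() + empty string => still text().

If text() is empty => the condition is true => empty string + default value => default value.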
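
The arithmetic can be checked outside XPath. The following plain JavaScript is only a simplified model of the expression, assuming finite integer arguments; `xpathSubstring` and `textOrDefaultModel` are hypothetical names, and an empty string stands in for a missing text node.

```javascript
// Simplified model of XPath 1.0 substring(): positions are 1-based and the
// result covers positions p with start <= p < start + length.
function xpathSubstring(str, start, length) {
    const begin = Math.max(start, 1)
    const end = start + length
    return end > begin ? str.slice(begin - 1, end - 1) : ''
}

// Models concat(text(), substring(default, 1, number(not(text())) * string-length(default))).
function textOrDefaultModel(text, defaultValue) {
    const isEmpty = Number(text === '') // 1 when empty, 0 otherwise
    return text + xpathSubstring(defaultValue, 1, isEmpty * defaultValue.length)
}

console.log(textOrDefaultModel('hello', 'default val')) // hello
console.log(textOrDefaultModel('', 'default val'))      // default val
```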

Or, if you use this in lots of places and want it to look less ugly, try this:

const textOrDefault = (defaultValue) => `concat(
    text(),
    substring(
        "${defaultValue}", 
        1, 
        number(not(text())) * string-length("${defaultValue}")
    )
)`
const template = {
    items: ['//items/item', textOrDefault('my default val')]
}
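
One caveat with interpolating the default into the expression: XPath 1.0 string literals have no escape sequences, so a default that contains a double quote would break the generated XPath. A possible generalization is sketched below. It only builds the expression string, and `xpathLiteral` and `valueOrDefault` are hypothetical names, not camaro APIs.

```javascript
// Hypothetical helper: renders a JS string as an XPath 1.0 string literal.
// XPath 1.0 has no escapes, so a value holding both quote kinds is built
// with concat() from double-quoted pieces joined by a single-quoted '"'.
function xpathLiteral(value) {
    if (!value.includes('"')) return `"${value}"`
    if (!value.includes("'")) return `'${value}'`
    const parts = value.split('"').map((part) => `"${part}"`)
    return `concat(${parts.join(`, '"', `)})`
}

// Hypothetical helper: same trick as textOrDefault, but the expression to
// test (text(), @someAttr, ...) is passed in instead of being hard-coded.
function valueOrDefault(expr, defaultValue) {
    const literal = xpathLiteral(defaultValue)
    return `concat(${expr}, substring(${literal}, 1, number(not(${expr})) * string-length(${literal})))`
}

console.log(valueOrDefault('text()', 'my default val'))
```

The returned string would go in the template wherever textOrDefault('...') is used.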
tuananh commented 5 years ago

The "if a path is empty, return blank" syntax only means this: if you use an empty string as the path in the template, the output for that key will be blank. That's it. It is mostly useful when you want to fill keys in with an empty string instead of leaving them undefined.

aaronjfox commented 5 years ago

That makes sense. I'll be able to put that workaround to use. Thanks for the help.

eeveefox commented 4 years ago

Hello. How can I do something like that, but with an attribute?

If I use it with start: ['insert/dbCommand/start_ute_command/ute_command/@call', textOrDefault('false')], I always get 'false'.

tuananh commented 4 years ago

@eeveefox It seems right. Can you post the whole XML and your code snippet so that I can try it?

eeveefox commented 4 years ago
<root>
    <!--  Test commands for semantic data  -->
    <data table="TEST_LAYER_SEM" schema="TEST" id="TEST_LAYER_SEM" comment="TEST_LAYER_SEM">
        <select>
            <dbQuery idField="ID">
                <var name="FILTER" default="1=1"/>
                <start_ute_command>
                    <ute_command call="TEST.xml#TEST_PREV_COMMAND" />
                </start_ute_command>
                <query>
                    SELECT d.doc_version_id, d.doc_id id, d.next_doc_version_id, d.mime_type_id, d.storage_id, d.name,
                    d.descr, d.version_number, d.body, d.file_name, d.ctime, d.cuser_id, d.mtime, d.muser_id
                    FROM  web50.lib_doc_version_node d WHERE {FILTER} LIMIT 2
                </query>
            </dbQuery>
        </select>
    </data>
</root>

and


const recipeTemplate = {
    root: ["/root/data", {
        id: "@id",
        select: {
            query: "select/dbQuery/query",
            vars: ["select/dbQuery/var", {
                name: "@name",
                type: "@type",
                default: "@default",
                direction: "@direction"
            }],
            params: ["select/dbQuery/param", {
                name: "@name",
                type: "@type",
                default: "@default"
            }],
            ute_commands: {
                start: ['select/dbQuery/start_ute_command/ute_command/@call', textOrDefault('false')],
                end: "select/dbQuery/end_ute_command/ute_command/@call"
            }
        }
    }]
};
tuananh commented 4 years ago

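@eeveefox Ah, that's because textOrDefault only looks at the output of the text() XPath function. With a path ending in @call, the current node is the attribute itself, and an attribute has no text node children, so text() is always empty and you always get the default.

Notice what I changed in the textOrDefault function and in the template below: the function now tests @call instead of text(), and the path stops at the ute_command element.

You can try this: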

const { transform } = require('camaro') // require('.') only works from inside the camaro repo
const fs = require('fs')

;(async function main() {
    const textOrDefault = (defaultValue) => `concat(
        @call,
        substring(
            "${defaultValue}", 
            1, 
            number(not(@call)) * string-length("${defaultValue}")
        )
    )`

    const recipeTemplate = {
        root: ["/root/data", {
            id: "@id",
            select: {
                query: "select/dbQuery/query",
                vars: ["select/dbQuery/var", {
                    name: "@name",
                    type: "@type",
                    default: "@default",
                    direction: "@direction"
                }],
                params: ["select/dbQuery/param",{
                    name: "@name",
                    type: "@type",
                    default: "@default"
                }],
                ute_commands: {
                    start: ['select/dbQuery/start_ute_command/ute_command', textOrDefault('false')],
                    end: "select/dbQuery/end_ute_command/ute_command/@call"
                }
            }
         }
        ]
    };

    const xml = fs.readFileSync('input.xml', 'utf-8')
    const output = await transform(xml, recipeTemplate)
    console.log(JSON.stringify(output, null, 4));

})()

output

{
    "root": [
        {
            "id": "TEST_LAYER_SEM",
            "select": {
                "query": "\n                    SELECT d.doc_version_id, d.doc_id id, d.next_doc_version_id, d.mime_type_id, d.storage_id, d.name,\n                    d.descr, d.version_number, d.body, d.file_name, d.ctime, d.cuser_id, d.mtime, d.muser_id\n                    FROM  web50.lib_doc_version_node d WHERE {FILTER} LIMIT 2\n                ",
                "vars": [
                    {
                        "name": "FILTER",
                        "type": "",
                        "default": "1=1",
                        "direction": ""
                    }
                ],
                "params": [],
                "ute_commands": {
                    "start": [
                        "TEST.xml#TEST_PREV_COMMAND"
                    ],
                    "end": ""
                }
            }
        }
    ]
}
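
In the output above, `start` comes back as an array because the template uses the array form. When no ute_command element matches at all, the result is presumably an empty array (as `params` shows) and not the default. If a single string is wanted, one option is to unwrap the array after the transform. The helper below is a sketch in plain JavaScript; `firstOrDefault` is a hypothetical name, not a camaro API.

```javascript
// Hypothetical helper: returns the first array entry, or the default when
// the input is not an array, is empty, or its first entry is blank.
function firstOrDefault(values, defaultValue) {
    if (!Array.isArray(values) || values.length === 0) return defaultValue
    return values[0] === '' ? defaultValue : values[0]
}

console.log(firstOrDefault(['TEST.xml#TEST_PREV_COMMAND'], 'false')) // TEST.xml#TEST_PREV_COMMAND
console.log(firstOrDefault([], 'false'))                             // false
```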
eeveefox commented 4 years ago

Thank you, I didn't figure it out myself. It's the right thing to do.