Leonidas-from-XIV / node-xml2js

XML to JavaScript object converter.
MIT License

Unicode character inside attributes. #165

Open hugowschneider opened 9 years ago

hugowschneider commented 9 years ago

Hi,

Every time I have a new line, a tab, or some other special character inside an attribute value, the XML builder creates an XML string with character entities like &#xD;, &#xA; and &#x9;. How can I disable this and get literal new lines and tabs in my attribute value?

My original xml block:

<beans xmlns="http://www.springframework.org/schema/beans"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:context="http://www.springframework.org/schema/context"
    xmlns:jee="http://www.springframework.org/schema/jee" xmlns:tx="http://www.springframework.org/schema/tx"
    xmlns:security="http://www.springframework.org/schema/security"
    xsi:schemaLocation="http://www.springframework.org/schema/beans
         http://www.springframework.org/schema/beans/spring-beans.xsd
         http://www.springframework.org/schema/context
         http://www.springframework.org/schema/context/spring-context.xsd
         http://www.springframework.org/schema/jee
         http://www.springframework.org/schema/jee/spring-jee.xsd
         http://www.springframework.org/schema/tx 
         http://www.springframework.org/schema/tx/spring-tx.xsd
         http://www.springframework.org/schema/security
         http://www.springframework.org/schema/security/spring-security.xsd"></beans>

The result after parsing and building:


<beans xmlns="http://www.springframework.org/schema/beans" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:context="http://www.springframework.org/schema/context" xmlns:jee="http://www.springframework.org/schema/jee" xmlns:tx="http://www.springframework.org/schema/tx" xmlns:security="http://www.springframework.org/schema/security" xsi:schemaLocation="http://www.springframework.org/schema/beans&#xD;&#xA;         http://www.springframework.org/schema/beans/spring-beans.xsd&#xD;&#xA;         http://www.springframework.org/schema/context&#xD;&#xA;         http://www.springframework.org/schema/context/spring-context.xsd&#xD;&#xA;         http://www.springframework.org/schema/jee&#xD;&#xA;         http://www.springframework.org/schema/jee/spring-jee.xsd&#xD;&#xA;         http://www.springframework.org/schema/tx &#xD;&#xA;     &#x9; http://www.springframework.org/schema/tx/spring-tx.xsd&#xD;&#xA;     &#x9; http://www.springframework.org/schema/security&#xD;&#xA;     &#x9; http://www.springframework.org/schema/security/spring-security.xsd"></beans>

jameshowe commented 8 years ago

Has there been any progress on this? I am outputting some embedded HTML in an XML doc and would like to override the rendering behavior to output <br/> rather than &#xD;.

Looks like it's the xmlbuilder-js lib that does the escaping but just wondering if you have any pre/post-processing hooks I could leverage?
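
Until such a hook exists, one possible workaround is to post-process the string the builder returns. The sketch below is an assumption, not something xml2js or xmlbuilder-js provides: unescapeWhitespaceRefs is a hypothetical helper that turns the &#xD;, &#xA; and &#x9; references back into literal characters. It rewrites every occurrence in the document, not only those inside attribute values, and XML parsers normalize literal whitespace in attribute values to spaces, so the result may not be equivalent for every consumer.

```javascript
// Hypothetical helper, not part of xml2js or xmlbuilder-js.
// Maps the whitespace character references seen in the builder output
// back to the literal characters they stand for.
const WHITESPACE_REFS = {
  '&#xD;': '\r',
  '&#xA;': '\n',
  '&#x9;': '\t',
};

// Replaces the three references above anywhere in an already-built XML string.
function unescapeWhitespaceRefs(xml) {
  return xml.replace(/&#x(?:D|A|9);/g, (ref) => WHITESPACE_REFS[ref]);
}

// Shortened stand-in for the output shown in the original report.
const built = '<beans xsi:schemaLocation="a&#xD;&#xA;&#x9;b"></beans>';
const result = unescapeWhitespaceRefs(built);
```

In practice the input would be the string returned by the builder; a whitespace-separated list such as xsi:schemaLocation should tolerate the change, but other attributes might not.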

tflanagan commented 8 years ago

In your opinion, @jameshowe, what functionality would the hook provide?

How would you interact with said hooks?

jameshowe commented 8 years ago

@tflanagan in my head I just see it as a way to get in between the processing to manipulate the output e.g.

new xml2js.Builder({
    renderOpts: {
        preRenderText: function(el, val) {
            if (...) {
                // el is descendant of html tag
                return val.replace(/\r/g, '<br/>');
            }
            // leave all other text unchanged
            return val;
        }
    }
});

preRenderText would be called just before we call render (I don't know my way around the lib well enough to determine exactly where that would be). Passing in the actual element along with the value gives us the ability to do fine-grained overrides.

Alternatively, rather than having it set globally, we could set it at element level? e.g.

xml2js.parseString(defaultXml, function (err, result) {
    ...
    result["span"].renderText = function(val) {
        return val.replace(/\r/g, '<br/>');
    }
});

The builder would just have to check for the existence of the handler.

tflanagan commented 8 years ago

xml2js.parseString() already supports hooks through the attrNameProcessors, tagNameProcessors, and valueProcessors options. This is possible because the underlying SAX parser is event-based, which makes hooks easy to add.

Looking at the xmlbuilder-js module, it looks like the builder side will take some work to get fully right.

jameshowe commented 8 years ago

@tflanagan if I'm reading the xmlbuilder-js code correctly (which I may not be as I don't use CoffeeScript!), it seems like they provide the ability to override the XMLStringifier methods in the constructor?

The blocker in xml2js is that we only pass specific options down. If, for argument's sake, we changed this code to:

  rootElement = builder.create(rootName, this.options.xmldec, this.options.doctype, {
      headless: this.options.headless,
      stringify: this.options.stringify
  });

We could pass stringify-specific options which should allow us to override the behaviour e.g.

new xml2js.Builder({
    stringify: {
        eleText: function(val) {
            ...
        }
    }
});

The downside here though is we can't be specific about which elements we want to manipulate as we don't have access to the element.
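
To make the override idea concrete, here is a minimal, self-contained sketch of the pattern. It is a stand-in and not xmlbuilder-js code: defaultStringifier, makeStringifier, and the exact escaping rules are assumptions chosen to mimic the output in the original report. It uses an attribute-value method rather than eleText because the report is about attributes.

```javascript
// Stand-in for a stringifier with overridable methods.
// NOT xmlbuilder-js code; names and escaping rules are assumptions.
const defaultStringifier = {
  // Escapes markup characters and whitespace in an attribute value.
  attValue(val) {
    return String(val)
      .replace(/&/g, '&amp;')
      .replace(/</g, '&lt;')
      .replace(/"/g, '&quot;')
      .replace(/\t/g, '&#x9;')
      .replace(/\n/g, '&#xA;')
      .replace(/\r/g, '&#xD;');
  },
};

// Caller-supplied methods take precedence over the defaults.
function makeStringifier(overrides = {}) {
  return Object.assign({}, defaultStringifier, overrides);
}

// Override that still escapes markup but keeps whitespace literal.
const keepWhitespace = makeStringifier({
  attValue(val) {
    return String(val)
      .replace(/&/g, '&amp;')
      .replace(/</g, '&lt;')
      .replace(/"/g, '&quot;');
  },
});

const escaped = defaultStringifier.attValue('a\r\n\tb');
const literal = keepWhitespace.attValue('a\r\n\tb');
```

As noted above, a method like this only receives the value, so it cannot tell which element or attribute it is being called for.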

tflanagan commented 8 years ago

Exactly. Before this can be implemented in the way you are requesting, a PR will need to be made against xmlbuilder-js. Otherwise, anything done here is a workaround hack.

antonlvovych commented 8 years ago

+1

nigurr commented 8 years ago

Is there any progress on this? I am hitting the same issue.

Leonidas-from-XIV commented 8 years ago

I am not working on pushing such a PR to xmlbuilder-js, but I encourage you to do so.