NaturalIntelligence / fast-xml-parser

Validate XML, Parse XML and Build XML rapidly without C/C++ based libraries and no callback.
https://naturalintelligence.github.io/fast-xml-parser/
MIT License

Add builder option to add space at end of self closing tag #560

Closed Radeonmann closed 1 year ago

Radeonmann commented 1 year ago

Description

For my use case I need to parse, modify and save XML files, which are usually created by another tool (IDE for PLC programming).

The XML files are part of the PLC project and are usually under source control. Therefore, it is important that I can save the XML files in exactly the same form in which they were read, i.e. as written by the original IDE.

The IDE stores self-closing tags with a space before the closing bracket. I have also seen a space before the closing bracket in many other XML files, which is why I think this style is not too uncommon.

<MyElement />

Input

I would like to add an option spaceOnSelfClosingTag to the builder, so this behavior can be configured.

Code

const builder = new XMLBuilder({
    suppressEmptyNode: true,
    ignoreAttributes: false,
    spaceOnSelfClosingTag: true, // New builder option
});
const output = builder.build({
    root: {
        ElementWithContent: "Some text",
        EmptyElement: {},
        OnlyAttributes: { "@_attr": "nice" },
    },
});

Output

Output with spaceOnSelfClosingTag: true

<root>
  <ElementWithContent>Some text</ElementWithContent>
  <EmptyElement />
  <OnlyAttributes attr="nice" />
</root>

Output with spaceOnSelfClosingTag: false (default)

<root>
  <ElementWithContent>Some text</ElementWithContent>
  <EmptyElement/>
  <OnlyAttributes attr="nice"/>
</root>
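
Until an option like this exists, the same result can be approximated by post-processing the string returned by the builder. The following is only a minimal sketch: `addSpaceToSelfClosingTags` is a hypothetical helper, not part of fast-xml-parser, and it assumes the document contains no comments, CDATA sections or processing instructions with text that looks like a self-closing tag.

```javascript
// Hypothetical helper, not part of fast-xml-parser.
// Matches a self-closing tag: name, optional quoted attributes, then "/>".
// Quoted attribute values are matched as a whole, so a "/>" inside a value
// is left untouched.
const SELF_CLOSING_TAG =
  /<([^\s<>\/!?][^\s<>\/]*)((?:\s+[^\s=<>\/]+\s*=\s*(?:"[^"]*"|'[^']*'))*)\s*\/>/g;

function addSpaceToSelfClosingTags(xml) {
  // $1 = tag name, $2 = attribute list; existing whitespace before "/>"
  // is normalized to exactly one space
  return xml.replace(SELF_CLOSING_TAG, "<$1$2 />");
}

const built = '<root><EmptyElement/><OnlyAttributes attr="nice"/></root>';
console.log(addSpaceToSelfClosingTags(built));
```

A regular expression is not a full XML tokenizer, so this is a workaround for controlled input rather than a replacement for a builder option.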

Would you like to work on this issue?

I have already made a commit in my fork; only the test cases and benchmarks remain. If the issue is accepted, I will do the remaining work and open a pull request.

github-actions[bot] commented 1 year ago

I'm glad you find this repository helpful. I'll try to address your issue ASAP. You can watch the repo for new changes or star it.

amitguptagwl commented 1 year ago

This seems like a rather specific option, and I generally prefer a generalized approach that can be used for multiple purposes. So let's think about an alternative approach.

Radeonmann commented 1 year ago

First, thanks for your response, and sorry for my late answer; I have been busy over the last few days. I agree with you that the option solves only this specific use case, but the use case seems to be quite common.

Open to see space on empty tag samples

The origin seems to lie in the early days of XHTML and very old browsers. According to https://stackoverflow.com/a/462997 it is also within the XHTML specs: https://www.w3.org/TR/xhtml1/#C_2

The style is the default in:

- Eclipse XML formatting (according to [this](https://stackoverflow.com/questions/13496837/xml-formatting-conventions-why-leave-a-space-before))
- Default output format in .NET. You can try the provided sample in the [online playground](https://dotnetfiddle.net)
- Default format of the [RedHat VS Code XML extension](https://marketplace.visualstudio.com/items?itemName=redhat.vscode-xml)

You can also find this style in many tutorials (e.g. W3C) ...

#### C# Sample:

```CSharp
using System;
using System.Xml;

var doc = new XmlDocument();
doc.LoadXml(""" """);
var writer = new XmlTextWriter(Console.Out);
doc.WriteTo(writer);
```

Leads to output:

```XML
```

More general

I would still lean towards a simple solution for this not too uncommon use case. But of course it is your project and I will follow whatever is your intended direction.

What kind of more general solution did you think about?

A more general solution could take several forms; the ones I have come up with so far are listed below. But maybe they are completely different from what you had in mind... 🙃

String option

Just a string option instead of a boolean (option name is not nice yet). This would give a bit more generalization but would still be easy to understand for a user.

const builder = new XMLBuilder({
  suppressEmptyNode: true,
  ignoreAttributes: false,
  selfClosingTagEndSpace: "    ", // New builder option
});

<root>
  <ElementWithContent>Some text</ElementWithContent>
  <EmptyElement    />
  <OnlyAttributes attr="nice"    />
</root>
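
To make the idea concrete, here is a rough sketch of how a builder could apply such a string option when it emits an empty element. `buildEmptyTag` and the validation rule are assumptions made for illustration only; nothing here reflects the actual fast-xml-parser internals.

```javascript
// Hypothetical sketch: none of these names exist in fast-xml-parser.
// Attribute values are assumed to be escaped already.
function buildEmptyTag(name, attributes, options = {}) {
  const endSpace = options.selfClosingTagEndSpace ?? ""; // default: no space

  // Accept only XML whitespace so the option cannot corrupt the markup
  if (!/^[ \t\r\n]*$/.test(endSpace)) {
    throw new Error("selfClosingTagEndSpace must contain only whitespace");
  }

  const attrText = Object.entries(attributes)
    .map(([key, value]) => ` ${key}="${value}"`)
    .join("");
  return `<${name}${attrText}${endSpace}/>`;
}

const options = { selfClosingTagEndSpace: "    " };
console.log(buildEmptyTag("EmptyElement", {}, options));
console.log(buildEmptyTag("OnlyAttributes", { attr: "nice" }, options));
```

Restricting the value to whitespace keeps the option from being misused to inject arbitrary text into the tag.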

Callback only for tag end

A callback only for the tag end space. This would be flexible, but only for the tag end.

function makeTagEndSpace(
  elementIsEmpty: boolean, // true if the element has no inner content (except attributes)
  hasAttributes: boolean // true if the element has at least one attribute
  //... maybe more data?
)
{
  return elementIsEmpty  ? " "
         : hasAttributes ? "   "
                         : "";
}

const builder = new XMLBuilder({
  suppressEmptyNode: true,
  ignoreAttributes: false,
  tagEndSpace: makeTagEndSpace, // New builder option
});

The sample XML would be

<root>
  <emptyEle />
  <emptyEleWithAttr attr="val" />
  <ele>
    Text
  </ele>
  <eleWithAttr attr="val"   >
    Text
  </eleWithAttr>
</root>
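
A small sketch of how a builder might consume such a callback when writing the start of a tag. `renderTagStart` is a hypothetical stand-in for the builder's tag emission, and the callback is the one from above with the type annotations removed.

```javascript
// Same decision logic as makeTagEndSpace above, without type annotations
function makeTagEndSpace(elementIsEmpty, hasAttributes) {
  return elementIsEmpty ? " " : hasAttributes ? "   " : "";
}

// Hypothetical: renders the opening (or self-closing) tag of one element.
// Attribute values are assumed to be escaped already.
function renderTagStart(name, attributes, elementIsEmpty, tagEndSpace) {
  const entries = Object.entries(attributes);
  const attrText = entries.map(([key, value]) => ` ${key}="${value}"`).join("");
  const space = tagEndSpace(elementIsEmpty, entries.length > 0);
  return `<${name}${attrText}${space}${elementIsEmpty ? "/>" : ">"}`;
}

console.log(renderTagStart("emptyEle", {}, true, makeTagEndSpace));
console.log(renderTagStart("emptyEleWithAttr", { attr: "val" }, true, makeTagEndSpace));
console.log(renderTagStart("ele", {}, false, makeTagEndSpace));
console.log(renderTagStart("eleWithAttr", { attr: "val" }, false, makeTagEndSpace));
```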

Callback for each attribute:

The callback could be called once for each attribute and would return the separator string to place between the previous attribute and the current one. It could also be called once for elements without attributes.

function makeAttributeSeparator(
  elementIsEmpty: boolean, // true if the element has no inner content (except attributes)
  isFirstAttribute: boolean, // true for the first attribute
  isLastAttribute: boolean, // true for the last attribute
  isTagEnd: boolean, // true for the very last call
  actIndent: string, // current indentation
  indentBy: string // indent by from options (not mandatory, as externally available)
) {
  if (elementIsEmpty) {
    return isFirstAttribute  ? "    "
           : isLastAttribute ? "   "
           : isTagEnd        ? "  "
                             : " ";
  } else {
    return isTagEnd ? "" : " ";
  }
}

const builder = new XMLBuilder({
  suppressEmptyNode: true,
  ignoreAttributes: false,
  attributeSeparator: makeAttributeSeparator, // New builder option
});

An example output of this would be:

<root>
  <emptyNoAttr  />
  <emptyMultiAttr    first="1" second="2" third="3"   fourth="4"  />
  <withContent>
    Hello
  </withContent>
  <withContentAndAttr first="1" second="2" third="3">
    Hello
  </withContentAndAttr>
</root>

With this solution it would be possible to cover many attribute formatting use cases, such as:

<root>
  <manyAttributes
    attr1="one"
    attr2="two"
    attr3="three"
    attr4="four"
  />
</root>
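
A callback producing this one-attribute-per-line layout could look roughly like the sketch below. `renderEmptyElement` is a hypothetical driver that stands in for the builder; it assumes one call before each attribute plus a final call with `isTagEnd = true`, which is only one possible reading of the proposal.

```javascript
// Hypothetical callback: puts every attribute on its own line
function makeAttributeSeparator(
  elementIsEmpty, isFirstAttribute, isLastAttribute, isTagEnd, actIndent, indentBy
) {
  // Before an attribute: new line, one level deeper than the element.
  // Before the tag end: new line, aligned with the element itself.
  return isTagEnd ? "\n" + actIndent : "\n" + actIndent + indentBy;
}

// Hypothetical driver for an empty element; attribute values are assumed
// to be escaped already
function renderEmptyElement(name, attributes, separator, actIndent, indentBy) {
  const entries = Object.entries(attributes);
  let out = actIndent + "<" + name;
  entries.forEach(([key, value], i) => {
    const isFirst = i === 0;
    const isLast = i === entries.length - 1;
    out += separator(true, isFirst, isLast, false, actIndent, indentBy);
    out += `${key}="${value}"`;
  });
  // Final call for the whitespace in front of "/>"
  out += separator(true, false, false, true, actIndent, indentBy);
  return out + "/>";
}

const attrs = { attr1: "one", attr2: "two", attr3: "three", attr4: "four" };
console.log(renderEmptyElement("manyAttributes", attrs, makeAttributeSeparator, "  ", "  "));
```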

A possible disadvantage is the performance overhead of the many function calls and argument evaluations.

Callback for whole 'inner' tag content

The callback would be called once for each element and would generate the whole inner contents between the element name and the closing tag.

<root>
  <elem{all-this-data}/>
</root>

It could look like this:

function makeInnerContents(
  elementIsEmpty: boolean, // true if the element has no inner content (except attributes)
  attributes: Object, // all attributes, maybe as name / value object?
  actIndent: string, // current indentation
  indentBy: string // indent by from options (not mandatory, as externally available)
) {
  let innerContent = "";
  for (const [attrName, attrValue] of Object.entries(attributes)) {
    innerContent += ` ${attrName}="${attrValue}"`
  }
  if (elementIsEmpty) { innerContent += " "; }
  return innerContent;
}

const builder = new XMLBuilder({
  suppressEmptyNode: true,
  ignoreAttributes: false,
  innerContentBuilder: makeInnerContents, // New builder option
});

A big disadvantage is that users would themselves be responsible for proper XML attribute formatting. At the same time it would give very high flexibility, so that, for example, the following XML would be possible:

<root>
  <ele attr
           = "value"
  />
</root>
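
To illustrate what "responsible for proper XML attribute formatting" means in practice, here is a sketch of the same callback with attribute value escaping added. `escapeAttributeValue` is a hypothetical helper and the callback signature is the proposed one from above, not an existing API.

```javascript
// Escapes the characters that must not appear verbatim inside a
// double-quoted XML attribute value
function escapeAttributeValue(value) {
  return String(value)
    .replace(/&/g, "&amp;") // first, so the entities added below are not escaped again
    .replace(/</g, "&lt;")
    .replace(/"/g, "&quot;");
}

// Hypothetical callback with the same shape as makeInnerContents above
// (indentation parameters omitted)
function makeInnerContents(elementIsEmpty, attributes) {
  let innerContent = "";
  for (const [attrName, attrValue] of Object.entries(attributes)) {
    innerContent += ` ${attrName}="${escapeAttributeValue(attrValue)}"`;
  }
  if (elementIsEmpty) innerContent += " ";
  return innerContent;
}

console.log("<elem" + makeInnerContents(true, { title: 'Tom & "Jerry" <3' }) + "/>");
```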

amitguptagwl commented 1 year ago

That's really detailed. What about an updateTag function, the same as we have in the parser? It can return a tag name of the user's choice, and the user can even add a trailing space to it. However, in this case we should pass some parameters like isLeafNode, isEmptyNode, jPath etc., so the user can make a better decision.
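
A rough sketch of how such a callback could produce the requested space. Everything here is hypothetical: the parameter list of `updateTag` on the builder side is an assumption, and `renderElement` only stands in for the builder's tag emission, with attributes left out.

```javascript
// Hypothetical builder-side callback; the parameter list is an assumption
function updateTag(tagName, jPath, isEmptyNode) {
  // A trailing space on the returned name ends up right before "/>"
  return isEmptyNode ? tagName + " " : tagName;
}

// Hypothetical stand-in for the builder's tag emission (attributes omitted)
function renderElement(tagName, jPath, text, updateTagFn) {
  const isEmptyNode = text === "";
  const name = updateTagFn(tagName, jPath, isEmptyNode);
  return isEmptyNode ? `<${name}/>` : `<${name}>${text}</${name}>`;
}

console.log(renderElement("EmptyElement", "root.EmptyElement", "", updateTag));
console.log(renderElement("ElementWithContent", "root.ElementWithContent", "Some text", updateTag));
```

In this sketch the space lands directly before `/>` only because no attributes are written; if attributes were emitted after the returned name, the space would end up in front of them instead.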

amitguptagwl commented 1 year ago

I recently added the updateTag option. Can you please check whether it covers the feature you want to provide? Here, you can find more detail.

amitguptagwl commented 1 year ago

I believe this issue is resolved. But feel free to reopen it if something is not fixed.