NaturalIntelligence / fast-xml-parser

Validate XML, Parse XML and Build XML rapidly without C/C++ based libraries and no callback.
https://naturalintelligence.github.io/fast-xml-parser/
MIT License
2.59k stars, 309 forks

Feature Request: option to ignore attributes by name (by array of string | regex) #666

Open mav-rik opened 3 months ago

mav-rik commented 3 months ago

Description

Please consider adding an option to the parser/builder to exclude certain attributes by name. This could be implemented as an array of attribute names and/or RegExp patterns, similar to existing include/exclude API patterns for paths (e.g. tsconfig.json > exclude).

Use Case

We need to compare two XML documents by parsing them and converting them into normalized XML output. Sometimes one document contains extra attributes that are irrelevant to the comparison. If we could exclude specific attributes during parsing, the results would be easy to compare.

Suggested API:

const parser = new XMLParser({
  ignoreAttributes: false,
  excludeAttributes: [
    'attr-to-skip',
    /^ns:/,
    // more patterns or attributes here...
  ]
});

Expected Result:

The parser should ignore the specified attributes, ensuring they do not appear in the resulting XML tree.
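
Until such an option exists, roughly the same result could be obtained by post-processing the parsed tree. The following is only a sketch under stated assumptions: the helper name stripAttributes and the XmlNode type are hypothetical and not part of fast-xml-parser, and the tree is assumed to be plain objects and arrays in which attribute keys carry the "@_" prefix (the library's default attributeNamePrefix).

```typescript
// Hypothetical post-processing helper; NOT part of fast-xml-parser.
// Assumes attribute keys are prefixed with "@_" in the parsed tree.
type XmlNode =
  | string
  | number
  | boolean
  | null
  | XmlNode[]
  | { [key: string]: XmlNode };

function stripAttributes(
  node: XmlNode,
  patterns: (string | RegExp)[],
  prefix: string = "@_"
): XmlNode {
  // Repeated tags are arrays: process each element.
  if (Array.isArray(node)) {
    return node.map((child) => stripAttributes(child, patterns, prefix));
  }
  // Text and other primitive values are returned unchanged.
  if (node === null || typeof node !== "object") {
    return node;
  }
  const result: { [key: string]: XmlNode } = {};
  for (const [key, value] of Object.entries(node)) {
    if (key.startsWith(prefix)) {
      const name = key.slice(prefix.length);
      // Strings must match exactly; RegExps are tested against the name.
      const excluded = patterns.some((p) =>
        typeof p === "string" ? p === name : p.test(name)
      );
      if (excluded) continue; // drop this attribute
    }
    result[key] = stripAttributes(value, patterns, prefix);
  }
  return result;
}
```

Calling stripAttributes(parsed, ['attr-to-skip', /^ns:/]) on both parsed documents before comparing them would drop the irrelevant attributes. Avoid RegExps with the g or y flag here, since test() is stateful for those.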

github-actions[bot] commented 3 months ago

We're glad you find this project helpful. We'll try to address this issue ASAP. You can visit https://solothought.com to learn about recent features. Don't forget to star this repo.

amitguptagwl commented 3 months ago

This is a good suggestion. We can probably replace ignoreAttributes with this new property.

mav-rik commented 3 months ago

> this is a good suggestion. We can probably replace ignoreAttributes with this new property.

Do you need any help with a PR? I can extend ignoreAttributes to accept an array of regexes/strings and send a PR.

amitguptagwl commented 3 months ago

Thanks @mav-rik. I'm busy with a new open source project, so it'll be helpful if you can raise a PR.

From a release perspective, I'm thinking we should support both properties for some time, with a deprecation warning. Otherwise it'll be a breaking change.

mav-rik commented 3 months ago

> I'm just thinking from release perspective if we support both properties for sometime with deprecation warning otherwise it'll be a breaking change.

Why not just keep the ignoreAttributes prop backwards compatible, like this:

ignoreAttributes: boolean | (string | RegExp)[]

When it is true or false, the current logic applies. When it is an array of strings/regexes, we parse the attributes and filter out those that match the list.

In this case it won't be a breaking change.
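
To make the idea concrete, here is a minimal sketch of how such a union-typed option could be evaluated per attribute. The names IgnoreAttributes and shouldIgnoreAttribute are hypothetical and used only for illustration. This is not the library's code.

```typescript
// Hypothetical sketch; not the library's actual implementation.
type IgnoreAttributes = boolean | (string | RegExp)[];

function shouldIgnoreAttribute(
  option: IgnoreAttributes,
  attrName: string
): boolean {
  // Boolean: keep the existing all-or-nothing behavior.
  if (typeof option === "boolean") {
    return option;
  }
  // Array: ignore only attributes that match at least one entry.
  return option.some((pattern) => {
    if (typeof pattern === "string") {
      return pattern === attrName; // exact name match
    }
    pattern.lastIndex = 0; // guard against stateful g/y flags
    return pattern.test(attrName);
  });
}
```

With this shape, existing configs that pass true or false behave as before, and ['id', /^ns:/] would skip only id and attributes whose names start with ns:.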

mav-rik commented 2 months ago

@amitguptagwl hey, can you please take a look at PR #668?

Codeclimate fails with the error "Avoid deeply nested control flow statements". The thing is, I neither added nor removed any if/else statements; I only added a few statements inside the existing ones. To resolve the Codeclimate error I'd have to refactor the whole thing, which I'd like to avoid.

amitguptagwl commented 2 months ago

Thanks for the PR @mav-rik. It looks good to me. However, I still need to check thoroughly.

jcable commented 3 weeks ago

Hi, this library is working really well for me, but I had this in my config:

ignoreAttributes: ['id'],

And it's working fine in unit tests on my Mac, but it behaves as though it were:

ignoreAttributes: true,

in AWS Lambda. This is with fast-xml-parser 4.5.0.

amitguptagwl commented 3 weeks ago

Sorry @jcable, I couldn't understand your concern.