martinblech / xmltodict

Python module that makes working with XML feel like you are working with JSON
MIT License
5.45k stars 465 forks

Exclude tags #5

Closed (NikolayGalkin closed this issue 11 years ago)

NikolayGalkin commented 11 years ago

Hi,

Could you help me? Is there any way to exclude some tags from parsing? For example:

"<article> <title>Here is a title</title> <description>here is a big description<element>Title</element></description> </article>"

I need to get the description without parsing it: article: {"title": "Here is a title", "description": "here is a big description<element>Title</element>"}

Thanks.

martinblech commented 11 years ago

xmltodict is based on the built-in xml.parsers.expat module. There's no way that I know of to stop expat from parsing a subtree and treat it as CDATA instead, but you could investigate further and come back with a pull request ;)
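One possible workaround, sketched here as an assumption rather than anything xmltodict does: expat still parses the subtree, but its `CurrentByteIndex` attribute reports where each start and end tag begins in the input, so the raw inner markup can be sliced out of the original bytes. The helper name `extract_raw` is hypothetical. The sketch assumes UTF-8 input and that no attribute value on the captured tag contains a `>` character.

```python
import xml.parsers.expat


def extract_raw(xml_text, tag):
    """Return the raw inner markup of every outermost <tag> element.

    Hypothetical helper: expat parses the whole document as usual, and the
    raw text is recovered by slicing the input at the byte offsets expat
    reports for the start and end tags.
    """
    data = xml_text.encode("utf-8")
    parser = xml.parsers.expat.ParserCreate()
    state = {"depth": 0, "start": None}
    results = []

    def start_element(name, attrs):
        if name == tag:
            if state["depth"] == 0:
                # CurrentByteIndex points at the '<' of this start tag.
                # Naive assumption: the next '>' closes the start tag.
                state["start"] = data.index(b">", parser.CurrentByteIndex) + 1
            state["depth"] += 1

    def end_element(name):
        if name == tag:
            state["depth"] -= 1
            if state["depth"] == 0:
                # CurrentByteIndex now points at the '<' of the end tag.
                raw = data[state["start"]:parser.CurrentByteIndex]
                results.append(raw.decode("utf-8"))

    parser.StartElementHandler = start_element
    parser.EndElementHandler = end_element
    parser.Parse(data, True)
    return results


xml_doc = (
    "<article> <title>Here is a title</title> "
    "<description>here is a big description<element>Title</element>"
    "</description> </article>"
)
raw_descriptions = extract_raw(xml_doc, "description")
```

This recovers the text exactly as written in the source, but it runs alongside the normal parse rather than replacing it, so it would still need to be wired into whatever builds the dict.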

slestak commented 11 years ago

What about a callback to do some post-processing on the resulting dict? For simplicity, let it parse normally, then transform the dictionary as needed.
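A minimal sketch of that post-processing idea, applied to a plain dict after parsing. The input dict is hand-written and only assumed to resemble parser output, with mixed text stored under a `#text` key. The helpers `inner_xml` and `flatten_tags` are hypothetical names, not part of xmltodict.

```python
from xml.sax.saxutils import escape


def inner_xml(value):
    """Serialize a parsed value back into an XML fragment (no outer tag)."""
    if isinstance(value, dict):
        # Text is always emitted first: its original position relative to
        # child elements is not recorded in the dict.
        parts = [escape(value.get("#text") or "")]
        for key, child in value.items():
            if key == "#text" or key.startswith("@"):
                continue  # attributes are dropped in this sketch
            children = child if isinstance(child, list) else [child]
            for item in children:
                parts.append(f"<{key}>{inner_xml(item)}</{key}>")
        return "".join(parts)
    return escape("" if value is None else str(value))


def flatten_tags(node, tags):
    """Replace the value of every key listed in `tags` with its inner XML."""
    if isinstance(node, list):
        return [flatten_tags(item, tags) for item in node]
    if isinstance(node, dict):
        return {
            key: inner_xml(val) if key in tags else flatten_tags(val, tags)
            for key, val in node.items()
        }
    return node


# Hand-written stand-in for a parsed document (an assumption, see above).
parsed = {
    "article": {
        "title": "Here is a title",
        "description": {
            "#text": "here is a big description",
            "element": "Title",
        },
    }
}
result = flatten_tags(parsed, {"description"})
```

For this input, `result["article"]["description"]` matches the string requested in the question. The rebuild is not faithful in general: text that originally followed a child element is moved in front of it, and attributes and the original whitespace are lost.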

martinblech commented 11 years ago

@slestak It's not a perfect solution for this scenario, as it would imply an unwanted and in some cases lossy XML->dict->XML transformation. It is, however, a good idea for a useful general-purpose feature. I have forked this into a new issue here and will work on it.