martinblech / xmltodict

Python module that makes working with XML feel like you are working with JSON
MIT License
5.49k stars 462 forks

Ensure significant whitespace is not trimmed #267

Closed: thehoustonian closed this 2 years ago

thehoustonian commented 3 years ago

This PR should address #264

I think xmltodict has been handling significant whitespace improperly when strip_whitespace=True, which is the default. Whitespace that is part of a leaf node should be preserved, whereas whitespace elsewhere in the document can safely be removed. I also adjusted some of the existing tests to reflect this change. This page illustrates the difference.
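To make the distinction concrete, here is a minimal sketch of the rule described above, written against the standard library's xml.etree.ElementTree rather than xmltodict's own parser. The helper name element_to_value and the "#text" key are assumptions chosen for illustration; this is not the code changed in this PR. It ignores attributes and text that follows a child element.

```python
import xml.etree.ElementTree as ET


def element_to_value(elem):
    """Convert an element to a str (leaf) or a dict (element with children).

    Hypothetical helper for illustration only; not xmltodict's implementation.
    """
    children = list(elem)
    if not children:
        # Leaf node: its whitespace is treated as significant and kept verbatim.
        return elem.text

    result = {}
    for child in children:
        value = element_to_value(child)
        if child.tag in result:
            # Repeated tags are collected into a list.
            if not isinstance(result[child.tag], list):
                result[child.tag] = [result[child.tag]]
            result[child.tag].append(value)
        else:
            result[child.tag] = value

    # Text before the first child is usually indentation: strip it and
    # keep it only if something other than whitespace remains.
    text = (elem.text or "").strip()
    if text:
        result["#text"] = text
    return result


doc = "<root>\n  <a> padded </a>\n  <b>   </b>\n</root>"
root = ET.fromstring(doc)
parsed = {root.tag: element_to_value(root)}
print(parsed)  # {'root': {'a': ' padded ', 'b': '   '}}
```

The indentation between the children of root disappears, while the padding inside the leaf nodes a and b survives.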

martinblech commented 2 years ago

Thanks!

thebetar commented 1 week ago

Would it not be more logical for the strip_whitespace option to also remove whitespace inside the node itself? If not, I think it would be nice to have a separate option for this (maybe defaulting to False, if you consider not stripping whitespace inside a node to be the normal behaviour). I created an issue where someone linked to this PR, and I understand the change, but I think other users of this package also expect whitespace to be trimmed. For instance, if the XML document is received in a formatted (pretty-printed) form, its nodes will contain whitespace and newline characters.
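Until such an option exists, one possible workaround is to trim the parsed result afterwards. The sketch below assumes the input is the usual nested structure of dicts, lists, strings and None that a parse produces. The helper name strip_leaf_strings is hypothetical and not part of xmltodict. Mapping whitespace-only strings to None is a choice made here, not documented library behaviour.

```python
def strip_leaf_strings(value):
    """Recursively trim every string in a parsed document.

    Hypothetical helper, not part of xmltodict. Strings that contain only
    whitespace become None (an assumption made for this sketch).
    """
    if isinstance(value, dict):
        return {key: strip_leaf_strings(item) for key, item in value.items()}
    if isinstance(value, list):
        return [strip_leaf_strings(item) for item in value]
    if isinstance(value, str):
        return value.strip() or None
    # None and any other value types are passed through unchanged.
    return value


# Hand-built stand-in for the result of parsing a pretty-printed document.
parsed = {
    "root": {
        "name": "\n    Alice\n  ",
        "tags": {"tag": [" a ", "b"]},
        "empty": "   ",
    }
}
cleaned = strip_leaf_strings(parsed)
print(cleaned)
# {'root': {'name': 'Alice', 'tags': {'tag': ['a', 'b']}, 'empty': None}}
```

Note that this trims every string value, including attribute values if they are present in the structure, so it is only suitable when no whitespace in the document is meaningful.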