thebetar closed this issue 1 month ago
We also ran into this issue, git bisect pointed us to this commit: https://github.com/martinblech/xmltodict/commit/c9f1143694c52666818715e865a56ffc46d9232e
c9f1143694c52666818715e865a56ffc46d9232e is the first bad commit
commit c9f1143694c52666818715e865a56ffc46d9232e
Author: Trey Franklin <trey.franklin@adtran.com>
Date: Mon Feb 8 11:39:45 2021 -0600
Ensure significant whitespace is not trimmed
tests/test_xmltodict.py | 4 ++--
xmltodict.py | 2 +-
2 files changed, 3 insertions(+), 3 deletions(-)
Same issue here: some of our tests started to fail with xmltodict v0.14.0
due to the issue mentioned above.
Same issue here. For context this update seems to be tied to:
In our case, when we had an XML element containing only whitespace, we expected it to be trimmed to None,
and it no longer is. It's a breaking change to behaviour that existing codebases have relied on in previous versions (rightly or wrongly). In our case we just had to change our expectation, but it would be nice if there were still an option to ignore XML elements that contain only whitespace.
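Until such an option exists, one possible workaround is to post-process the parsed result. This is only a minimal sketch: `blank_to_none` is a hypothetical helper (not part of xmltodict), it assumes the parsed output is made of nested dicts, lists and strings, and it also turns empty strings into None.

```python
from typing import Any


def blank_to_none(value: Any) -> Any:
    """Recursively replace empty or whitespace-only strings with None.

    Hypothetical helper, not part of xmltodict.
    """
    if isinstance(value, dict):
        return {key: blank_to_none(item) for key, item in value.items()}
    if isinstance(value, list):
        return [blank_to_none(item) for item in value]
    if isinstance(value, str) and not value.strip():
        return None
    # Non-blank strings are returned untouched, including their surrounding whitespace.
    return value


# Stand-in for a parsed document; a real one would come from the parser.
parsed = {"root": {"name": "  Alice ", "note": "\n   ", "tags": ["a", " "]}}
cleaned = blank_to_none(parsed)
```

Only whitespace-only values are replaced; text that has real content keeps its leading and trailing whitespace, so this does not fully restore the old trimming behaviour.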
I have made a PR with a suggestion to resolve this issue while still keeping the change from the original PR: https://github.com/martinblech/xmltodict/pull/362
This is a breaking change for us as well. I also agree that whether to retain whitespace is a context-specific question and should be a parameter. Since the version change from .13 to .14 gave no hint of a behavior change, I'd also suggest retaining the old behavior by default and enabling whitespace preservation via the option.
Reverted the change and pushed v0.14.2
@martinblech so did the original PR, which changed the whitespace logic, not behave the way it was intended to?
Thanks for doing this, I'd agree that the change was a regression. As for whitespace, we just need to think back to the days when XML was first adopted. Apologies to those who already know all this! Its main reason for existing came out of the document world, when we wanted more structured documents. XHTML was a huge part of the design considerations, and preserving whitespace is essential for those kinds of uses. XML's use for plain data was part of the design goals too, and stripping whitespace really helps with that. Today, document-oriented XML is probably a minor footnote, but not gone. Taking the PR to add an option to retain whitespace seems like a great next step, and it would be backwards compatible.
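To make the document-versus-data distinction concrete, here is a rough sketch using only the standard library's `xml.etree.ElementTree`, not xmltodict. The `element_text` function and its `keep_whitespace` flag are hypothetical names chosen for illustration; they are not the option proposed in the PR.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def element_text(xml: str, tag: str, keep_whitespace: bool = False) -> Optional[str]:
    """Return the text of the first direct child <tag> of the root element.

    Hypothetical helper: by default (data-oriented use) the text is stripped
    and whitespace-only text becomes None; with keep_whitespace=True
    (document-oriented use) the text is returned exactly as parsed.
    """
    element = ET.fromstring(xml).find(tag)
    if element is None or element.text is None:
        return None
    if keep_whitespace:
        return element.text
    return element.text.strip() or None


doc = "<doc><p>  hello  </p><empty>   </empty></doc>"

data_view = (element_text(doc, "p"), element_text(doc, "empty"))
document_view = (
    element_text(doc, "p", keep_whitespace=True),
    element_text(doc, "empty", keep_whitespace=True),
)
```

With stripping as the default and preservation as an opt-in, existing data-oriented callers would see no change, which is the backwards-compatible arrangement suggested above.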
@thebetar it was a regression because it changed the default behavior into something that most users don't want.
I was running a pipeline this morning and tests were suddenly failing. After some debugging we found out it was because xmltodict had been updated and was causing errors in our unit tests. The error was caused by our expected output file containing trimmed strings like
which previously passed, but after this update the result of parsing the XML to a dictionary was
I will see if I have time this weekend to try and create a PR for this issue.