Yes, I have the same problem.
I encountered the same problem. The solution is to pass "strip_whitespace=False" as an optional keyword argument to xmltodict.parse(). For the example above, this should do the trick:
import xmltodict
xml = """
<parent>
<element><![CDATA[data ]]></element>
</parent>
"""
# Disable whitespace stripping so the trailing space inside the CDATA section is kept
parsed_xml = xmltodict.parse(xml, strip_whitespace=False)
print(repr(parsed_xml['parent']['element']))
I discovered this after turning on debugging mode and stepping through the code. It would be nice if xmltodict's user documentation were more thorough, so that users don't have to dig into the code to find this option.
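For example, take this XML (the same document as in the snippet above):
Parsing it with xmltodict.parse() and the default options gives:
Result: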
'data'
Expected result:
'data '
The untangle library parses it correctly: https://pypi.org/project/untangle/
Result:
'data '
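As a further cross-check that needs no third-party package, the same document can be run through the standard library's xml.etree.ElementTree. This is only a sketch and was not part of the original report; the variable names (xml_doc, root, text) are made up for illustration. It suggests that the trailing space is delivered by the underlying XML parser and that the stripping happens afterwards in xmltodict's default post-processing.

```python
import xml.etree.ElementTree as ET

# Same document as in the report: the CDATA section ends with a space.
xml_doc = """
<parent>
<element><![CDATA[data ]]></element>
</parent>
"""

root = ET.fromstring(xml_doc)

# ElementTree exposes CDATA content as ordinary element text, unmodified.
text = root.find("element").text
print(repr(text))  # 'data '
```

If this prints the value with the trailing space, the XML itself is fine, and strip_whitespace=False is the setting to reach for in xmltodict.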