martinblech / xmltodict

Python module that makes working with XML feel like you are working with JSON
MIT License

AttributeError: 'int' object has no attribute 'startswith' #249

Open bindreams opened 4 years ago

bindreams commented 4 years ago

When unparsing data from a dict, xmltodict raises an exception when encountering a non-string key:

import xmltodict as xd

print(xd.unparse({
    "root": {
        1: 1
    }
}))

The following error occurs:

Traceback (most recent call last):
  File "bug.py", line 3, in <module>
    print(xd.unparse({
  File "C:\Users\andreasxp\Miniconda3\envs\nfb_studio\lib\site-packages\xmltodict.py", line 450, in unparse
    _emit(key, value, content_handler, full_document=full_document,
  File "C:\Users\andreasxp\Miniconda3\envs\nfb_studio\lib\site-packages\xmltodict.py", line 388, in _emit
    if ik.startswith(attr_prefix):
AttributeError: 'int' object has no attribute 'startswith'

It seems to me that this happens because the key in the nested dict is an integer.
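The failing check can be reproduced without xmltodict. This is a minimal sketch, not the library's code: `is_attribute_key` is a hypothetical name, and "@" is assumed to be the attribute prefix. Only the `ik.startswith(attr_prefix)` call is taken from the traceback above.

```python
def is_attribute_key(ik, attr_prefix="@"):
    # Same call as in the traceback; it only works when ik is a str.
    return ik.startswith(attr_prefix)


for key in ["@id", "child", 1]:
    try:
        print(repr(key), is_attribute_key(key))
    except AttributeError as exc:
        # The int key ends up here, with the same message as in the report.
        print(repr(key), "raised", exc)
```

The string keys pass through the check, while the integer key raises the same AttributeError as in the report.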

MuqadderIqbal commented 4 years ago

You are correct. It does assume that all dict keys will be strings. The "startswith" method name in the resulting error message is a dead giveaway. Also, it works if I use a string key instead:

import xmltodict

d = {"root": {"1": 1}}
print(xmltodict.unparse(d, pretty=True))

Output:

<?xml version="1.0" encoding="utf-8"?>
<root>
    <1>1</1>
</root>

bindreams commented 4 years ago

It seems reasonable to me that xmltodict should convert non-string keys into strings automatically (like the json module does), or at least raise a proper exception to the user.
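For comparison, here is a small sketch of the `json` behaviour mentioned above, plus a caller-side workaround. `stringify_keys` is a hypothetical helper, not part of xmltodict. It only illustrates converting keys before handing the dict to `unparse`.

```python
import json


def stringify_keys(obj):
    """Recursively convert dict keys to str (hypothetical helper)."""
    if isinstance(obj, dict):
        return {str(k): stringify_keys(v) for k, v in obj.items()}
    if isinstance(obj, (list, tuple)):
        return [stringify_keys(item) for item in obj]
    return obj


data = {"root": {1: 1}}

# json.dumps coerces the int key to the string "1" without complaint.
print(json.dumps(data))

# The converted dict contains only string keys.
print(stringify_keys(data))
```

Passing the result of `stringify_keys(data)` to `xmltodict.unparse` should avoid the AttributeError, since every key then has a `startswith` method.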

MuqadderIqbal commented 4 years ago

That's a reasonable ask with regard to this specific issue. However, I have a feeling such auto-conversion might introduce more fundamental issues when dealing with dicts that are supposed to contain non-string keys (since Python allows that).

For example:

d = {1: "hello world", "2": 5060}
for k in d.keys():
    print(type(k))

<class 'int'>
<class 'str'>

Now if my code had specific logic around the type of these keys, auto-conversion would not only break that, it would also limit the basic operations that can be done with such multi-type dicts. APIs are a typical example of such use cases.

It seems to me this will most likely require a broader discussion around how this package can/should address this specific scenario. I'll let the experts make the call on this one :)
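One concrete way auto-conversion can go wrong is sketched below. The dict is a variation of the example above, changed (as an assumption for illustration) so that the int key `1` and the string key `"1"` both occur. It also shows that the key type does not survive a round trip through `json`.

```python
import json

d = {1: "hello world", "1": 5060}

# Naive str() conversion: both keys become "1", so one value is silently lost.
converted = {str(k): v for k, v in d.items()}
print(len(d), len(converted))

# The original key type is not recoverable after a JSON round trip.
round_tripped = json.loads(json.dumps({1: "hello world"}))
print([type(k).__name__ for k in round_tripped])
```

Any code that later looks up `d[1]` or branches on `isinstance(key, int)` would behave differently on the converted data.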

Stuart-Yee commented 3 weeks ago

I'm encountering this as well. I also would've expected integer keys to be converted to strings but @MuqadderIqbal raises valid points.

I wonder if it would be possible to add a parameter to specify whether non-string dict keys should be converted to strings automatically.

--EDIT--

I take it back, it would probably lead to the creation of invalid XML tags, since an XML element name such as <1> cannot start with a digit.
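A quick way to see the problem is to feed such output to the standard-library parser. This is a sketch only: `is_well_formed` is a hypothetical helper, and `item_1` is a made-up replacement tag name used for contrast.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if xml.etree.ElementTree can parse xml_text."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# A tag name starting with a digit is rejected by the parser.
print(is_well_formed("<root><1>1</1></root>"))

# A name starting with a letter is accepted.
print(is_well_formed("<root><item_1>1</item_1></root>"))
```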

But what would be helpful is if the error handling displayed the key-value pair causing the exception in the stack trace.
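Until something like that exists in the library, a caller-side check can give that kind of message. The sketch below is hypothetical: `check_keys` is not an xmltodict function, and the wording of the error is only a suggestion.

```python
def check_keys(obj, path=""):
    """Raise TypeError naming the first non-string key found (hypothetical helper)."""
    if isinstance(obj, dict):
        for key, value in obj.items():
            if not isinstance(key, str):
                raise TypeError(
                    f"non-string key {key!r} ({type(key).__name__}) "
                    f"with value {value!r} at {path or '/'}"
                )
            # Descend into the value, extending the path with this key.
            check_keys(value, f"{path}/{key}")
    elif isinstance(obj, (list, tuple)):
        for item in obj:
            check_keys(item, path)


try:
    check_keys({"root": {1: 1}})
except TypeError as exc:
    print(exc)
```

Calling `check_keys(data)` before `xmltodict.unparse(data)` would then report the offending key, its value, and where it sits in the dict.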