kbr / fritzconnection

Python-Tool to communicate with the AVM Fritz!Box by the TR-064 protocol and the AHA-HTTP-Interface
MIT License

call_action and Umlauts in arguments - utf-8 documentation #59

Closed bufemc closed 3 years ago

bufemc commented 3 years ago

I'm not sure whether this problem is my own fault. I even debugged your code, but I still can't work out what is wrong. In short: although everything seems to be utf-8, even the name 'CallBlockerTest-Ümlaut' itself, the call fails solely because of the ü (if I remove it, everything works fine). I've posted the code and the stack trace below. I also searched the official documentation but found nothing about how to handle umlauts correctly. fc is fritzconnection in my case.


        arg = {'NewPhonebookID': pb_id,
               'NewPhonebookEntryID': '',
               'NewPhonebookEntryData':
                   f'<?xml version="1.0" encoding="utf-8"?>'
                   f'<Envelope xmlns:s="http://www.w3.org/2003/05/soap-envelope" s:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">'
                   f'<contact>'
                   f'<category>0</category>'
                   f'<person><realName>{name}</realName></person>'
                   f'<telephony nid="1"><number type="home" prio="1" id="0">{number}</number></telephony>'
                   f'</contact>'
                   f'</Envelope>'}

        return self.fc.call_action('X_AVM-DE_OnTel:1', 'SetPhonebookEntry', arguments=arg)

Traceback (most recent call last):
  File "C:/workspace/python/a1-fritzbox/a1fritzbox/phonebook.py", line 146, in <module>
    result = pb.add_contact(2, 'CallBlockerTest-Ümlaut', '009912345', skip_existing=False)
  File "C:/workspace/python/a1-fritzbox/a1fritzbox/phonebook.py", line 81, in add_contact
    return self.fc.call_action('X_AVM-DE_OnTel:1', 'SetPhonebookEntry', arguments=arg)
  File "C:\Python3\lib\site-packages\fritzconnection\core\fritzconnection.py", line 215, in call_action
    return self.soaper.execute(service, action_name, arguments)
  File "C:\Python3\lib\site-packages\fritzconnection\core\soaper.py", line 194, in execute
    return handle_response(response)
  File "C:\Python3\lib\site-packages\fritzconnection\core\soaper.py", line 176, in handle_response
    raise_fritzconnection_error(response)
  File "C:\Python3\lib\site-packages\fritzconnection\core\soaper.py", line 105, in raise_fritzconnection_error
    raise exception(message)
fritzconnection.core.exceptions.FritzConnectionException: UPnPError: 
errorCode: 502
errorDescription: XML error

BTW, if I use &uuml; instead of ü, it works and arrives as ü in the phonebook. But to be honest, I don't want to create HTML entities. Or do I have to, even though everything is set to utf-8, including the content-type header? Is this something specific to the AVM Fritzbox?

bufemc commented 3 years ago

While debugging I get the output below. I wonder whether it's just the debugger view or whether there really is no space between s:encodingStyle and the xmlns:s that follows it. I see the two concatenated all the time in the debugger view, even when just reading from phonebook 0.

s:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"xmlns:s="http://schemas.xmlsoap.org/soap/envelope/"

<?xml version="1.0" encoding="utf-8"?><s:Envelope s:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"xmlns:s="http://schemas.xmlsoap.org/soap/envelope/"><s:Body><u:SetPhonebookEntry xmlns:u="urn:dslforum-org:service:X_AVM-DE_OnTel:1"><s:NewPhonebookID>2</s:NewPhonebookID><s:NewPhonebookEntryID></s:NewPhonebookEntryID><s:NewPhonebookEntryData><?xml version="1.0" encoding="utf-8"?><Envelope xmlns:s="http://www.w3.org/2003/05/soap-envelope" s:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"><contact><category>0</category><person><realName>CallBlockerTest-Ümlaut</realName></person><telephony nid="1"><number type="home" prio="1" id="0">009912345</number></telephony></contact></Envelope></s:NewPhonebookEntryData></u:SetPhonebookEntry></s:Body></s:Envelope>

Update for @kbr: I even did a pprint of the envelope, and I always get the two attributes above concatenated, with no space between them. Smells like a bug; I hope I'm not the culprit somehow ;-) I also see them "glued together" in other places:

<?xml version="1.0" encoding="utf-8"?><s:Envelope s:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"xmlns:s="http://schemas.xmlsoap.org/soap/envelope/"><s:Body><u:GetPhonebook xmlns:u="urn:dslforum-org:service:X_AVM-DE_OnTel:1"><s:NewPhonebookId>0</s:NewPhonebookId></u:GetPhonebook></s:Body></s:Envelope>

The strange thing is that the Fritzbox doesn't care and swallows it without complaint. But as soon as you send a ü again, it's back to 502.
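The XML specification requires whitespace between attributes, so a strict parser should reject the concatenated form shown above even though the Fritzbox accepts it. A minimal sketch using Python's standard-library parser (the helper name `is_well_formed` is made up for illustration and is not part of fritzconnection):

```python
import xml.etree.ElementTree as ET

# Two attributes with no whitespace between them, as seen in the debug output.
glued = (
    '<s:Envelope s:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"'
    'xmlns:s="http://schemas.xmlsoap.org/soap/envelope/"></s:Envelope>'
)
# The same element with the missing space restored.
spaced = glued.replace('"xmlns:s', '" xmlns:s')


def is_well_formed(document: str) -> bool:
    """Hypothetical helper: True if the stdlib (expat-based) parser accepts it."""
    try:
        ET.fromstring(document)
    except ET.ParseError:
        return False
    return True


glued_ok = is_well_formed(glued)
spaced_ok = is_well_formed(spaced)
print(glued_ok, spaced_ok)
```

This only shows how a conforming parser treats the two variants. It says nothing about how the parser inside the Fritzbox is implemented.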

kbr commented 3 years ago

It is reported as an XML error, but I suppose it's an encoding error. Did you encode the name with the umlauts to utf-8 before doing the substitution in the string given to 'NewPhonebookEntryData'?

bufemc commented 3 years ago

I pass a normal string. I tried it as "Düsseldorf" and, to be sure, also as u"Düsseldorf". I assumed strings are utf-8 by default, as stated here: "While UTF-8 is used (on Python 3) as the default source code encoding" and here: "By default, Python uses utf-8 encoding." I still believe normal strings are encoded as utf-8 (although I am using Windows 10).

I am a little confused. Should I encode the string to utf-8 and then pass e.g. b'D\xc3\xbcsseldorf' into the XML (maybe even as a byte array or whatever)? To be honest, I already did; of course I then had b'D\xc3\xbcsseldorf' in my phonebook and not Düsseldorf, but at least the call worked ;)

I tried so many things. So far only HTML entities like &uuml; work, but I don't like that as a solution. &#228; should work too, but it doesn't; it generates garbage characters. 'D\xc3\xbcsseldorf' also passes, but creates garbage again.
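For reference, Python can produce numeric character references itself, so nobody has to write them by hand. The sketch below uses the standard `xmlcharrefreplace` error handler of `str.encode()`. Note that ü is code point 252, so its reference is `&#252;`. Whether the Fritzbox handles such references correctly is not confirmed by this thread, and with a fixed library this workaround should not be needed at all.

```python
import html

name = "Düsseldorf"

# 'xmlcharrefreplace' turns every character that the target codec cannot
# represent into a numeric reference of the form &#<codepoint>;
ascii_safe = name.encode("ascii", "xmlcharrefreplace").decode("ascii")
print(ascii_safe)

# Round trip: resolving the reference gives back the original string.
restored = html.unescape(ascii_safe)
print(restored)
```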

What confuses me so much is that you even set the content-type to xml/utf-8, yet the Fritzbox still does not accept umlauts (while on the other hand it accepts invalid XML, see the other ticket). If I debug the full envelope I just see the plain ü in it. Or should it look different?

kbr commented 3 years ago

Welcome to encoding hell ;)

As far as I can see now, the Soaper is missing the proper encoding. That is indeed a bug, which I will cover in the next bugfix release, maybe at the weekend.

bufemc commented 3 years ago

Then I'm really glad, somehow; I don't have to puzzle any longer. I'd been doubting myself a lot and lost a few hairs. You could also point me to where the culprit is ;)

kbr commented 3 years ago

It's important to understand that Python 3 strings are sequences of Unicode code points, that utf-8 is just one encoding among many, and that the utf-8 declaration in the XML header does not encode anything; it is only a hint telling the reader how to decode. The soaper, however, used the standard ascii encoding. That is not a problem as long as only ascii characters, which are a subset of utf-8, are used for communication, and that covers the majority of use cases. This is why the bug went undetected for so long.
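The distinction between a string and its encoded bytes can be illustrated with plain Python. This is only a sketch of the general principle, not a reproduction of the soaper's internals:

```python
name = "CallBlockerTest-Ümlaut"

# A str holds code points; bytes only exist once an encoding is chosen.
utf8_bytes = name.encode("utf-8")
print(utf8_bytes)

# ASCII cannot represent 'Ü', so a strict ASCII encode fails.
try:
    name.encode("ascii")
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True
print(ascii_failed)

# Decoding UTF-8 bytes with the wrong codec raises no error here,
# it silently yields different (garbled) text.
misread = utf8_bytes.decode("latin-1")
print(misread == name)
```

The XML declaration `encoding="utf-8"` inside the string changes none of this. The bytes on the wire are determined solely by the encoding step.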

kbr commented 3 years ago

Fixed with 1.3.3, so I'd like to close this.

bufemc commented 3 years ago

In case anyone passes by later, once more:

After running pip install fritzconnection --upgrade

the situation is now this:

I can now add an entry with an umlaut without any 502 error, but at first it arrived in the phonebook as:

CallBlockerTest-Umläut

So I re-checked my envelope, and it is mandatory to have

                   f'<?xml version="1.0" encoding="utf-8"?>'
                   f'<Envelope xmlns:s="http://www.w3.org/2003/05/soap-envelope" s:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">'

at the beginning; with that, the "Umläut" now arrives correctly.
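For anyone reusing the payload from this thread, here is a minimal sketch of building the NewPhonebookEntryData string with the XML header included. The helper name `build_entry_data` is hypothetical. Escaping the values with `xml.sax.saxutils.escape` is an addition not discussed above, on the assumption that a name might contain `&` or `<`.

```python
from xml.sax.saxutils import escape


def build_entry_data(name: str, number: str) -> str:
    """Hypothetical helper: build the entry XML used in this thread."""
    return (
        '<?xml version="1.0" encoding="utf-8"?>'
        '<Envelope xmlns:s="http://www.w3.org/2003/05/soap-envelope" '
        's:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">'
        '<contact>'
        '<category>0</category>'
        # escape() replaces &, < and > so the values cannot break the markup
        f'<person><realName>{escape(name)}</realName></person>'
        '<telephony nid="1">'
        f'<number type="home" prio="1" id="0">{escape(number)}</number>'
        '</telephony>'
        '</contact>'
        '</Envelope>'
    )


entry_data = build_entry_data("CallBlockerTest-Ümlaut", "009912345")
print(entry_data)
```

The result would be passed as the 'NewPhonebookEntryData' argument to call_action, exactly as in the original snippet.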

Thanks a lot! Best Fritz lib ever ;)