Closed vovtz closed 3 years ago
It's because the underlying python-requests library defaults to ISO-8859-1
when no charset is specified in Content-Type
:
>>> import requests
>>> r = requests.get('https://zoek.officielebekendmakingen.nl/kst-34200-14/metadata_owms.xml')
>>> r.headers['Content-Type']
'text/xml'
>>> r.encoding
'ISO-8859-1'
I guess HTTPie could try to obtain the encoding information from the XML declaration.
Now part of #1022
HTTPie seems to assume ISO-8859-1 when decoding XML in UTF-8 form. Below I show the difference between the HTTPie and cURL outputs. The last element displays as “Financiën | Begroting”, respectively “Financiën | Begroting” (Dutch for “Finance | Budget”).
$ http 'https://zoek.officielebekendmakingen.nl/kst-34200-14/metadata_owms.xml'
$ curl 'https://zoek.officielebekendmakingen.nl/kst-34200-14/metadata_owms.xml'