python / cpython

The Python programming language
https://www.python.org

http.client not allowing non-ascii in headers #85824

Open f2df1ebc-10cf-40b9-a494-e2cfeef26fe6 opened 4 years ago

f2df1ebc-10cf-40b9-a494-e2cfeef26fe6 commented 4 years ago
BPO 41658
Nosy @tirkarthi

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


GitHub fields:
```python
assignee = None
closed_at = None
created_at =
labels = ['3.8', 'library']
title = 'http.client not allowing non-ascii in headers'
updated_at =
user = 'https://bugs.python.org/yellalena'
```

bugs.python.org fields:
```python
activity =
actor = 'yellalena'
assignee = 'none'
closed = False
closed_date = None
closer = None
components = ['Library (Lib)']
creation =
creator = 'yellalena'
dependencies = []
files = []
hgrepos = []
issue_num = 41658
keywords = []
message_count = 3.0
messages = ['376047', '376060', '376066']
nosy_count = 2.0
nosy_names = ['xtreak', 'yellalena']
pr_nums = []
priority = 'normal'
resolution = None
stage = None
status = 'open'
superseder = None
type = None
url = 'https://bugs.python.org/issue41658'
versions = ['Python 3.8']
```

f2df1ebc-10cf-40b9-a494-e2cfeef26fe6 commented 4 years ago

http.client tries to encode every header value with 'latin-1', which fails when the value contains characters outside that codec's range, for example Cyrillic.

I propose checking whether the value is non-ASCII and, if so, encoding it with 'utf-8' instead. That works perfectly for me.
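A minimal sketch of the check described above, assuming a hypothetical helper named `encode_header_value` (it is not part of http.client). It only shows the encoding decision. Since `putheader` passes bytes values through unchanged, the result could be handed to it directly, but whether the receiving server reads those bytes as UTF-8 is an assumption.

```python
def encode_header_value(value: str) -> bytes:
    """Hypothetical helper: encode a header value as proposed in this report.

    ASCII-only values are encoded as ASCII; anything else is encoded as UTF-8
    instead of going through the 'latin-1' codec.
    """
    if value.isascii():  # str.isascii() is available from Python 3.7
        return value.encode("ascii")
    return value.encode("utf-8")


if __name__ == "__main__":
    for sample in ("text/xml", "SVD_ВебСервис:GetEmployee"):
        print(repr(sample), "->", encode_header_value(sample))
```

One consequence of this rule: a value such as "é", which 'latin-1' can encode today, would be sent as two UTF-8 bytes instead of one Latin-1 byte, so the change would not be transparent for existing callers.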

tirkarthi commented 4 years ago

Can you please add a short script demonstrating the problem? There were some recent security fixes in http.client that disallow non-ASCII headers (bpo-39603).

f2df1ebc-10cf-40b9-a494-e2cfeef26fe6 commented 4 years ago

Hi Karthikeyan Singaravelan! I'm working with a Russian database called 1C. It's pretty popular here in Russia, and its 'twist' is that everything there (I mean the code) is written in Russian, i.e. in Cyrillic. So it's normal for a request or response coming from a 1C base to contain non-ASCII characters. In my case, the request has a header naming the called method, and its value was "http://www.1c-bitrix.ru#SVD_ВебСервис:GetEmployee". This raises the "'latin-1' codec can't encode characters in position 29-37: ordinal not in range(256)" exception every time I try to send a request there. I tested locally, and the same thing happens when I add a Cyrillic header while creating a request/response in Python.
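A short script along these lines should reproduce the behavior without a 1C server. It is a sketch under assumptions: the header name `SOAPAction` and the host `example.invalid` are placeholders that do not come from the report, and only the header value is taken from it. No network access is needed, because `putrequest` and `putheader` only buffer data until `endheaders` is called.

```python
import http.client

# Header value quoted in the report above.
METHOD_URI = "http://www.1c-bitrix.ru#SVD_ВебСервис:GetEmployee"


def start_request() -> http.client.HTTPConnection:
    # "example.invalid" is a placeholder host; nothing is sent over the
    # network because endheaders() is never called.
    conn = http.client.HTTPConnection("example.invalid")
    conn.putrequest("POST", "/")
    return conn


def str_value_is_rejected() -> bool:
    """Return True if putheader() raises for a str value with Cyrillic text."""
    conn = start_request()
    try:
        # "SOAPAction" is an assumed header name, used for illustration only.
        conn.putheader("SOAPAction", METHOD_URI)
    except UnicodeEncodeError:
        return True
    return False


def bytes_value_is_accepted() -> bool:
    """Return True if the same value, pre-encoded as UTF-8 bytes, is buffered."""
    conn = start_request()
    # putheader() only encodes str values; bytes are passed through as-is.
    conn.putheader("SOAPAction", METHOD_URI.encode("utf-8"))
    return True


if __name__ == "__main__":
    print("str value rejected:", str_value_is_rejected())
    print("UTF-8 bytes accepted:", bytes_value_is_accepted())
```

Pre-encoding the value to bytes is a possible workaround on the client side, not a fix in http.client itself, and it only helps if the server decodes header bytes as UTF-8.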