Closed vladimir-mencl-eresearch closed 2 years ago
Hi @zmousm , works all fine in my TEST environment - are you happy to merge it?
Cheers, Vlad
After looking through the way requests
handles encoding, I think using r.text
is fairly safe. It first tries to detect the encoding based on the content, if that fails it falls back to assuming utf-8, replacing decoding errors with U+FFFD
REPLACEMENT CHARACTER. This is more than good enough in this situation, as all the code is really doing is making a best effort at fixing up a mistake from the CAT API.
Thanks @James-REANNZ !
Also given that the condition gets only evaluated when Content-Type is text/html
, it makes sense to use r.text
- I'll go with that.
Cheers, Vlad
Just realised for Python2, using r.text
would be inconsistent with the Requests API:
'['
, '{'
are byte strings.But, Python2 string equality testing returns True
for b'a' == u'a'
... and the condition still works.
Given Python2 is EOL for >2 years, and the test still works there, we can merge with r.text
....
I think it's OK. You could also from __future__ import unicode_literals
but it is not necessary just for this.
The code in cat_user_api_proxy changes Content-Type from
text/html
toapplication/json
if it detects the payload returned is JSON - starts with'['
or'{'
.However, this code was missed in the Python3 overhaul.
In Python Requests, the type of response.content is bytes.
The existing comparison:
r.content[0] in ['{', '[']
works in Python2, where strings and bytes are not distinguished.However, in Python3,
r.content[0]
returns just the value of the first byte (91
), which does not match up with single-character unicode strings.Fix this to work in both Python2 and Python3 by:
So the new check, compatible with both Python2 and Python3, is:
r.content[0:1] in [b'{', b'[']