Closed ksmeeks0001 closed 3 years ago
@ksmeeks0001 Maybe the response data is gzip-compressed. If it's gzip-compressed, you have to use gzip.decompress(response.body).decode("utf-8"). You could wrap the decompress call in try-except and, if it raises, fall back to a plain decode without decompressing.
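A minimal sketch of that fallback, for illustration only. The byte 0x8b at position 1 in the reported UnicodeDecodeError matches the second byte of the gzip magic number (0x1f 0x8b), so this version checks for those two bytes instead of catching an exception. The helper name decode_body is hypothetical and not part of Selenium Wire; it assumes the body is either gzip-compressed or plain text in the given encoding.

```python
import gzip

# Every gzip stream starts with these two bytes.
GZIP_MAGIC = b"\x1f\x8b"


def decode_body(body: bytes, encoding: str = "utf-8") -> str:
    """Decode a response body to text, decompressing it first if it is gzip.

    Hypothetical helper: assumes gzip is the only compression in use.
    """
    if body[:2] == GZIP_MAGIC:
        body = gzip.decompress(body)
    return body.decode(encoding)
```

With Selenium Wire this would be called as decode_body(request.response.body). Other encodings such as brotli or deflate are not handled here.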
@ksmeeks0001 adding to @pawanpaudel93's comment, you can ask the server not to compress the response with the disable_encoding option:
options = {
    'disable_encoding': True  # Tell the server not to compress the response
}
driver = webdriver.Firefox(seleniumwire_options=options)
Also, before you attempt to decode the body, you need to ensure that it is actually a binary string. You should probably check the content type header first:
for request in browser.requests:
    if request.response and request.url[-3:] not in ['png', 'gif', 'jpg']:
        print(request.response.headers.get('Content-Type'))
        if request.response.headers.get('Content-Type', '').startswith('application/json'):
            print(request.url,
                  # response body is a bytes object that needs decoded to a string
                  request.response.body.decode('utf-8')
            )
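The content type check can also be pulled into a standalone helper. The sketch below is an assumption-laden illustration, not Selenium Wire API: decode_json_body is a hypothetical name, and it reads the charset from the Content-Type header (via the standard library's header parser) instead of always assuming UTF-8. It matches the media type exactly, so it is slightly stricter than the startswith check.

```python
import json
from email.message import Message


def decode_json_body(content_type, body, default_charset="utf-8"):
    """Return the parsed JSON if the header says JSON, otherwise None.

    Hypothetical helper, not part of Selenium Wire.
    """
    # Reuse the stdlib header parser to split media type and parameters.
    header = Message()
    header["Content-Type"] = content_type or ""
    if header.get_content_type() != "application/json":
        return None
    # Fall back to UTF-8 when the header carries no charset parameter.
    charset = header.get_content_charset() or default_charset
    return json.loads(body.decode(charset))
```

In the loop above it would be called as decode_json_body(request.response.headers.get('Content-Type', ''), request.response.body). It assumes the body is already uncompressed and non-empty.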
This problem occurs in version 3.0.2. Version 2.1.2 works without problems. In 3.0.2, request.response.body is b''.
b'' is a valid byte string and b''.decode('utf-8') shouldn't cause a problem.
I suspect this issue is happening because automatic body decoding has been switched off in 3.0.2, but you can still enable it manually with the disable_encoding option as described above. I'll look at switching body decoding back on again if it's causing issues.
Version 3.0.3 is now released, with automatic content decoding reinstated.
That doesn't seem to be a problem because version 3.0.3 doesn't work either.
@wkeeling, yes, the disable_encoding option was exactly what I needed. Thank you.
@Ylodi are you able to share your code? I think something else is perhaps happening.
Python 3.8.6 (Linux) Example code:
import json
from seleniumwire import webdriver as wire

def test_json_decode():
    driver = wire.Chrome()
    driver.get('https://gurushots.com/challenge/peaceful7/rank/top-photographer')
    request = driver.wait_for_request(
        '/rest/get_top_photographer',
        30
    )
    data = json.loads(request.response.body.decode('utf-8'))
    driver.close()

test_json_decode()
Thanks @Ylodi. The issue is due to an OPTIONS request made by Chrome just before it makes the real request. The response to the OPTIONS request has a zero-byte body, and Selenium Wire captures that and returns it from driver.wait_for_request(). To fix, add the ignore_http_methods option:
options = {
    'ignore_http_methods': ['OPTIONS']
}
driver = wire.Chrome(seleniumwire_options=options)
In versions before v3.0.0, Selenium Wire filtered out OPTIONS requests by default, but that was also causing some issues for people, so from v3.0.0 onwards Selenium Wire captures all requests, including OPTIONS. Given that OPTIONS requests are largely useless, perhaps it would be better if we reverted to filtering them by default and just made that clearer in the docs.
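If changing the driver options is not convenient, the same effect can be approximated on the client side by skipping OPTIONS requests and empty bodies when scanning the captured requests. This is only a sketch: first_real_response is a hypothetical helper, and the SimpleNamespace objects are stand-ins that mimic the method, url and response.body attributes of captured requests so the example runs without a browser.

```python
from types import SimpleNamespace


def first_real_response(requests, url_fragment):
    """Return the first matching request that is not OPTIONS and has a body.

    Hypothetical helper; returns None when nothing matches.
    """
    for request in requests:
        if url_fragment not in request.url:
            continue
        if request.method == "OPTIONS":
            continue  # preflight request, its response body is empty
        if request.response is None or not request.response.body:
            continue  # no response captured yet, or zero-byte body
        return request
    return None


# Stand-in data: a preflight OPTIONS request followed by the real request.
captured = [
    SimpleNamespace(method="OPTIONS", url="https://example.com/rest/get_top_photographer",
                    response=SimpleNamespace(body=b"")),
    SimpleNamespace(method="POST", url="https://example.com/rest/get_top_photographer",
                    response=SimpleNamespace(body=b'{"items": []}')),
]
match = first_real_response(captured, "/rest/get_top_photographer")
```

With a real driver the call would be first_real_response(driver.requests, '/rest/get_top_photographer'). Unlike driver.wait_for_request(), this does not wait, so the request must already have completed.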
Thanks, it's working now when OPTIONS requests are ignored.
I need to decode the response bodies in order to parse the JSON. I'm getting UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8b in position 1: invalid start byte
What can I do to get the AJAX response as a string?