Closed: shubham0204 closed this issue 4 years ago.
I have proposed a fix in PR #74
Hi gcheron, thank you so much for trying to fix it! I just checked your fixed code and copied it into google.py, but it still has the same problem:
```python
# parse() from the proposed fix in google.py; it relies on the json, re and
# BeautifulSoup imports already present at module level in google.py.
def parse(self, response):
    soup = BeautifulSoup(
        response.content.decode('utf-8', 'ignore'), 'lxml')
    image_divs = soup.find_all('script')
    for div in image_divs:
        txt = div.string
        # Only the AF_initDataCallback script tagged 'ds:1' carries the image data.
        if txt is None or not txt.startswith('AF_initDataCallback'):
            continue
        if 'ds:1' not in txt:
            continue
        # Strip the JavaScript wrapper so only the JSON payload remains.
        txt = re.sub(r"^AF_initDataCallback\({.*key: 'ds:(\d)'.+data:(.+)}\);?$",
                     "\\2", txt, 0, re.DOTALL)
        meta = json.loads(txt)
        data = meta[31][0][12][2]
        uris = [img[1][3][0] for img in data if img[0] == 1]
        return [{'file_url': uri} for uri in uris]
```
```
Exception in thread parser-002:
Traceback (most recent call last):
  File "C:\Users\Rentalhub\anaconda3\envs\joong\lib\threading.py", line 917, in _bootstrap_inner
    self.run()
  File "C:\Users\Rentalhub\anaconda3\envs\joong\lib\threading.py", line 865, in run
    self._target(*self._args, **self._kwargs)
  File "C:\Users\Rentalhub\anaconda3\envs\joong\lib\site-packages\icrawler\parser.py", line 104, in worker_exec
    for task in self.parse(response, **kwargs):
  File "C:\Users\Rentalhub\anaconda3\envs\joong\lib\site-packages\icrawler\builtin\google.py", line 157, in parse
    meta = json.loads(txt)
  File "C:\Users\Rentalhub\anaconda3\envs\joong\lib\json\__init__.py", line 348, in loads
    return _default_decoder.decode(s)
  File "C:\Users\Rentalhub\anaconda3\envs\joong\lib\json\decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "C:\Users\Rentalhub\anaconda3\envs\joong\lib\json\decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
```
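For anyone hitting the same JSONDecodeError, here is a minimal debugging sketch (the helper name and the idea of dumping the text are illustrative, not part of PR #74) that applies the same regex and prints what json.loads is actually being fed:

```python
import json
import re

def try_parse(txt):
    """Apply the same regex as the patched parse() and report why
    json.loads fails, if it does. Purely for debugging."""
    stripped = re.sub(
        r"^AF_initDataCallback\({.*key: 'ds:(\d)'.+data:(.+)}\);?$",
        "\\2", txt, 0, re.DOTALL)
    try:
        return json.loads(stripped)
    except json.JSONDecodeError as err:
        # 'Expecting value ... (char 0)' usually means the string is empty
        # or does not start with JSON at all.
        print("json.loads failed:", err)
        print("extracted text starts with:", repr(stripped[:200]))
        return None
```

An "Expecting value: line 1 column 1 (char 0)" error generally means the string handed to json.loads is empty or does not begin with JSON, so looking at the first characters of the extracted text usually shows whether the regex matched at all.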
Can confirm this PR fixes the issue for me.
@bar0191 Really? It still doesn't work for me. Do you use Windows or Linux?
This fixes it for me on Ubuntu 18.
This fix also worked for me. I use macOS and ran the tests in the virtual environment I created.
Resolved in #84
I have been using icrawler to scrape some images from Google Search. I have used this code, and the execution ends with this exception. I am using icrawler in Google Colab, hence with Python version 3.6.9, on the Google Chrome browser.
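For context, a minimal invocation along the lines of icrawler's documented GoogleImageCrawler usage looks like the sketch below; the keyword and output directory are placeholders, not the reporter's actual values.

```python
from icrawler.builtin import GoogleImageCrawler

# Placeholder values: 'downloads' and 'cats' are illustrative, not the
# reporter's actual directory or search keyword.
google_crawler = GoogleImageCrawler(storage={'root_dir': 'downloads'})
google_crawler.crawl(keyword='cats', max_num=20)
```

With the Google parser broken, a call like this is what ends up raising the JSONDecodeError inside the parser thread.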