Closed imfht closed 6 years ago
Use browser.newPage() to create a new page instance for each URL.
import asyncio
import uuid

from pyppeteer import launch

async def main():
    browser = await launch()
    for url in ['http://...', 'http://...', ...]:
        page = await browser.newPage()
        await page.goto(url)
        # await page.waitFor("body > div.footer-up")
        title = await page.title()
        await page.screenshot(options={'path': "/tmp/%s.png" % str(uuid.uuid5(uuid.NAMESPACE_URL, url))})
        print(title)
        await page.close()  # close each page when done so targets don't accumulate
    await browser.close()

asyncio.get_event_loop().run_until_complete(main())
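One detail worth noting in the snippet above: the screenshot path uses uuid.uuid5 with the URL namespace, which is deterministic, so the same URL always maps to the same filename and re-crawling a URL overwrites its old screenshot rather than piling up duplicates. A stdlib-only illustration:

```python
import uuid

# uuid5 is a SHA-1-based, name-derived UUID: the same namespace + name
# pair always yields the same value, so each URL gets a stable filename.
url = "http://example.com/page"
name_a = uuid.uuid5(uuid.NAMESPACE_URL, url)
name_b = uuid.uuid5(uuid.NAMESPACE_URL, url)
assert name_a == name_b  # deterministic across runs and processes

path = "/tmp/%s.png" % name_a
print(path)
```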
The code below fails after just a few iterations
import asyncio
from pprint import pprint

from pyppeteer import launch

async def main():
    browser = await launch({"headless": False})
    for i in range(100):
        pprint(i)
        page = await browser.newPage()
        await page.goto('https://github.com/miyakogi/pyppeteer/issues/85')
        await page.waitFor(1000)
        await page.screenshot(options={'path': "/tmp/" + str(i) + ".png"})
        await page.close()
    await browser.close()

asyncio.get_event_loop().run_until_complete(main())
with the following exception:

websockets.exceptions.ConnectionClosed: WebSocket connection is closed: code = 1006 (connection closed abnormally [internal]), no reason

Traceback (most recent call last):
  ...
  File "/usr/local/lib/python3.5/dist-packages/pyppeteer/connection.py", line 176, in send
    method), '): Session closed. Most likely the ', '{}'.format(self._targetType), ' has been closed.']))
pyppeteer.errors.NetworkError: Protocol Error (Target.activateTarget): Session closed. Most likely the page has been closed.
Can you tell me what's wrong?
Hi, I need to crawl lots of URLs using headless Chrome and Python. I think launching a browser for each URL is not a good idea.
I would appreciate any ideas. Thanks so much!
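Not from the original thread, but one common pattern for crawling many URLs with a single reused browser is to bound how many pages are open at once with an asyncio.Semaphore. A minimal sketch, where fetch_title is a stub standing in for the real newPage()/goto()/title()/close() calls (the pyppeteer interaction itself is assumed, not shown), and asyncio.run requires Python 3.7+:

```python
import asyncio

async def fetch_title(url):
    # Stand-in for the real pyppeteer calls:
    #   page = await browser.newPage()
    #   await page.goto(url)
    #   title = await page.title()
    #   await page.close()
    await asyncio.sleep(0.01)
    return "title of " + url

async def crawl(urls, max_pages=5):
    # Cap how many pages are in flight at once, so one browser is
    # reused without accumulating hundreds of open targets.
    sem = asyncio.Semaphore(max_pages)

    async def bounded(url):
        async with sem:
            return await fetch_title(url)

    # gather preserves input order in its result list
    return await asyncio.gather(*(bounded(u) for u in urls))

titles = asyncio.run(crawl(["http://a.example", "http://b.example"]))
print(titles)
```

On Python 3.5/3.6, replace asyncio.run with asyncio.get_event_loop().run_until_complete as in the snippets above.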