miyakogi / pyppeteer

Headless chrome/chromium automation library (unofficial port of puppeteer)

Page goto returns None #299

Open derlin opened 4 years ago

derlin commented 4 years ago

Hi. For some URLs, the goto method does not fail but still returns None.

Here is a minimal example (using pyppeteer 0.0.25 and Python 3.7.4):

import asyncio
from pyppeteer import launch

# URLs for which goto() returns None instead of a response
urls = [
    'http://www.swisscamps.ch/de/index.php',
    'http://www.whisky-club-oberwallis.ch/brennereien',
]

async def main():
    browser = await launch(headless=True)
    page = await browser.newPage()
    for url in urls:
        # Expected: a response object; observed: None
        response = await page.goto(url, waitUntil='networkidle0')
        print(url, response)
    await page.close()
    await browser.close()

loop = asyncio.get_event_loop()
loop.run_until_complete(main())

Output:

http://www.swisscamps.ch/de/index.php None
http://www.whisky-club-oberwallis.ch/brennereien None

With curl or the Chrome browser, I can still see the response headers and status codes for these URLs. Any idea where this comes from, or how to fix it?

derlin commented 4 years ago

After some investigation, it looks like a gzip decoding issue. Shouldn't this raise an exception?
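Until the cause is fixed, one possible workaround is to turn the silent None into an explicit error on the caller's side. The following is only a minimal sketch: `goto_or_raise` and `NavigationError` are hypothetical names, not part of pyppeteer, and the helper assumes nothing more than a page object with an awaitable `goto(url, **options)` method.

```python
class NavigationError(RuntimeError):
    """Hypothetical exception: navigation finished without a response."""


async def goto_or_raise(page, url, **options):
    """Navigate like page.goto(), but raise instead of returning None.

    `page` is assumed to expose an awaitable goto(url, **options),
    as a pyppeteer page does. This helper is not part of pyppeteer.
    """
    response = await page.goto(url, **options)
    if response is None:
        # goto() did not fail, yet produced no response object
        raise NavigationError(f'goto({url!r}) returned no response')
    return response
```

In the example above, `response = await goto_or_raise(page, url, waitUntil='networkidle0')` would replace the direct `page.goto` call. Note that upstream puppeteer documents a null result as legitimate for `about:blank` and for same-URL navigations that only change the hash, so raising unconditionally may be too strict for those cases.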

Mattwmaster58 commented 4 years ago

Hi @derlin, it looks like this project has been abandoned. You may want to consider the active fork, pyppeteer2. Your issue will persist with the latest version; however, once the migration to puppeteer 2.1.1 is complete, it will likely be solved.