Talking about the Docker image above: I talked to a Zyte support rep to tell them that
docker run $IMAGE_NAME -a $APIKEY
did not work with the instructions I followed from this repo. I tried running the sample script that was given and got the following error in the console:
Error: net::ERR_PROXY_CONNECTION_FAILED at https://toscrape.com/
The Zyte support rep (absolute chad) straight up gave me this to run instead:
docker run --name crawlera-headless-proxy -p 3128:3128 scrapinghub/crawlera-headless-proxy -d -u proxy.crawlera.com -o 8011 -a $APIKEY --direct-access-hostpath-regexps="(.pagead2.googlesyndication.com.$|.accounts.google.com.$|.dl.google.com.$|.clients2.google.com.$|.*?\.(?:txt|css|eot|svg|gif|ico|jpe?g|js|less|mkv|min|mp4|mpe?g|png|ttf|webm|webp|woff2?)$)" -x profile=desktop -x cookies=disable -x timeout=180000
AND IT WORKED. I was able to use the proxy with a Puppeteer headless browser, roughly as in the sketch below. If you're reading this, I hope it helps :)
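For reference, this is roughly how I pointed Puppeteer at the local proxy. It is a minimal sketch, not from the repo's instructions: it assumes the container above is running with port 3128 published on localhost, and that the proxy re-signs HTTPS traffic with its own certificate (which is why certificate errors are ignored; the exact launch option name can differ between Puppeteer versions).

// Minimal sketch: route Puppeteer through crawlera-headless-proxy on localhost:3128.
// Assumes the docker command above is already running with -p 3128:3128.
import puppeteer from 'puppeteer';

(async () => {
  const browser = await puppeteer.launch({
    headless: true,
    // Send all browser traffic through the local headless proxy.
    args: ['--proxy-server=http://localhost:3128'],
    // The proxy terminates TLS with its own certificate, so tell the
    // browser to accept it (newer Puppeteer versions may call this
    // acceptInsecureCerts).
    ignoreHTTPSErrors: true,
  });

  const page = await browser.newPage();
  await page.goto('https://toscrape.com/', { waitUntil: 'networkidle2' });
  console.log(await page.title());
  await browser.close();
})();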
https://hub.docker.com/r/scrapinghub/crawlera-headless-proxy