zytedata / zyte-smartproxy-plugin

A plugin for playwright-extra and puppeteer-extra to provide Smart Proxy Manager specific functionalities.
https://www.npmjs.com/package/zyte-smartproxy-plugin
MIT License

ERR_CERT_AUTHORITY_INVALID with axios but curl works fine #5

Closed yzpaul closed 10 months ago

yzpaul commented 10 months ago

I am trying to use the SmartProxyPlugin with puppeteer and I get the error net::ERR_CERT_AUTHORITY_INVALID at https://www.cvs.com. I have already installed the certs following the Zyte docs, and a curl request works fine.

Example curl format:

    curl --proxy proxy.crawlera.com:8011 --proxy-user MY_USER_ID: --compressed -X POST -H "Content-Type: application/octet-stream" --data foo https://httpbin.org/anything | jq .data

packages installed

    "puppeteer": "^19.11.1",
    "puppeteer-extra": "^3.3.6",
    "puppeteer-extra-plugin-proxy": "^1.0.2",
    "puppeteer-extra-plugin-stealth": "^2.11.2",
    "zyte-smartproxy-plugin": "^1.0.8",
    "zyte-smartproxy-puppeteer": "^1.0.18"

CERT VERIFICATION

INSTALL CERT
sudo cp zyte-cert.crt /usr/local/share/ca-certificates/zyte-ca.crt
sudo update-ca-certificates

Verify the cert is installed by running the command WITHOUT -k:
curl https://httpbin.org/get -U ZYTE_API_KEY: -vx proxy.crawlera.com:8011

CODE

import puppeteer from "puppeteer-extra";
const StealthPlugin = require("puppeteer-extra-plugin-stealth");
puppeteer.use(StealthPlugin());
//https://github.com/zytedata/zyte-smartproxy-plugin#quickstart-for-puppeteer-extra
const SmartProxyPlugin = require("zyte-smartproxy-plugin");

    let url=`https://www.cvs.com`

    puppeteer.use(SmartProxyPlugin({
       spm_apikey: process.env.ZYTE_PROXY_API_KEY,
       headers: {'X-Crawlera-No-Bancheck': '1', 'X-Crawlera-Profile': 'desktop', 'X-Crawlera-Cookies': 'disable'}, //use w/headless: true
       //static_bypass: false, //  enable to save bandwidth (but may break some websites)
       //spm_host:'proxy.crawlera.com:8014'
     }));

    const browser = await puppeteer.launch({
      headless: "new",
      defaultViewport: null
    });

    const page = await browser.newPage();

    await page.goto(url, {
      waitUntil: "domcontentloaded",
    });

    let pc=await page.content()
    console.log(`content len:`,pc.length,pc)
    if(pc.length<5000) throw new Error(`if it's too short it's VERY likely an error msg`)

yzpaul commented 10 months ago

Solved: the puppeteer launch config MUST include ignoreHTTPSErrors: true

    const browser = await puppeteer.launch({
      headless: "new",
      defaultViewport: null,
      ignoreHTTPSErrors:true
    });
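
For anyone landing here later, a minimal sketch of how the fix might be wrapped so it is not forgotten. The helper names (`buildLaunchOptions`, `looksLikeErrorPage`, `fetchThroughProxy`) are hypothetical and not part of puppeteer or zyte-smartproxy-plugin; the 5000-character threshold is just the value from the snippet above. It assumes the `puppeteer` argument is the puppeteer-extra instance with SmartProxyPlugin already registered. A likely reason curl works while the browser fails is that the browser does not consult the system CA bundle updated by update-ca-certificates, but that is an assumption, not something confirmed in this thread.

```javascript
// Hypothetical helper: merges caller overrides on top of defaults that
// include the fix from this thread (ignoreHTTPSErrors: true).
function buildLaunchOptions(overrides = {}) {
  return {
    headless: "new",
    defaultViewport: null,
    ignoreHTTPSErrors: true,
    ...overrides,
  };
}

// Hypothetical check mirroring the length test in the original snippet:
// very short HTML is treated as a probable error page.
function looksLikeErrorPage(html, minLength = 5000) {
  return html.length < minLength;
}

// `puppeteer` is passed in by the caller (assumed: puppeteer-extra with
// SmartProxyPlugin registered), so this sketch imports nothing itself.
async function fetchThroughProxy(puppeteer, url, overrides = {}) {
  const browser = await puppeteer.launch(buildLaunchOptions(overrides));
  try {
    const page = await browser.newPage();
    await page.goto(url, { waitUntil: "domcontentloaded" });
    const html = await page.content();
    if (looksLikeErrorPage(html)) {
      throw new Error(`content too short (${html.length} chars), likely an error page`);
    }
    return html;
  } finally {
    // Always release the browser, even when navigation or the check fails.
    await browser.close();
  }
}
```

Usage would look like `const html = await fetchThroughProxy(puppeteer, "https://www.cvs.com")`. Note that ignoreHTTPSErrors applies to every page the browser opens, so certificate problems on the target site itself would also be ignored.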