Closed SwitchGM closed 3 years ago
I think you need to handle certificate errors as shown here or use HTTP.
The following works for me:
const CDP = require('chrome-remote-interface');
const chromeLauncher = require('chrome-launcher');

(async function () {
    const chrome = await chromeLauncher.launch({
        port: 9222,
        chromeFlags: [
            '--headless',
            '--proxy-server=SCHEME://IP:PORT'
        ]
    });
    const client = await CDP();
    const {Page, Runtime, Security} = client;
    Security.certificateError(({eventId}) => {
        Security.handleCertificateError({
            eventId,
            action: 'continue'
        });
    });
    await Security.enable();
    await Security.setOverrideCertificateErrors({override: true});
    await Page.enable();
    await Page.navigate({url: 'https://whatismyip.akamai.com/'});
    await Page.loadEventFired();
    const {result: {value}} = await Runtime.evaluate({
        expression: 'document.body.innerHTML'
    });
    console.log(value);
    await chrome.kill();
})();
I've used the code you shared with the first 3 proxy servers I could find on http://www.freeproxylists.net/, but I still get my own IP back. Is there something I'm missing here? I'm running this locally on my machine.
To be absolutely certain, I've used chromeFlags: ['--headless', '--proxy-server=HTTP://13.92.119.142:80'].
Could I be misinterpreting the --proxy-server values?
Does it work if you manually start Chrome in that way then navigate to https://whatismyip.akamai.com/? I suspect that Windows overrides the proxy choice.
I tried launching a Chrome instance from the command line with the --proxy-server=HTTP://13.92.119.142:80 option, which failed to connect.
I then changed it to --proxy-server=13.92.119.142:80, removing the SCHEME. In that latter case I was able to connect through the proxy server. I then tried the script you shared, which fortunately worked.
I did have some issues along the way, such as delays in navigating to the page, which still persist (I assume this is normal when using proxies).
It does seem that overriding certificate errors is mandatory when doing this, as commenting out the relevant security code didn't seem to let me connect to the proxy.
P.S: for anyone stumbling on this with similar issues, check the code that cyrus-and provided, as well as the list of free proxy servers that I provided for testing with.
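On the certificate handling discussed above: newer Chrome versions expose a single protocol call that may replace the event-based override. The following is only a sketch under that assumption. `ignoreCertificateErrors` is a hypothetical helper name, and it assumes the connected Chrome supports `Security.setIgnoreCertificateErrors`.

```javascript
// Hypothetical helper, not part of chrome-remote-interface.
// Assumption: the connected Chrome supports Security.setIgnoreCertificateErrors.
async function ignoreCertificateErrors(client) {
    const {Security} = client;
    // Ask the browser to accept invalid certificates for this session.
    await Security.setIgnoreCertificateErrors({ignore: true});
}
```

It would be called right after `const client = await CDP();` and before `Page.navigate`. Since it turns off certificate validation entirely, it only makes sense for testing setups like the one in this thread.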
Those free proxies are not reliable at all, what do you really want to achieve, if I may ask? There could be some alternatives.
I did take a look at some private proxies, https://www.webshare.io/private-proxy. Would something like this be more reliable than the free ones? The project is just scraping sites.
If the goal is simply to hide your IP from the final host, you might consider using a VPN (there are some free ones up to a certain number of GB of traffic) or even TOR. Bear in mind that, especially in the former case (or with any other proxy provider, like the one you linked), real privacy cannot be achieved: you simply choose who to trust and cross your fingers. This might or might not be enough for you.
I've decided to give TOR a try. Are you aware of any examples or tutorials that I could follow to achieve this while still using CRI?
Just run TOR (a SOCKS5 proxy) then use --proxy-server=socks://localhost:9050.
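To confirm that traffic really goes through TOR once Chrome is started with that flag, something like the sketch below could be used. `isUsingTor` is a hypothetical helper, and the check relies on the assumption that the success page at https://check.torproject.org/ contains the word "Congratulations".

```javascript
// Hypothetical helper: reports whether check.torproject.org thinks
// the browser behind `client` is routed through TOR.
async function isUsingTor(client) {
    const {Page, Runtime} = client;
    await Page.enable();
    // Start waiting for the load event before navigating, to avoid missing it.
    const loaded = Page.loadEventFired();
    await Page.navigate({url: 'https://check.torproject.org/'});
    await loaded;
    const {result: {value}} = await Runtime.evaluate({
        expression: 'document.body.innerText'
    });
    // Assumption: the success page contains the word "Congratulations".
    return /Congratulations/.test(String(value));
}
```

The `client` argument is the object returned by `await CDP()`, as in the script earlier in this thread.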
Before I start using TOR, is there a way to specify which instance of Chrome (in this case a Chrome browser that uses a specific proxy server) my CRI code should use?
A similar thing is available with puppeteer, in which you can set the options for Chrome and then create a "browser" from those options.
const puppeteer = require('puppeteer');

(async () => {
    const options = {
        headless: true,
        args: [
            '--disable-gpu',
            '--no-sandbox'
        ],
    };
    const browser = await puppeteer.launch(options);
    // do puppeteer stuff
})();
As stated before, I'm using chrome launcher which (upon launching and passing it the relevant chrome options) returns what I assume is an instance of the browser.
let chrome = await chromeLauncher.launch({
    port: 9222,
    chromeFlags: ["--disable-gpu", "--headless", "--enable-logging"]
});
I'm wondering whether there is a similar way of doing this (like puppeteer) before I delve into using TOR?
Just use a different port (if you need to) and use that port with CRI:
const client = await CDP({port: 1234});
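Putting the two pieces together, a per-proxy Chrome instance could be wired up roughly as follows. This is a sketch, not an API of either library: `launchWithProxy` and its `deps` parameter are hypothetical, and it assumes that chrome-launcher picks a free debugging port when none is given and reports it as `chrome.port`.

```javascript
// Hypothetical helper: one Chrome instance per proxy, each on its own port.
// `deps` lets the two libraries be injected; by default they are required lazily.
async function launchWithProxy(proxy, deps = {}) {
    const chromeLauncher = deps.chromeLauncher || require('chrome-launcher');
    const CDP = deps.CDP || require('chrome-remote-interface');

    // No `port` option: chrome-launcher is assumed to choose a free one.
    const chrome = await chromeLauncher.launch({
        chromeFlags: ['--headless', `--proxy-server=${proxy}`]
    });
    // Point CRI at that specific instance via its debugging port.
    const client = await CDP({port: chrome.port});
    return {chrome, client};
}
```

A caller would then do `const {chrome, client} = await launchWithProxy('13.92.119.142:80');` and call `client.close()` and `chrome.kill()` when finished.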
This seems to work perfectly, thank you! I'm wondering whether CRI / CDP has a method that allows me to authenticate with a proxy username / password in headless mode? I'm receiving a 407 response (proxy authentication required).
EDIT: So far I have found this https://groups.google.com/a/chromium.org/g/headless-dev/c/KOR84u-FNU0/m/TGc6HVbwBAAJ, for the dev tools protocol, using Network.requestIntercept?
EDIT 2: https://chromedevtools.github.io/devtools-protocol/tot/Network/#type-AuthChallenge Here's the exact section in the CDP for authenticating proxies; unfortunately I'm not sure how to write this in code.
EDIT 3: https://github.com/cyrus-and/chrome-remote-interface/blob/master/lib/protocol.json I found AuthChallenge and AuthChallengeResponse in the CRI repo. I'm fairly sure that this is possible now, just not sure how to go about doing it when receiving a 401 or 407 error.
EDIT 4: Possibly getting closer here, using puppeteer as a reference as it has an #authenticate method for private proxy servers. After a bit of digging (I initially looked around inside the #authenticate method itself and wasn't able to make much sense of it) I found an auth method that seems to be related to proxies https://github.com/puppeteer/puppeteer/blob/49f25e2412fbe3ac43ebc6913a582718066486cc/utils/testserver/index.js#L188. The function seems to be used later here https://github.com/puppeteer/puppeteer/blob/49f25e2412fbe3ac43ebc6913a582718066486cc/src/common/NetworkManager.ts#L194; unfortunately I don't fully understand the parameters passed.
Sorry for the late reply, here you go, this should work:
const CDP = require('chrome-remote-interface');

const PROXY_AUTH = {
    username: 'user',
    password: 'password'
};

CDP(async (client) => {
    const {Fetch, Network, Page} = client;

    // provide credentials when needed
    Fetch.authRequired(({requestId}) => {
        Fetch.continueWithAuth({
            requestId,
            authChallengeResponse: {
                response: 'ProvideCredentials',
                ...PROXY_AUTH
            }
        });
    });

    // just continue any other requests
    Fetch.requestPaused(({requestId}) => {
        Fetch.continueRequest({requestId});
    });

    // enable requests interception
    await Fetch.enable({handleAuthRequests: true});

    // usual demo code below...
    Network.requestWillBeSent((params) => {
        console.log(params.request.url);
    });
    try {
        await Network.enable();
        await Page.enable();
        await Page.navigate({url: 'https://github.com'});
        await Page.loadEventFired();
    } catch (err) {
        console.error(err);
    } finally {
        client.close();
    }
}).on('error', (err) => {
    console.error(err);
});
Exactly what I was looking for, thank you again for the assistance
Is Chrome running in a container?
YES / NO
Attempting to connect to webpages through a free proxy server, using sites like:
https://www.proxynova.com/proxy-server-list http://www.freeproxylists.net/ https://free-proxy-list.net/
For this I've been using Chrome Launcher to set up the headless chrome instance, and then some really basic chrome remote interface stuff to test that the IP address has changed. I'm checking my IP address of the headless chrome instance using this website https://whatismyipaddress.com/. Following similarly to https://www.youtube.com/watch?v=wAyocwixpFA.
Code that I'm using to connect, use proxy and check IP
In this case, with and without the chrome flag --proxy-server=PROXY_IP:PROXY_PORT, I always retrieve my IP address, and never get the address of any of the proxy servers. I've also tried different variations of the --proxy-server option. You can find the three here: https://www.chromium.org/developers/design-documents/network-settings#TOC-Command-line-options-for-proxy-settings
--proxy-server=PROXY_IP:PROXY_PORT, e.g. (--proxy-server=255.255.255.255:8080)
--proxy-server=SCHEME=PROXY_IP:PROXY_PORT, e.g. (--proxy-server=https=255.255.255.255:8080)
--proxy-server=LINK_TO_PROXY, e.g. (https://255.255.255.255:8080)
I've tried all of these, including adding "double quotes" around the values, and all have given me the same result. I've also tried using a random string for the proxy IP (in an attempt to get an error), and again get the same result. I'm not sure whether this is a chrome-launcher issue or a chrome-remote-interface issue either. In any case, I never receive an error from chrome-launcher or chrome-remote-interface, and the script just spits out my IP address.
Any help with this would be greatly appreciated, if you need any more information I can provide this asap.
EDIT: For anyone with the same issue, I was able to solve it using --proxy-server=PROXY_IP:PROXY_PORT, e.g. (--proxy-server=255.255.255.255:8080), and the security code provided below by cyrus-and. I would suggest using a free proxy from http://www.freeproxylists.net/ for testing; at least proxies from there worked for me.
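As a minimal sketch of the flag form that ended up working here (bare IP:PORT, no scheme), the helper below builds the chromeFlags array for chrome-launcher. `proxyFlags` is a hypothetical name, not part of chrome-launcher, and the address is the same placeholder used above.

```javascript
// Hypothetical helper: builds chrome-launcher flags for an HTTP proxy.
// Uses the bare HOST:PORT form, which is the one that worked in this thread.
function proxyFlags(host, port, extraFlags = ['--headless']) {
    if (!host || !Number.isInteger(port) || port < 1 || port > 65535) {
        throw new Error(`invalid proxy address: ${host}:${port}`);
    }
    return [...extraFlags, `--proxy-server=${host}:${port}`];
}

console.log(proxyFlags('255.255.255.255', 8080));
```

The returned array can be passed directly as the `chromeFlags` option of `chromeLauncher.launch`.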