Open adudew852 opened 1 year ago
Hi @adudew852, you can take a look at the following:
Thanks. I tried the usual way of adding the proxy to my Python + Playwright code, but the page does not load. It loads fine if I remove the proxy.
I used the API key as the username and left the password empty. Any idea why? Many thanks in advance.
Code below for your reference.
async def main():
    async with async_playwright() as p:
        browser = await p.webkit.launch(
            headless=False,
            slow_mo=50,
            proxy={
                "server": 'proxy.zyte.com:8011',
                "username": 'API Key',
                "password": '',
            }
        )
        context = await browser.new_context()
        page = await context.new_page()
        await stealth_async(page)
        response = await page.goto(url, timeout=60 * 1000)
        print(response.headers)
        await page.screenshot(path="demo.png")
        await asyncio.sleep(500)
        # browser.close()

asyncio.run(main())
Thanks for sharing the code @adudew852. The above code works for me as expected.
What is the error you receive when using proxy?
I run into the below error...
playwright._impl._api_types.Error: Failure when receiving data from the peer
Is that the complete error message? Also, have you installed the certificate needed to access HTTPS pages via SPM?
Yes, I have installed the certificate. Here's the error message. Thanks.
Traceback (most recent call last):
File "C:\Users\admin\PycharmProjects\ABC\pw_buy-zyte.py", line 93, in
=========================== logs ===========================
navigating to "https://www.google.com", waiting until "load"
============================================================
Do you receive the same error with other URLs?
@storymode7 Yes, same error with other URLs. I was wondering whether the way I pass an empty password is incorrect. This is what I currently do:
"password": '',
That shouldn't be a problem. I'm using the script below to try to reproduce the issue. Could you confirm you receive an error with this too?
import asyncio
from playwright.async_api import async_playwright
from playwright_stealth import stealth_async

url = "https://quotes.toscrape.com/"

async def main():
    async with async_playwright() as p:
        browser = await p.webkit.launch(
            headless=False,
            slow_mo=50,
            proxy={
                "server": "proxy.zyte.com:8011",
                "username": "API Key",
                "password": "",
            }
        )
        context = await browser.new_context()
        page = await context.new_page()
        await stealth_async(page)
        response = await page.goto(url)
        print(response.headers)
        await page.screenshot(path="demo.png")
        # await asyncio.sleep(50)
        # browser.close()

asyncio.run(main())
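To avoid pasting a real key into the script, the proxy settings can be built from an environment variable. This is a minimal sketch, assuming the key is stored in a variable named `ZYTE_API_KEY` (a hypothetical name chosen for this example, as are the two function names) and using the same server address as the script above:

```python
import os

PROXY_SERVER = "proxy.zyte.com:8011"  # same address as in the script above


def build_proxy_settings(api_key: str) -> dict:
    """Return a dict in the shape passed to launch(proxy=...) above."""
    api_key = api_key.strip()
    if not api_key or api_key == "API Key":
        # Catch the placeholder being left in by accident.
        raise ValueError("set a real API key before launching the browser")
    # The password stays empty, as in the original script.
    return {"server": PROXY_SERVER, "username": api_key, "password": ""}


def proxy_settings_from_env(var_name: str = "ZYTE_API_KEY") -> dict:
    """Read the key from an environment variable (assumed name, not a Zyte convention)."""
    return build_proxy_settings(os.environ.get(var_name, ""))
```

In the script, the call would then look like `p.webkit.launch(headless=False, slow_mo=50, proxy=proxy_settings_from_env())`.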
Thanks for following up. Not sure what went wrong. I reinstalled all the relevant packages with pip and ran your code, and still got the same error. Switching the browser to Firefox gave the same error. Strangely, with Chromium the page loaded but then failed with a timeout error. Error log below. Thanks again for the help on this.
Traceback (most recent call last):
File "C:\Users\admin\PycharmProjects\ABC\zyte-test.py", line 28, in
navigating to "https://quotes.toscrape.com/", waiting until "load"
============================================================
Process finished with exit code 1
I'm sorry but I'm unable to reproduce the issue.
Could you try the following things:
curl -LvU API_KEY: -x proxy.zyte.com:8011 'http://quotes.toscrape.com'
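For reference, curl's `-U API_KEY:` sends the API key as the proxy username with an empty password, which is what the Playwright `proxy` settings express as well. Below is a minimal sketch of how that looks as a `Proxy-Authorization` header value, assuming the proxy uses standard HTTP Basic authentication. The function name and the `API_KEY` placeholder are illustrative and not part of any Zyte library.

```python
import base64


def proxy_authorization_header(username: str, password: str = "") -> str:
    """Build an HTTP Basic Proxy-Authorization header value.

    Hypothetical helper for illustration; an empty password is encoded
    as "username:" with nothing after the colon.
    """
    credentials = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")


# "API_KEY" is a placeholder, as in the curl command.
header_value = proxy_authorization_header("API_KEY")
print(header_value)
```

An empty password is valid in this scheme, so `"password": ""` in the Playwright settings should not, by itself, be the cause of the failure.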
Thanks for this. I reinstalled all the relevant packages and the SSL certificate and finally got the proxy working, but I have now run into the issues below. Are you able to help?
1) CORS issues: Playwright blocks a fetch request that gets authentication tokens from a different domain. I would have expected the Zyte proxy to be able to solve this type of CORS issue.
2) I added a number of X-Crawlera headers to the request per https://docs.zyte.com/smart-proxy-manager.html#request-headers, but these headers are passed directly to the target website, which caused the HTTP request to be blocked.
3) Possibly because of issues 1 and 2, the JavaScript on the page is not loaded by Playwright.
import asyncio
from playwright.async_api import async_playwright
from playwright_stealth import stealth_async

url = "https:// < url > "

async def main():
    async with async_playwright() as p:
        browser = await p.chromium.launch(
            headless=False,
            slow_mo=50,
            proxy={
                "server": "proxy.zyte.com:8011",
                "username": "<api key>",
                "password": "",
            }
        )
        context = await browser.new_context()
        await context.set_extra_http_headers({
            'X-Crawlera-Profile': 'mobile',
            'X-Crawlera-Profile-Pass': 'it_IT',
            'X-Crawlera-No-Bancheck': '1',
            'X-Crawlera-Cookies': 'disable',
            'X-Crawlera-Session': 'create',
        })
        page = await context.new_page()
        await stealth_async(page)
        response = await page.goto(url)
        print(response.headers)
        await page.screenshot(path="demo.png")
        await asyncio.sleep(500 * 1000)
        # browser.close()

asyncio.run(main())
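`set_extra_http_headers` attaches the headers to every request made by pages in the context, including requests to other domains. One thing that may be worth trying is to add the `X-Crawlera-*` headers only to navigation requests through a route handler. This is a sketch under that assumption, not a confirmed fix: the handler name and the shortened header set are made up for this example, and it does not change what the proxy does with the headers before forwarding.

```python
# Example subset of the headers from the script above.
SPM_HEADERS = {
    "X-Crawlera-Profile": "mobile",
    "X-Crawlera-Cookies": "disable",
}


async def add_headers_to_navigation(route):
    """Playwright route handler: add SPM_HEADERS to navigation requests only."""
    request = route.request
    if request.is_navigation_request():
        # Merge with the request's own headers so nothing is dropped.
        await route.continue_(headers={**request.headers, **SPM_HEADERS})
    else:
        # Sub-resources and fetch/XHR calls go out unchanged.
        await route.continue_()


# Usage inside main(), in place of context.set_extra_http_headers(...):
#     await page.route("**/*", add_headers_to_navigation)
```

Checking the request headers in the browser's network panel before and after the change should show whether the cross-domain fetch still carries the extra headers.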
Hi @adudew852,
I'm able to run the above script as is with "http://quotes.toscrape.com/" after commenting out the asyncio.sleep call.
Were you able to reproduce the issue with the suggestions from my last comment?
This could be a Playwright issue, AFAIK. Have you tried any other proxies with this setup?
Hi - my script is written in Python. Is there a Python version of this plugin, so that I can easily integrate the smartproxy into my existing program? Thanks.