Closed XD-coder closed 2 months ago
Playwright support for logins.
Hi, @XD-coder
In v0.1.7, a new class, ParseraScript, was added. It allows executing custom Playwright scripts during scraping.
For example, you can log in to parsera.org and get your number of credits with the following code:
from playwright.async_api import Page
from parsera import ParseraScript

# EMAIL, PASSWORD, and model are assumed to be defined earlier.

# Define the script to execute during the session creation
async def initial_script(page: Page) -> Page:
    await page.goto("https://parsera.org/auth/sign-in")
    await page.wait_for_load_state("networkidle")
    await page.get_by_label("Email").fill(EMAIL)
    await page.get_by_label("Password").fill(PASSWORD)
    await page.get_by_role("button", name="Sign In", exact=True).click()
    await page.wait_for_selector("text=Playground")
    return page

# This script is executed after the url is opened
async def repeating_script(page: Page) -> Page:
    await page.wait_for_timeout(1000)  # Wait one second for page to load
    return page

# Top-level await: run this inside an async function or a notebook
parsera = ParseraScript(model=model, initial_script=initial_script)
result = await parsera.arun(
    url="https://parsera.org/app",
    elements={
        "credits": "number of credits",
    },
    playwright_script=repeating_script,
)
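The example above uses EMAIL and PASSWORD without showing where they come from. One way to supply them without hard-coding secrets is to read them from environment variables. This is a minimal sketch, not part of Parsera: the helper `load_credentials` and the variable names `PARSERA_EMAIL` and `PARSERA_PASSWORD` are hypothetical and chosen only for illustration.

```python
import os
from typing import Tuple


def load_credentials(
    email_var: str = "PARSERA_EMAIL",        # hypothetical variable name
    password_var: str = "PARSERA_PASSWORD",  # hypothetical variable name
) -> Tuple[str, str]:
    """Return (email, password) read from the environment.

    Raises RuntimeError naming every variable that is unset or empty,
    so a missing secret fails early instead of during the login script.
    """
    missing = [name for name in (email_var, password_var) if not os.environ.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return os.environ[email_var], os.environ[password_var]
```

With both variables exported in your shell, `EMAIL, PASSWORD = load_credentials()` before defining `initial_script` would provide the values the login script fills in.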
Websites that require a login are a huge pain in the @ss. I think it would be a good idea to use an LLM to find where to enter the user details, or which HTTPS request to send to log in.
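To make the idea concrete, here is a rough sketch of the first half of it: collecting the form controls on a page and formatting them as a question a model could answer. It is only an illustration, not something Parsera does. The names `InputCollector`, `describe_fields`, and `build_login_prompt` are hypothetical, it uses only the Python standard library, and the actual LLM call is left out.

```python
from html.parser import HTMLParser
from typing import Dict, List


class InputCollector(HTMLParser):
    """Collect the attributes of every <input> and <button> start tag."""

    def __init__(self) -> None:
        super().__init__()
        self.fields: List[Dict[str, str]] = []

    def handle_starttag(self, tag, attrs):
        if tag in ("input", "button"):
            # attrs is a list of (name, value) pairs; value is None for bare attributes
            self.fields.append({"tag": tag, **{k: v or "" for k, v in attrs}})


def describe_fields(html: str) -> List[Dict[str, str]]:
    """Return one dict per form control found in the HTML, in document order."""
    collector = InputCollector()
    collector.feed(html)
    return collector.fields


def build_login_prompt(html: str) -> str:
    """Format the collected controls as a numbered list inside a prompt."""
    lines = ["Which of these elements are the username, password and submit controls?"]
    for index, field in enumerate(describe_fields(html)):
        attrs = ", ".join(f"{key}={value!r}" for key, value in field.items())
        lines.append(f"{index}: {attrs}")
    return "\n".join(lines)
```

The model's answer could then be mapped back to Playwright locators, for example in a script like `initial_script` above. Pages that render the form with JavaScript would need the HTML taken from the live page rather than the raw response.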