import asyncio
import threading

from playwright.async_api import async_playwright

lst = [
    'https://www.sncf-connect.com/aide/contact#conseiller',
    'https://www.sncf-connect.com/train/bons-plans/budget-mobilite?prex=homepage_footer',
    'https://www.sncf-connect.com/aide/le-paiement-de-vos-billets-de-train-modes-de-paiement-acceptes',
    'https://www.sncf-connect.com/conditions-generales-presentation-offres-agence'
]


async def run():
    async with async_playwright() as p:
        browser = await p.chromium.launch(
            channel='chrome',
            args=[
                "--no-sandbox",
                "--disable-dev-shm-usage",
                "--blink-settings=imagesEnabled=false"
            ],
            headless=False
        )
        # Scrape all URLs concurrently in the same browser instance.
        await asyncio.gather(*(_scrape(browser, j) for j in lst))


async def _scrape(browser, url):
    context = await browser.new_context()
    page = await context.new_page()
    async with page:
        try:
            await page.goto(url)
            await page.evaluate("window.scrollTo(0, document.body.scrollHeight)")
            html = await page.content()
        except Exception as e:
            html = ""
    return html


def test_async():
    asyncio.run(run())


# Run the event loop in a child thread instead of the main thread.
p = threading.Thread(target=test_async)
p.daemon = True
p.start()
p.join()
Expected behavior
The script should run to completion and the process should exit.
Actual behavior
When I call asyncio.run(run()) directly from the main thread, the script works and exits normally.
When I run the same code in a child thread, the whole process hangs forever.
Execution reaches the end of the async_playwright context manager and goes no further. In the background, all Chrome processes have already exited.
It looks like the main thread never detects that the child thread has finished. Why is the thread still considered running when the crawl has completed and the browser has exited?
I hope someone can help me solve this problem. Thank you.
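To narrow down where the child thread is stuck, a diagnostic along these lines might help. This is only a sketch: the helper names run_in_thread and dump_thread_stacks are hypothetical and not part of Playwright, and the timeout is an arbitrary assumption. It joins the thread with a timeout and, if the thread is still alive, collects a stack trace for every running thread.

```python
import sys
import threading
import traceback


def dump_thread_stacks():
    """Return the current stack of every live thread as one string."""
    lines = []
    for thread_id, frame in sys._current_frames().items():
        lines.append("Thread %d:\n" % thread_id)
        lines.extend(traceback.format_stack(frame))
    return "".join(lines)


def run_in_thread(target, timeout):
    """Run target in a daemon thread (hypothetical helper).

    Returns (finished, stacks): finished is True if the thread ended
    within timeout seconds; otherwise stacks holds a dump of all threads.
    """
    worker = threading.Thread(target=target, daemon=True)
    worker.start()
    worker.join(timeout)
    if worker.is_alive():
        # The worker did not finish in time: record where each thread is.
        return False, dump_thread_stacks()
    return True, ""
```

With the script above, calling run_in_thread(test_async, 60) and printing the returned stacks when the first value is False should show which await the worker thread is blocked on.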
Version
1.35.0
Steps to reproduce
Python version: 3.7.9
Additional context
No response
Environment