Open samwillis opened 2 years ago
Having the same issue where SIGINT closes the browser even with handle_sigint=False
. I have a long running program that would appreciate graceful shutdown, I do so with loop.add_signal_handler()
in asyncio, but SIGINT still kills the browser process.
Not as handy as SIGINT
, but I use SIGTERM
and it works doesn't kill the browser: https://github.com/ail-project/lacus/blob/main/bin/capture_manager.py#L66 & https://github.com/ail-project/lacus/blob/main/bin/stop_capture_manager.py
while True:
do_stuff(playwright)
sleep(long_time)
Look at this beautiful loop. I cancel it with Ctrl+C, then I handle the KeyboardInterrupt exception.
Except Ctrl+C messes with the playwright browser. That's not nice.
Alright, let's do handle_sigint=False, handle_sigterm=False
Alas, to no avail. God please send divine knowledge and wisdom to my python code.
@blksith0 as far as I can tell, it is a bug in Playwright, you cannot solve it on your side.
If I may, I'd recommend you to use SIGTERM
instead of SIGINT
(as I explained in the message above), and it then works fine. But you cannot use Ctrl+C.
@Rafiot Yes, I see that. I'm just not smart enough to know how I can send a SIGTERM to my program and use that to stop the loop.
It would be really great if I could terminate my loop and not have it kill the playwright browser. Can you please look into this?
While I'm aware what I'm going to write is not an exact solution to the mentioned problem, I've got a slightly different perspective due to the "other" stuff going on with playwright. Exactly like yous I needed these feature, because I wanted to handle playwright error separately and do something except for ctrl-c etc (mainly SIGINTs and SIGTERMs). I would expect them to generate asyncio.CancelledError
which doesn't derive from Exception
and is thus not accidentally caught unless you wanted to, and hence ending the application gracefully. That doesn't happen because while every other await gets the cancelled-error, awaits on playwright stuff gets a normal playwright-error about target being closed and so on.
However, playwright has another bug (hopefully soon to be solved after 2.5 yrs of it being an open issue) - it mem-leaks. It's a big thread but I face most problems mentioned there for long running applications. The TL;DR from there is that the browser will unexpectedly close after certain num of context-creates and destroys and mem just keeps rising (so it sometimes closes and sometimes crashes with the app with OOM in dockers/fargates but amounts to the same thing - the browser's gone).
Which means that the browser close not only happens on ctrl-c etc. but can happen due to other bugs like above. So instead of waiting for the solution from here (for ctrl-c to not raise playwright error due to browser close), I ended up doing the less desirable thing of catching playwright error, inspecting the string message and if it's anything like regex r'Target.*close.*'
(because there seems to be some permutations like Target closed.
, Target page, context or browser closed.
and so on) then raise the asyncio.CancelledError
myself from that exception.
Ofc this may not be suitable for everyone as their use case might be different, but for someone who otherwise wants to do something separate for playwright-errors except browser close (e.g you might want to backoff and retry if doing a web-scrape) and interpret browser-close as an unexpected irrecoverable scenario, then this is perhaps the most straightforward solution (unless somebody knows a better way ofc).
@ustulation that's also what I'm doing. In case you want a pretty complete list of possible errors, I have one there: https://github.com/Lookyloo/PlaywrightCapture/blob/main/playwrightcapture/capture.py#L837
following.
Still not fixed! Can not stop playwright correctly after KeyboardInterrupt, and infinite awation produced.
finally:
print('Closing...')
_BLOCK_IMAGES = _BLOCK_IMAGES_old
try:
if (context is not None):
await context.close()
context = None
if (playwright is not None):
await playwright.stop()
playwright = None
print(context, playwright)
except:
pass
print('Done.[Closing]')
Hello folks,
Here's an OS tip related to this issue.
Have you heard about PGID (Process Group ID)? When you execute your program in a terminal, and your program fork()-ed multiple child processes, than all of the processes have the same PGID unless you change it explicitly.
The OS kernel knows your processes are running at the foreground of the terminal and it also knows the PGID of them.
IF you type Ctrl+c in a terminal, the terminal driver of the OS kernel sends SIGINT signals to all processes that belong to the PGID.
If your program imports playwright package, and it creates playwright instances, some of playwright scripts are executed in a child process, and browser process runs in a child process also.
In this setup, if you hit Ctrl+c, then all of these processes receive SIGINT signal simultaneously. Even if your main program handles SIGINT signal and does some cleanup including shutting down playwright, you can not stop SIGINT not to signal the child processes.
So in this case, just send SIGTERM to your main process instead of hitting ctrl+c in the terminal.
I hope this tip helps.
I have been using Playwright with the Scrapy web scraping framework, this is the plugin: https://github.com/scrapy-plugins/scrapy-playwright
Scrapy is designed to cleanly shutdown on SIGINT, saving its crawl state so that it can be resumed. When used with Playwright the SIGINT immediately closes the browsers (with both
handle_sigint=False
orTrue
), this results in the currently processing pages crashing and the shutdown failing with the state not being saved. The ideal way to fix this is to prevent the SIGINT from being passed to the browsers so that scrapy-playwright can handle their clean shutdown as active pages finish being processed.I have successfully made this work on posix by monkey patching Playwright based on this from stackoverflow to include a
preexec_fn
:What would be ideal is either a flag that could be set to enable this (for both Posix and Windows) or a way to pass a custom
preexec_fn
when starting a browser.See the related bug on scrapy-playwright https://github.com/scrapy-plugins/scrapy-playwright/issues/62