Closed: dvic closed this issue 1 year ago
Hi @dvic
could you give me a minimal example to reproduce this?
iex(1)> ChromicPDF.start_link(on_demand: true)
{:ok, #PID<0.314.0>}
iex(2)> {:ok, _} = ChromicPDF.print_to_pdf({:html, "test"}); nil
nil
iex(3)> flush
:ok
Would have expected to receive the :EXIT
message here. Am I missing anything?
The trick is to trap exits:
iex(1)> Mix.install([{:chromic_pdf, "~> 1.7.0"}])
:ok
iex(2)> Process.flag(:trap_exit, true)
false
iex(3)> ChromicPDF.start_link(on_demand: true)
{:ok, #PID<0.207.0>}
iex(4)> {:ok, _} = ChromicPDF.print_to_pdf({:html, "test"}); nil
nil
iex(5)> flush
{:EXIT, #PID<0.209.0>, :normal}
:ok
iex(6)>
Maybe I'm mistaken here and it's completely fine like this, because you're supposed to ignore exits with reason :normal? It's just that I was not expecting this situation to be triggered by a call to ChromicPDF.print_to_pdf.
Oh, you're trapping the exit signal. Yes, in that case this makes sense. The supervisor that is spawned is linked to your client process and is terminated again with reason :normal, so this signal is normally ignored by your process. That's not particular to ChromicPDF; this does the same:
iex(1)> Process.flag(:trap_exit, true)
false
iex(2)> spawn_link(fn -> :ok end)
#PID<0.113.0>
iex(3)> flush
{:EXIT, #PID<0.113.0>, :normal}
:ok
Compare this to without trap_exit, where you don't get a message, but the signal is handled (i.e. :normal reason -> ignore, otherwise crash). With trap_exit, the signal is converted to a message, so yes, I'd say that you're supposed to ignore these. Also, other libraries may exhibit similar behaviour if they spawn processes, right? Say, something that spawns a Task or so.
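The Task case can be shown directly: Task.async/1 links the task process to the caller, so a client that traps exits will find the task's :normal exit as a message in its mailbox (a minimal sketch, not ChromicPDF-specific):

```elixir
# A process that traps exits receives {:EXIT, pid, :normal} from any
# linked process that finishes -- including a Task.
Process.flag(:trap_exit, true)

task = Task.async(fn -> :done end)
:done = Task.await(task)

# Task.async linked the task to us; its normal exit was converted into
# a mailbox message instead of being handled (ignored) as a signal.
task_pid = task.pid

receive do
  {:EXIT, ^task_pid, :normal} -> IO.puts("got {:EXIT, task_pid, :normal}")
after
  1_000 -> IO.puts("no EXIT message")
end
```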
However, the more I think about it, the more sceptical I am about the current implementation as well. Without on_demand, any errors inside ChromicPDF (e.g. a connection crash; at some point the supervisor itself will give up and exit) would not terminate the client process, as the ChromicPDF supervisor and children are not linked to it. In on_demand mode they are, which is a weird misalignment. Maybe this should be refactored; I imagine something like this would work:
ChromicPDF.OnDemandSupervisor
├─ <the agent that keeps the config>
└─ <a DynamicSupervisor to start the actual supervision tree dynamically>
This way the "on demand" tree wouldn't be linked to the client process. What do you think?
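A rough sketch of what the proposed layout could look like (module and process names here are made up for illustration; this is not ChromicPDF's actual code):

```elixir
# Hypothetical sketch of the proposed on-demand tree: an Agent for the
# config plus a DynamicSupervisor that hosts the real tree per request.
defmodule OnDemand.Supervisor do
  use Supervisor

  def start_link(config) do
    Supervisor.start_link(__MODULE__, config, name: __MODULE__)
  end

  @impl true
  def init(config) do
    children = [
      # The agent that keeps the config for later on-demand starts.
      %{
        id: :config,
        start: {Agent, :start_link, [fn -> config end, [name: OnDemand.Config]]}
      },
      # The DynamicSupervisor to start the actual supervision tree
      # dynamically; its children are linked to it, not to the client.
      {DynamicSupervisor, strategy: :one_for_one, name: OnDemand.DynSup}
    ]

    Supervisor.init(children, strategy: :one_for_one)
  end
end
```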
Yeah, that makes sense. I'm going to change the behaviour to ignore the :normal exit reasons.
Indeed, I hadn't thought about this other implication. The proposed solution sounds like a good idea to me. However, it would only not be linked to the caller if max_restarts is not exceeded, right? I'm not sure how DynamicSupervisor differs from regular supervisors in terms of child restarts / error propagation.
However, it would only not be linked to the caller if max_restarts is not exceeded, right?
No, there would be no link between the client and the ChromicPDF process spawned by the DynamicSupervisor. DynamicSupervisor is a GenServer, and the process is spawned in the start_child callback, not by the caller, see here. I'd interpret max_restarts as the value after which the DynamicSupervisor itself gives up and crashes when one of its children keeps failing (a process linked to the DynamicSupervisor would notice that, but I don't think this is what you meant).
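This can be checked in a few lines: a child started through DynamicSupervisor.start_child/2 is linked to the DynamicSupervisor process, so its exit never reaches the caller (a sketch, with an arbitrary short-lived Task standing in for the ChromicPDF tree):

```elixir
# The caller traps exits, so any :EXIT signal it receives would show
# up as a message in its mailbox.
Process.flag(:trap_exit, true)

{:ok, sup} = DynamicSupervisor.start_link(strategy: :one_for_one)

# Start a short-lived child; :temporary so the supervisor does not try
# to restart it after its normal exit.
{:ok, _child} =
  DynamicSupervisor.start_child(sup, %{
    id: :short_lived,
    start: {Task, :start_link, [fn -> :ok end]},
    restart: :temporary
  })

# Give the child a moment to finish, then inspect the mailbox: the
# child was linked to `sup`, not to us, so no :EXIT message arrives.
Process.sleep(50)

receive do
  {:EXIT, _pid, _reason} -> IO.puts("unexpected EXIT message")
after
  0 -> IO.puts("no EXIT message from the child")
end
```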
Either way, I'd consider the on_demand implementation "ok for now" and close this issue.
@dvic Maybe you've seen it already: had some fun yesterday with DynamicSupervisor & co, but realized that linking the client process to the temporary browser process is actually a pretty good idea, as we might otherwise leave Chrome instances around when the client process dies (e.g. killed by an external watchdog / timeout). So, to answer your original question: yes, this is intended behaviour :upside_down_face: Only took me a while to figure this out.
Hah okay nice, one thing to consider you could also do here: instead of creating a link, you could monitor the client process and shutdown the supervisor whenever it dies.
Yes, I considered that, but it's like 10x more complex for ... what exactly? Only for @dvic not to get EXIT messages because he's trapping exits :smile: IMO, the behaviour of the linked process is exactly what we want: if the client goes down, we want to pull down the temporary Chrome instance as well. It also makes sense to me semantically. Maybe we should just put a note in the docs mentioning this.
Yes, I considered that but it's like 10x more complex for ... what exactly? Only for @dvic not to get EXIT messages because he's trapping exits 😄
Hah fair enough 😂
IMO, the behaviour of the linked process is exactly what we want, if the client goes down, we want to pull down the temporary Chrome instance as well. Also semantically makes sense for me.
I agree that if the client goes down, we want to pull down the temporary Chrome instance as well, but what if the temporary Chrome instance crashes (so the reason is not :normal)? Do you then still expect the client process to crash as well (assuming it is not trapping exits)?
Maybe we should just put a note in the docs mentioning this.
I saw it in the PR, good idea 👍
Do you then still expect the client process to crash as well?
Yes, it would and IMO that's a reasonable default behaviour, especially considering there is no other error handling in place. I agree that raising an exception would be nicer to the client as it could be more easily rescued. But I'm not sure if the more complicated manual monitoring setup (both ways) would be worth it. :shrug: Would be happy to accept PRs though.
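For reference, the monitor-based setup mentioned above could look roughly like this (a hypothetical sketch, not ChromicPDF code): a separate process monitors the client and stops the temporary supervisor when the client goes down, without linking the two. Note the trade-off discussed above: a crash inside the tree would then no longer propagate to the client.

```elixir
defmodule ClientWatchdog do
  @moduledoc "Hypothetical sketch: stop a supervisor when a client process dies."

  def watch(client_pid, supervisor_pid) do
    spawn(fn ->
      ref = Process.monitor(client_pid)

      receive do
        {:DOWN, ^ref, :process, _pid, _reason} ->
          # The client died; pull down the temporary tree as well.
          Supervisor.stop(supervisor_pid)
      end
    end)
  end
end
```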
But I'm not sure if the more complicated manual monitoring setup (both ways) would be worth it. Would be happy to accept PRs though.
Not sure, I could work on this but before I take a swing at this, I wanted to ask one more thing: is there a reason (other than cleanup) for us to shutdown the browser instance with the on demand option? Would it be an option to further simplify it by just using a dynamic supervisor that holds the browser instance without shutting it down?
The main point of on_demand mode is to take down the browser after use, to avoid zombies. For instance, when you run tests with ExUnit or when you exit an iex shell with Ctrl+C, your application isn't cleanly shut down, and without on_demand the Chrome process becomes orphaned.
Clear, thanks for the explanation!
For the time being I think #235 is a fine solution. When I have the time I'll look into hooking a process monitor in this solution.
With the current on_demand implementation, the caller will receive an {:EXIT, _supervisor_pid, :normal} signal, because the supervision tree is started on demand with Supervisor.start_link and stopped right after the request with Supervisor.stop. Is this something that's intended?
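The described sequence can be reproduced without ChromicPDF at all (a minimal sketch of the same mechanism):

```elixir
# Start a supervisor linked to the caller and stop it right away, as
# on_demand mode does around a request; with trap_exit the caller then
# finds the supervisor's :normal exit in its mailbox.
Process.flag(:trap_exit, true)

{:ok, sup} = Supervisor.start_link([], strategy: :one_for_one)
:ok = Supervisor.stop(sup)

receive do
  {:EXIT, ^sup, :normal} -> IO.puts("caller received {:EXIT, sup, :normal}")
after
  1_000 -> IO.puts("no EXIT message")
end
```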
Background context:
I ran into issues with Commanded after this PR: https://github.com/commanded/commanded/pull/512. An event handler was calling ChromicPDF and was being shut down because of the above.