Closed probably-not closed 4 months ago
This makes start_async unusable for any task that may take more than 5000 milliseconds.
I believe you are probably misunderstanding the shutdown option of Task Supervisors. Taken the following snippet
children = [
{Task.Supervisor, name: MyApp.TaskSupervisor}
]
Supervisor.start_link(children, strategy: :one_for_one)
Task.Supervisor.start_child(MyApp.TaskSupervisor, fn ->
Process.sleep(10000)
IO.puts("Slept 10s, all good!")
end)
Process.sleep(:infinity)
you can see that if run (e.g. save it as task.exs and run elixir task.exs
) the task completes successfully. The shutdown option of the Task.Supervisor controls how it handles termination of running tasks (-> using Task.Supervisor.terminate_child/2
). By default, it sends an exit signal and gives the task 5 seconds to cleanup before killing it, but this can be controlled using the :shutdown
option.
Here's an example:
children = [
{Task.Supervisor, name: MyApp.TaskSupervisor}
]
Supervisor.start_link(children, strategy: :one_for_one)
{:ok, pid} = Task.Supervisor.start_child(MyApp.TaskSupervisor, fn ->
Process.flag(:trap_exit, true)
receive do
{:EXIT, _from, _reason} ->
IO.puts("Exit received, cleaning up!")
Process.sleep(2000)
IO.puts("bye!")
end
end)
IO.puts("task started!")
Process.sleep(1000)
Task.Supervisor.terminate_child(MyApp.TaskSupervisor, pid)
Process.sleep(:infinity)
My mistake, you are absolutely right, I did misunderstand the shutdown value and definitely misrepresented the underlying issue.
However, the underlying issue of not being able to pass in the standard start_child
options still holds true - we can't affect the actual Task start configurations which includes both the shutdown and the restart values, both of which can have big implications of how the task runs and what tasks I can run using the LiveView.Async module.
For example, the shutdown value allows the user control over gracefully shutting down the task by ensuring that we have enough time to run whatever cleanup is necessary before the task is exited as you showed. I do think that is the same level of issue - if I have a task that I know can take more than 5 seconds, I need to ensure that the task can complete gracefully.
In the same vein, making sure that I can control the restart value is important based on the task type, whether I want it transient or temporary.
Not being able to pass in the standard options for a task supervisor limits (personally I would say severely but that's subjective because it depends on the tasks you want to run) the types of tasks that I can do in start_async
.
I can edit the original issue text to reflect the correct underlying issue if you'd like.
The restart option wont work as desired. For example, Task.Supervisor.async does not expose it either, because there is context exchanged between parent and child, which is not replayed on restart.
I am also skeptical of the shutdown option being useful. The child and parent are linked. During shutdown, the LiveView will be terminated before, causing the child to crash before it exits. You could trap exits, but that now means you can leak tasks once their parent LiveView terminates.
so overall, I donβt think the features will work as you expect or be free of trade-offs. I think it is fair to say that, if you want them, you can use a regular task and process communication, instead of backing those trade-offs inside the async mechanism.
I think it is fair to say that, if you want them, you can use a regular task and process communication, instead of backing those trade-offs inside the async mechanism.
That's fair, it's what I've been doing this whole time π I've been delving into LiveView and trying to use built in LV functionalities but I'm not afraid to get my hands dirty π
I am also skeptical of the shutdown option being useful. The child and parent are linked. During shutdown, the LiveView will be terminated before, causing the child to crash before it exits. You could trap exits, but that now means you can leak tasks once their parent LiveView terminates.
Doesn't the LiveView terminating before depend on the application supervision tree ordering? I've always been under the impression that if I place my Task.Supervisor at the end of the tree, it would terminate first, before its callers, so it would be able to handle graceful shutdowns... maybe I've misunderstood this as well?
In any case, I'm okay with implementing myself, all good!
I would say however that the task types that are recommended for the start_async
and the LiveView.Async functionalities should be documented, so that it's clear what to use it for and where you should reach for the base OTP process facilities.
As a very naive example (just to illustrate a task not suitable for this), based on this, I shouldn't use start_async
for a task that downloads a file, assigns something based on that, and deletes the file after reading it, because if I don't have a graceful shutdown mechanism then I could end up with files being downloaded and not being deleted, and filling up my disk space.
That's fair, it's what I've been doing this whole time π I've been delving into LiveView and trying to use built in LV functionalities but I'm not afraid to get my hands dirty π
Fantastic!
Doesn't the LiveView terminating before depend on the application supervision tree ordering?
You got it right but if you start the supervisor after the endpoint, there is a race condition where you can start receiving requests and the task supervisor is not yet running, and then you will get crashes.
I shouldn't use start_async for a task that downloads a file, assigns something based on that, and deletes the file after reading it, because if I don't have a graceful shutdown mechanism then I could end up with files being downloaded and not being deleted, and filling up my disk space.
In such cases, I would prefer to have another process that monitors the process that downloads the file and removes the file once the process concluded. This way you have a process only for cleaning up which should be more impervious to bugs. :) This also means you can continue using the async functionality and deal only with clean up separately!
You got it right but if you start the supervisor after the endpoint, there is a race condition where you can start receiving requests and the task supervisor is not yet running, and then you will get crashes.
True, although this is pretty easily mitigated by a health check. In previous applications that I've written, I've ensured that my healthiness endpoint would only return healthy after all of what I wanted to start was started - same solution, just putting a small process that switches a boolean flag only at the end of the tree, so until it starts the health endpoint returns an unhealthy status.
In such cases, I would prefer to have another process that monitors the process that downloads the file and removes the file once the process concluded.
Definitely agree!
Like I said though, this is a very naive example just to showcase that there are certain types of tasks that should simply not be done within the LiveView.Async functionalities, and I think the types of tasks should be reflected in the documentation to make it clear to newcomers who may be more junior in regards to architecting an application correctly. Many of the newcomers to LiveView don't know what Process.monitor
does, or the difference between linking and monitoring, or even what a Supervisor or a GenServer even are.
LiveView is getting a lot of press recently and in various communities I've been seeing people who are brand new to Elixir diving into LiveView without learning the basics of OTP and what a process is and just hoping for LiveView to work magically. Including information about what LiveView.Async is actually doing and what kind of tasks it should be used for should be included in the documentation to ensure that people don't get in to these types of issues.
More docs are always welcome!
It seems at this point we diverged considerably from the original issue, so what do you propose ?
Right, getting back on track to the issue here -
I propose that we should allow passing specific Task.Supervisor.start_child
opts to be passed in to the LiveView.Async functions. Based on our exchange we've established that there may be a use-case for the :shutdown
value (but not the :restart
value), since there are ways to ensure the shutdown order, and that does give the ability to do graceful shutdown handling for async tasks from a LiveView.
However, I completely get if the direction here is to push people to not do this and to prefer doing only specific types of tasks via LiveView.Async.
With that being said, if the decision is to not allow passing opts - is there any reason to allow passing a :supervisor
at all? The defaults of the Task.Supervisor essentially give the same behavior as no Task.Supervisor, so I'm not sure I understand the decision to allow passing a Task.Supervisor if we aren't going to allow passing opts. My feeling is that this simply confuses/complicates the usage... but to be fair, that's a subjective opinion.
And of course - regardless of any of the decisions - I do propose that the docs should be a lot more explicit here about what kind of tasks should be run with LiveView.Async's facilities, versus tasks that should prefer using standard OTP facilities and standard message passing to communicate with the LiveView. If we add support for passing the :shutdown
option in, then the docs should be explicit about all that you've mentioned (the linking and shutdown order and possible race condition and mitigation tactics, etc).
When it comes to options, I am a :-1: on the proposal. Yes, you can use shutdown if:
And, even after all of these changes, you can still "leak processes" once a LiveView terminates and there is still no guarantee of cleanup after crashes. There are too many pitfalls to encourage the behaviour you described. Overall, the async/await convention in Elixir is about short-lived tasks that send a response back, and LiveView does not change that. We should revisit if there are practical use cases.
is there any reason to allow passing a :supervisor at all?
Using different supervisors can be helpful for monitoring/instrumentation. For example, you can measure the number of children to know which kind of workloads are running. Sometimes you also need to drain supervisors in tests to avoid races and breaking the apart gives you more control.
With all of this said, feel free to send a pull request that sketches improvements to the docs, that's probably the best way to move this forward, as I believe you already have suggestions in mind. I will be glad to review and suggest improvements.
Makes sense!
I'm going to close this issue out for now, I'll submit a PR with documentation improvements based on what we spoke about soon.
Thanks!
Environment
Actual behavior
While
Phoenix.LiveView.start_async/4
allows passing a:supervisor
option to select aTask.Supervisor
to run the task under, it does not allow passing any of theTask.Supervisor.start_child/3
options to be passed to the Task.Supervisor, meaning that we cannot configure things like the timeout of the task (which defaults to 5000 milliseconds). This makesstart_async
unusable for any task that may take more than 5000 milliseconds.The
Task.Supervisor
docs do say that inTask.Supervisor.start_link
we can pass in:restart
and:shutdown
options - however, doing this is deprecated in favor of passing the options inTask.Supervisor.start_child
.Expected behavior
Using
Phoenix.LiveView.start_async/4
should allow passing options to the supervisor that we want to run against, to allow things like controlling the restart strategy and the shutdown strategy.