rawhat / mist

gleam HTTP server. because it glistens on a web
Apache License 2.0

mist_clock ETS crashes upon restart #45

Closed jhillyerd closed 5 months ago

jhillyerd commented 5 months ago

I'm working on a code example for using my new actor registry with mist and supervisors. But I'm finding that mist cannot be restarted by the supervisor, failing with:

...
{<<"Failed to accept/start handler">>,accept_error}
=ERROR REPORT==== 10-Apr-2024::09:45:02.310156 ===
{<<"Failed to accept/start handler">>,accept_error}
=ERROR REPORT==== 10-Apr-2024::09:45:02.300461 ===
Error in process <0.115.0> with exit value:
{badarg,[{ets,new,
              [mist_clock,[set,protected,named_table,{read_concurrency,true}]],
              [{error_info,#{cause => already_exists,
                             module => erl_stdlib_errors}}]},
         {mist@internal@clock,'-start/0-fun-1-',0,
                              [{file,"/home/james/devel/singularity/examples/supervised_mist/build/dev/erlang/mist/_gleam_artefacts/mist@internal@clock.erl"},
                               {line,38}]},
         {gleam@otp@actor,initialise_actor,2,
                          [{file,"/home/james/devel/singularity/examples/supervised_mist/build/dev/erlang/gleam_otp/_gleam_artefacts/gleam@otp@actor.erl"},
                           {line,182}]}]}

=ERROR REPORT==== 10-Apr-2024::09:45:02.310113 ===
{<<"Failed to accept/start handler">>,accept_error}
=ERROR REPORT==== 10-Apr-2024::09:45:02.310027 ===
{<<"Failed to accept/start handler">>,accept_error}
...

The WIP example is currently in the mist_sup branch of my repo, permalink:

https://github.com/jhillyerd/singularity/blob/mist_sup/examples/supervised_mist/src/supervised_mist.gleam#L55

The main branch version does not try to restart mist and is successful:

https://github.com/jhillyerd/singularity/tree/f5faf8a016152ef341fcd77cb9339896529678ca/examples/supervised_mist

rawhat commented 5 months ago

I'm gonna have to think about this some more, but I don't really have a good solution at all.

The ways to transfer ownership of an ETS table are give_away and the heir option. I could try to set up a process that manages the table and gives it away to the clock.
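For readers unfamiliar with these two mechanisms, here is a minimal Erlang sketch of how they can be combined. It is not mist's implementation: the module name `clock_table_demo`, the table name `demo_clock`, and the `handed_over` / `returned` tags are all made up for illustration. A long-lived manager creates the table with itself as heir, hands it to a worker via `ets:give_away/3`, and gets it back when the worker dies.

```erlang
-module(clock_table_demo).
-export([run/0]).

%% Hypothetical demo, not mist code. The calling process plays the
%% "table manager"; the spawned process plays the clock.
run() ->
    Manager = self(),
    %% The manager creates the table and registers itself as heir, so the
    %% table comes back to it if a later owner terminates.
    Table = ets:new(demo_clock, [set, protected, named_table,
                                 {read_concurrency, true},
                                 {heir, Manager, returned}]),
    Worker = spawn(fun worker/0),
    %% Transfer ownership; the worker receives an 'ETS-TRANSFER' message.
    true = ets:give_away(Table, Worker, handed_over),
    Worker ! crash,
    %% When the worker exits, the heir inherits the table instead of the
    %% table being deleted.
    receive
        {'ETS-TRANSFER', _Tab, _From, returned} ->
            Result = ets:lookup(demo_clock, now),
            %% Clean up so the demo can be run again from the same process.
            true = ets:delete(demo_clock),
            Result
    after 1000 ->
        timeout
    end.

worker() ->
    receive
        {'ETS-TRANSFER', _Tab, _From, handed_over} ->
            %% The worker now owns the protected table, so it may write.
            true = ets:insert(demo_clock, {now, erlang:system_time(second)}),
            receive crash -> exit(crashed) end
    end.
```

The data written by the worker survives its crash, so the manager could hand the same table to a replacement worker. The catch raised in the rest of this comment still applies: if the manager itself is restarted, the table goes with it.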

However, I'd like to manage all this stuff in my own supervisor, so users don't have to actually set up their own supervisors.

But if I do that, the whole tree will be killed / restarted. Which would include the table manager.

I don't see how bandit avoids this issue, if it does.

rawhat commented 5 months ago

Cool, this should be good now. I had to get a fix in for an entry in the gleam.toml file outputting incorrect Erlang.

I might do a release with a local gleam before the next release, if it's feeling like I'm hung up on that.

Either way this should be resolved, I think 😄 thank you!

nerdyworm commented 3 months ago

@rawhat - I think I found the root cause of this issue: https://github.com/gleam-lang/otp/pull/65

Basically, any process that was started by a gleam supervisor was never exited when the tree restarted, which leads to the dangling table issue described here.
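The failure mode can be reproduced in plain Erlang. This is a sketch under assumptions, not code from mist or gleam_otp: the module name `dangling_table_demo` and the table name `demo_clock` are hypothetical, and the table options are copied from the stack trace at the top of this issue. As long as the old owner is still alive, a second `ets:new/2` with the same name fails with `badarg`.

```erlang
-module(dangling_table_demo).
-export([run/0]).

%% Hypothetical demo, not mist code.
run() ->
    Parent = self(),
    Opts = [set, protected, named_table, {read_concurrency, true}],
    %% The "old" clock process creates the named table and keeps running,
    %% like a child that was never exited when its tree restarted.
    Old = spawn(fun() ->
        demo_clock = ets:new(demo_clock, Opts),
        Parent ! ready,
        receive stop -> ok end
    end),
    receive ready -> ok end,
    %% The "restarted" clock tries to create the same named table while the
    %% old owner is still alive, which raises badarg (already_exists).
    Result = try ets:new(demo_clock, Opts) of
                 _ -> created
             catch
                 error:badarg -> already_exists
             end,
    %% Let the old owner exit; its table is deleted along with it.
    Old ! stop,
    Result.
```

The sketch only shows the collision on the table name. Whether a given supervisor actually stops its children on restart is a separate question, which is what the linked PR addresses.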

I ran into the same thing the other day and went down the rabbit hole.

The current solution, starting the clock under the app supervisor, will work perfectly, so there's no need to change it.