Open elbrujohalcon opened 7 years ago
Does canillita start a lot of workers under some supervisor? That's my first guess, because one of the plugins timed out while querying the tracer process for the children of one of the supervisors.
Or is that machine under heavy load, so that it doesn't respond within the default of 5000 ms? From the crash it doesn't look insanely busy. Hmm.
Neither of those.
(canillita@127.0.0.1)4> [begin timer:sleep(5000), length(processes()) end || _ <- lists:seq(1, 10)].
[132,132,132,132,132,132,132,132,132,132]
(no processes created for about a minute)
And the machine is my computer… super quiet and peaceful, I promise :)
You can certainly try it on your computers as well… canillita is open-source
Well, I see there are missing tests! I might volunteer later ;)
It looks like supervisor:which_children is a gen_server:call with the timeout set to infinity under the hood. The only question is: why doesn't it return?
I suspect this line might be the problem. Canillita doesn't start a top-level supervisor; it returns its own PID from the start/2 application callback, so the call to application_master:get_child/1 returns the application master itself. The application master expects only certain messages (https://github.com/erlang/otp/blob/1526eaead833b3bdcd3555a12e2af62c359e7868/lib/kernel/src/application_master.erl#L347) and discards any message it doesn't understand. Since the which_children call waits forever for a reply that never comes, the timeout fires on the side of the plugin's gen_server:call to the tracer process.
In this case I'd recommend adding a dumb top-level supervisor to canillita, if that wouldn't interfere with its design.
You can build erlangpl without the failing plugin, though. Clone the repo and follow these steps:
make ui
make rebar
make
rm -rf deps/epl_st
./bootstrap
This should build erlangpl without the plugin that causes the crash. Let me know if you encounter any more problems 🙂
We're working on a plugin system which will make building erlangpl with only selected plugins a breeze, so such cases will be much easier to troubleshoot.
Well, @arkgil … almost any app built with sumo_rest (like Canillita) will not have a supervision tree. They just don't need one.
I understand that returning self() might be incorrect. What would be the idiomatic OTP-ish way to implement an app that doesn't require a supervision tree, at all?
I wouldn't call myself an authority when it comes to OTP, but in such cases I always start a dumb dangling supervisor. Of course we could check whether the returned PID is the same as the application master's PID, but even the Erlang manual mentions that start/2 should return the top supervisor's PID. Also, all standard OTP applications follow this behaviour.
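For illustration, a "dumb" top-level supervisor with no children could look roughly like the sketch below. This is not Canillita's code: the module name canillita_app_sketch and the registered name canillita_sup_sketch are made up for the example, and putting both behaviours in one module is just a way to keep it short.

```erlang
%% Hypothetical sketch, not taken from Canillita.
-module(canillita_app_sketch).
-behaviour(application).
-behaviour(supervisor).

%% application callbacks
-export([start/2, stop/1]).
%% supervisor callback
-export([init/1]).

%% Return the PID of a real supervisor instead of {ok, self()}.
start(_StartType, _StartArgs) ->
    supervisor:start_link({local, canillita_sup_sketch}, ?MODULE, []).

stop(_State) ->
    ok.

%% A supervisor with an empty child list: nothing to supervise,
%% but it answers which_children and other supervisor calls.
init([]) ->
    SupFlags = #{strategy => one_for_one, intensity => 1, period => 5},
    {ok, {SupFlags, []}}.
```

With something like this in place, supervisor:which_children/1 on the returned PID should answer with an empty list instead of blocking.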
I'm trying to run observer with canillita to see how it handles such cases.
Good points, @arkgil. I'm looking for the right way to solve this issue with my colleagues. I agree that returning {ok, self()} is not the way to go.
Setting up a dummy supervisor isn't the most OTPish way either. In the end, why would we need one if there is nothing to supervise?
Would it help in any way to be a bit more robust? I've seen people do the {ok,self()} thing more than once.
We might check that the pid returned by application_master:get_child/1 is not the same as the application master's pid. But that's only one of the cases. As a placeholder, somebody could start a process which loops forever and discards all messages, which would result in the same crash as reported in this issue.
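One way to cover both cases might be to bound the query itself rather than inspect the pid. The sketch below is only an idea, not erlangpl's code: the module safe_children and its function are hypothetical, and it relies only on supervisor:which_children/1 run in a throwaway process that gets killed if no answer arrives in time.

```erlang
%% Hypothetical helper, not part of erlangpl.
-module(safe_children).
-export([which_children/2]).

%% Ask Pid for its children, giving up after Timeout milliseconds.
which_children(Pid, Timeout) ->
    Caller = self(),
    Ref = make_ref(),
    %% Run the potentially blocking call in a monitored helper process.
    {Worker, MRef} = spawn_monitor(fun() ->
        Caller ! {Ref, supervisor:which_children(Pid)}
    end),
    receive
        {Ref, Children} ->
            erlang:demonitor(MRef, [flush]),
            {ok, Children};
        {'DOWN', MRef, process, Worker, Reason} ->
            %% The call itself crashed, e.g. Pid is dead.
            {error, Reason}
    after Timeout ->
        %% No reply: Pid is probably not a supervisor. Clean up the helper.
        erlang:demonitor(MRef, [flush]),
        exit(Worker, kill),
        %% Drop a reply that may have raced with the timeout.
        receive {Ref, _} -> ok after 0 -> ok end,
        {error, timeout}
    end.
```

A process that silently discards messages would then yield {error, timeout} instead of hanging the caller; whether the extra process per query is acceptable is a separate question.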
I'd be very happy to fix this, so if you have any idea, please let us know here 🙂
Sure, any suggestions?
On the other hand, I'm glad we discovered this issue in Canillita. It manifests in an obscure way, but indicates a lack of OTP compliance.
When I try to analyze inaka/canillita, I get an error. These are the steps I follow:
…then, in another shell…
…where I get the following output…
but once I open my browser at localhost:8000, I get the following error report in that console: