whitfin / local-cluster

Easy local cluster creation for Elixir to aid in unit testing
MIT License

Can't load libcluster configuration #13

Closed elvanja closed 4 years ago

elvanja commented 4 years ago

Hi, I'm using https://github.com/bitwalker/libcluster to set up, well, the cluster 😄 But for some reason, nodes created from a test via LocalCluster.start_nodes("my_cluster", 2) don't pick up the libcluster configuration.

In config/config.exs I have:

config :my_app, MyAppWeb.Endpoint,
  ...

config :libcluster,
  topologies: [
    local: [
      strategy: Cluster.Strategy.Gossip,
      connect: {:net_kernel, :connect_node, []},
      disconnect: {:erlang, :disconnect_node, []},
      list_nodes: {:erlang, :nodes, [:connected]}
    ]
  ]

And in application.ex I use topologies = Application.fetch_env!(:libcluster, :topologies) to fetch that configuration and supply it to libcluster. However, the :libcluster configuration is empty on nodes created via LocalCluster. At the same time, it is properly picked up when the project is started in dev/prod mode, and the manager node started with the tests also picks it up correctly.
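For readers following along, the wiring described above might look roughly like the sketch below. It assumes libcluster's Cluster.Supervisor child spec; the module and supervisor names (MyApp.Application, MyApp.ClusterSupervisor, MyApp.Supervisor) are placeholders, not taken from the actual project.

```elixir
defmodule MyApp.Application do
  use Application

  @impl true
  def start(_type, _args) do
    # fetch_env!/2 raises if :topologies is not set under :libcluster
    # on the node where this application starts.
    topologies = Application.fetch_env!(:libcluster, :topologies)

    children = [
      # Hand the topologies to libcluster; the :name value is a placeholder.
      {Cluster.Supervisor, [topologies, [name: MyApp.ClusterSupervisor]]}
    ]

    Supervisor.start_link(children, strategy: :one_for_one, name: MyApp.Supervisor)
  end
end
```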

In test helper I have:

:ok = LocalCluster.start()
Application.ensure_all_started(:my_app)
ExUnit.start()

Note that if I change the configuration to config :my_app instead of config :libcluster, it all works correctly (the configuration is picked up). So there is a workaround, but I'd still like to be able to set the config under its proper key. Funny thing is, config :logger, for example, is picked up correctly.

elvanja commented 4 years ago

If it helps, PR #5 solves the problem. With rpc.(Mix.CLI, :main, []) the configuration is correctly picked up by new nodes.

That addition from the PR solves the problem. But if the entire PR is accepted, output such as IO.inspect and Logger.error from the nodes no longer works (nothing shows up in the console). I needed to keep this part as well:

    for { app_name, _, _ } <- Application.loaded_applications() do
      for { key, val } <- Application.get_all_env(app_name) do
        rpc.(Application, :put_env, [ app_name, key, val ])
      end
      rpc.(Application, :ensure_all_started, [ app_name ])
    end

The logger related setup can be left out, as proposed by that PR.

elvanja commented 4 years ago

Update: the rpc.(Mix.CLI, :main, []) trick didn't work after all. In the end I solved the problem by putting the needed configuration under the main app's configuration key (e.g. config :my_app, topologies: ...). Luckily, libcluster doesn't require its configuration to live under a fixed key, which may not hold for other libs.
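A minimal sketch of that workaround, reusing the topology from the original report. The :topologies key name under :my_app follows the example in the comment above; everything else is unchanged from the first post.

```elixir
# config/config.exs: same topology as before, but under the app's own key
config :my_app,
  topologies: [
    local: [
      strategy: Cluster.Strategy.Gossip,
      connect: {:net_kernel, :connect_node, []},
      disconnect: {:erlang, :disconnect_node, []},
      list_nodes: {:erlang, :nodes, [:connected]}
    ]
  ]

# application.ex: read from :my_app instead of :libcluster
# topologies = Application.fetch_env!(:my_app, :topologies)
```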

keathley commented 4 years ago

The fact that libcluster config isn't getting set up on the nodes would seem to imply that it's not loaded by the time Application.loaded_applications() is called. Which is very interesting. I wonder if that has something to do with test setup or some internal thing in libcluster. It could also be a race condition I suppose but that seems slightly less likely.
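One way to check that hypothesis is to look at whether the application is loaded, and what environment it has, at the point where the nodes are created. The helper below is a hypothetical sketch (the module name LoadedConfigCheck is made up for illustration); it only uses Application.loaded_applications/0 and Application.get_all_env/1.

```elixir
defmodule LoadedConfigCheck do
  # Returns {loaded?, env} for the given application on the current node.
  def inspect_app(app) do
    loaded? =
      Enum.any?(Application.loaded_applications(), fn {name, _desc, _vsn} ->
        name == app
      end)

    {loaded?, Application.get_all_env(app)}
  end
end
```

On the manager node, LoadedConfigCheck.inspect_app(:libcluster) could be called just before LocalCluster.start_nodes/2. For a child node, the same information should be reachable without loading the helper there, e.g. :rpc.call(node, Application, :get_all_env, [:libcluster]).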

elvanja commented 4 years ago

Small update: I've published https://github.com/elvanja/abbr, a small test/demo project that exhibits the mentioned problems.

pmenhart commented 4 years ago

If the problem is caused by a race condition or by the ordering of configs vs. loads, PR #15 has a side effect that could help you: all configuration is merged before loaded applications are started in child nodes. Can you try it, please? (No change to your tests is needed; you can still use LocalCluster.start_nodes/2.)
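To illustrate the ordering described here, the sketch below splits the loop quoted earlier in the thread into two passes: copy every application's environment first, then start the applications. This is not the code from PR #15, only an assumption about what "configure first, start second" could look like; the module name TwoPassSetup is hypothetical, and rpc stands for the same three-argument function used in the earlier snippet.

```elixir
defmodule TwoPassSetup do
  # rpc is a function (module, function, args) that runs the call on a child node.
  def configure_then_start(rpc) do
    apps = for {app, _desc, _vsn} <- Application.loaded_applications(), do: app

    # Pass 1: copy the environment of every loaded application.
    for app <- apps, {key, val} <- Application.get_all_env(app) do
      rpc.(Application, :put_env, [app, key, val])
    end

    # Pass 2: only now start the applications on the child node.
    for app <- apps do
      rpc.(Application, :ensure_all_started, [app])
    end

    :ok
  end
end
```

With this ordering, an application that reads another application's environment during startup sees the copied values regardless of the order in which the applications are listed.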

elvanja commented 4 years ago

Hah, looks like it works now even with the main release!

That got me wondering why it works correctly now, and I traced it back to https://github.com/elvanja/abbr/commit/776e147945e084b5e0d9f2dcac2e4d02cd8a8a69. That commit works with config :libcluster too; the configuration is picked up correctly in test-generated nodes.

So it looks like all that trouble was caused by a rogue Ecto repo application. It wasn't really used, but having its setup and configuration lying around, and probably being started automatically somewhere in the chain, somehow caused trouble down the road, resulting in problems with copying configuration to test nodes.

Removing all traces of repo usage, like in that commit, allowed usage of config :libcluster. So indeed https://github.com/whitfin/local-cluster/issues/13#issuecomment-579769743 was on the right track 😄

Closing this issue since it's no longer relevant. Apologies for all the commotion!