elixir-geolix / geolix

IP information lookup provider
Apache License 2.0

Using remote database source results in time out when app first starts #20

Closed mcade closed 6 years ago

mcade commented 6 years ago

When I configure the database source as a link to a remote file, I can't use the Geolix.lookup/2 function until the database has been loaded from the remote source.

How can I check if the remote geoip database has been loaded before I run any functions like Geolix.lookup/2?

This is my current config:

config :geolix,
  databases: [
    %{
      id: :city,
      adapter: Geolix.Adapter.MMDB2,
      source: "https://example.com/geoip/GeoLite2-City.mmdb"
    },
    %{
      id: :country,
      adapter: Geolix.Adapter.MMDB2,
      source: "https://example.com/geoip/GeoLite2-Country.mmdb"
    }
  ]

And the timeout error message:

[error] GenServer #PID<0.603.0> terminating
** (stop) exited in: GenServer.call(Geolix.Database.Loader, :loaded, 5000)
    ** (EXIT) time out
    (elixir) lib/gen_server.ex:834: GenServer.call/3
    (geolix) lib/geolix/server/worker.ex:26: Geolix.Server.Worker.lookup_all/2
    (geolix) lib/geolix/server/worker.ex:20: Geolix.Server.Worker.handle_call/3
    (stdlib) gen_server.erl:636: :gen_server.try_handle_call/4
    (stdlib) gen_server.erl:665: :gen_server.handle_msg/6
    (stdlib) proc_lib.erl:247: :proc_lib.init_p_do_apply/3
Last message (from #PID<0.604.0>): {:lookup, '127, 0, 0, 1', [as: :struct, locale: :en, where: nil]}

The readme mentions: "Note: Please be aware of the drawbacks of remote files! You should take into account the startup times as the file will be requested during GenServer.init/1. Unstable or slow networks could result in nasty timeouts." When I start my app, however, I don't experience any extra startup time. I'm using Elixir 1.6 and Phoenix 1.3.

mneudert commented 6 years ago

The actual wording was written "back in the days" when the databases were loaded completely synchronously. This has since been changed to asynchronous loading that is triggered directly after the application has started.

This change results in the behaviour you are seeing: fast startup and then those timeouts until the actual load was completed. I will try to offload the actual loading to a different process so this bottleneck goes away.
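
To illustrate the behaviour described above, here is a minimal sketch of a process that returns from init/1 immediately and then blocks while "loading". It is not Geolix code: the module SlowLoaderDemo, its status/2 helper and the sleep that stands in for the download are all hypothetical. Only the :loaded call name is borrowed from the error message in the report.

```elixir
defmodule SlowLoaderDemo do
  # Hypothetical stand-in for a loader process; not part of Geolix.
  use GenServer

  def start_link(load_ms) do
    GenServer.start_link(__MODULE__, load_ms)
  end

  # init/1 returns right away, so application startup stays fast ...
  @impl true
  def init(load_ms) do
    send(self(), {:load, load_ms})
    {:ok, :loading}
  end

  # ... but while this callback runs the process cannot answer any calls.
  # Process.sleep/1 simulates the remote download.
  @impl true
  def handle_info({:load, load_ms}, _state) do
    Process.sleep(load_ms)
    {:noreply, :loaded}
  end

  @impl true
  def handle_call(:loaded, _from, state) do
    {:reply, state, state}
  end

  # Wraps the call so a timeout becomes {:error, :timeout} instead of an exit.
  def status(pid, timeout) do
    {:ok, GenServer.call(pid, :loaded, timeout)}
  catch
    :exit, {:timeout, _} -> {:error, :timeout}
  end
end
```

Calling SlowLoaderDemo.status(pid, 50) right after start_link(300) should hit the timeout, while a later call with a larger timeout gets an answer once the simulated load has finished.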

Until then I can only suggest a workaround: either use internal functions (perhaps suitable if you are careful and cannot store files locally):

# returns the database internal metadata
# should be `nil` until the database is completely loaded
# might be a bottleneck as it calls an Agent shared with the actual lookups
Geolix.Adapter.MMDB2.Storage.Metadata.get(:city)

or some download-reload-trigger (untested but might be a quick hack to get it working):

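# config/config.exs
config :geolix, init: {MyInitializer, :download_and_reload}

# somewhere in your application code
defmodule MyInitializer do
  def download_and_reload() do
    # run in the background so application startup is not blocked
    Task.start(fn ->
      # :httpc is part of :inets; https URLs additionally need :ssl
      {:ok, _} = Application.ensure_all_started(:inets)
      {:ok, _} = Application.ensure_all_started(:ssl)

      url = 'https://example.com/GeoLite2-City.mmdb'

      # body_format: :binary returns the body as a binary instead of a charlist
      case :httpc.request(:get, {url, []}, [], body_format: :binary) do
        {:ok, {{_, 200, _}, _, body}} ->
          priv_dir = Application.app_dir(:my_app, "priv")
          db_file = Path.join([priv_dir, "GeoLite2-City.mmdb"])

          :ok = File.write!(db_file, body)
          :ok = Application.put_env(:geolix, :databases, [%{
            id: :city,
            adapter: Geolix.Adapter.MMDB2,
            source: db_file
          }])

          :ok = Geolix.reload_databases()

        # any other HTTP status: report it instead of crashing the task
        {:ok, {{_, status, _}, _, _}} ->
          IO.inspect({:error, {:remote, {:http_status, status}}})

        {:error, err} ->
          IO.inspect({:error, {:remote, err}})
      end
    end)

    :ok
  end
end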

mneudert commented 6 years ago

If you are adventurous you can try something I hacked together by pointing your dependency to a branch:

defp deps do
  [
    # ...
    {:geolix, git: "https://github.com/elixir-geolix/geolix.git", branch: "background-remote-load"}
    # ...
  ]
end

It seemed to work without any timeouts or problems (with a known good configuration). The remote databases did not turn up in the lookup results until the load was completed in the background.

For testing you can check the state of the load process manually:

GenServer.call(Geolix.Database.Loader, {:get_database, :city})

The map field :state should be :delayed while the loading process is not finished and then change to :loaded and give proper results.
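
If you want to wait for the load instead of checking by hand, the manual check could be wrapped in a small polling helper. This is only a sketch: LoadWaiter, wait_until/3 and its parameters are hypothetical names, not part of Geolix, and the helper simply calls whatever check function you pass in.

```elixir
defmodule LoadWaiter do
  # Hypothetical helper, not part of Geolix.
  # Calls `check_fun` until it returns a truthy value or `attempts` run out,
  # sleeping `interval_ms` milliseconds between attempts.
  def wait_until(check_fun, attempts \\ 50, interval_ms \\ 100)

  def wait_until(_check_fun, 0, _interval_ms), do: {:error, :timeout}

  def wait_until(check_fun, attempts, interval_ms) when attempts > 0 do
    case check_fun.() do
      result when result in [nil, false] ->
        Process.sleep(interval_ms)
        wait_until(check_fun, attempts - 1, interval_ms)

      result ->
        {:ok, result}
    end
  end
end
```

With the branch mentioned above, a check function could be something like `fn -> match?(%{state: :loaded}, GenServer.call(Geolix.Database.Loader, {:get_database, :city})) end`, assuming the call returns a map with a :state field as described. That call is internal and may change between versions.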

mneudert commented 6 years ago

You could also try the new ETS-based state handling that has just landed on the master branch.

It seems to work, as it broke the verification tests because the state lookup no longer blocks. Apparently it works well enough that even local files were not loaded fast enough, so the first lookup completed with an empty result...

mneudert commented 6 years ago

I consider this solved after the new version has been released with the asynchronous loading state. If any problems arise please comment here or open another issue.