elixir-mongo / mongodb_ecto

MongoDB adapter for Ecto
Apache License 2.0

:nxdomain, :econnrefused errors #142

Closed justmark closed 7 years ago

justmark commented 7 years ago

I have a Phoenix 1.3 application running (let's call this system A), which is connected to a replica set. All is running great.

I needed to migrate the application to another host (system B). It is a brand new Ubuntu 16.04 LTS build. Upon starting up the application, I get the following Mongo.Error messages:

[error] Mongo.Protocol (#PID<0.606.0>) failed to connect: ** (Mongo.Error) tcp connect: non-existing domain - :nxdomain
[error] Mongo.Protocol (#PID<0.604.0>) failed to connect: ** (Mongo.Error) tcp connect: connection refused - :econnrefused

iptables is in use and is configured properly. I can use the Mongo shell on system B to connect to the replica set.

My dev.exs:

# Configure your database
config :titan, MyApp.Repo,
  adapter: Mongo.Ecto,
  database: "my_db",
  username: "my_username",
  password: "my_pass",
  hostname: "a.b.c.d"

Deps:

  defp deps do
    [
      {:phoenix, "~> 1.3.0"},
      {:phoenix_pubsub, "~> 1.0"},
      {:phoenix_ecto, "~> 3.2"},
      {:mongodb_ecto, git: "https://github.com/ankhers/mongodb_ecto.git", branch: "ecto-2.1"},
      {:phoenix_html, "~> 2.10"},
      {:phoenix_live_reload, "~> 1.0", only: :dev},
      {:gettext, "~> 0.11"},
      {:cowboy, "~> 1.0"},
      {:coherence,  github: "smpallen99/coherence"},
      {:logster, "~> 0.3.0", override: true}
    ]
  end

Anyone got any suggestions as to what might be causing this?

ankhers commented 7 years ago

:nxdomain would be from your DNS server saying that whatever domain you are trying to connect to does not exist.

When you say you can use Mongo to connect to the replica set from System B, what driver are you talking about? The Elixir driver, the command line, some other language?
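
The two errors point at different stages of connecting: :nxdomain is a failed name lookup, while :econnrefused means the lookup worked but nothing accepted the TCP connection at that address and port. A minimal sketch for checking both stages from IEx on the affected host, using only Erlang's :inet and :gen_tcp modules. The module name ConnCheck and the example host are hypothetical and not part of the driver, and the lookup is IPv4 only:

```elixir
defmodule ConnCheck do
  # Hypothetical helper, not part of mongodb or mongodb_ecto.
  # Resolves `host` (IPv4), then attempts a plain TCP connection to `port`.
  def check(host, port, timeout \\ 5_000) do
    with {:ok, ip} <- resolve(host),
         {:ok, socket} <- connect(ip, port, timeout) do
      :gen_tcp.close(socket)
      {:ok, ip}
    end
  end

  # Name lookup only; a failure here corresponds to errors such as :nxdomain.
  defp resolve(host) do
    case :inet.getaddr(String.to_charlist(host), :inet) do
      {:ok, ip} -> {:ok, ip}
      {:error, reason} -> {:error, {:resolve, reason}}
    end
  end

  # TCP connect only; a failure here corresponds to errors such as :econnrefused.
  defp connect(ip, port, timeout) do
    case :gen_tcp.connect(ip, port, [:binary, active: false], timeout) do
      {:ok, socket} -> {:ok, socket}
      {:error, reason} -> {:error, {:connect, reason}}
    end
  end
end

# Placeholder host; substitute the hostname and port from your Repo config.
IO.inspect(ConnCheck.check("no-such-host.invalid", 27017))
```

The tagged tuples ({:resolve, reason} versus {:connect, reason}) show which stage failed, which the raw driver log does not make obvious when several hosts are involved.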

justmark commented 7 years ago

I was testing using Mongo's shell, i.e.:

mongo --host ip_address

to confirm that it was reachable (and it is). Why would I be getting an error trying to connect to a domain when I'm using an IP address? Anyway, I swapped out the IP address for the actual domain name, and the :nxdomain error is gone, but the :econnrefused error remains.

ankhers commented 7 years ago

Unfortunately I am unsure as to why your system was giving you nxdomain.

As for the :econnrefused, you are going to have to take a look at your Mongo server logs to figure out why it is not allowing the connection.

justmark commented 7 years ago

Well, to add a bit more confusion, the Mongo logs show that the connections are being accepted, and then the connections get terminated. No errors.

2017-10-02T22:15:09.267+0000 I ACCESS   [conn41462] Successfully authenticated as principal db_user on db
2017-10-02T22:15:25.273+0000 I NETWORK  [thread1] connection accepted from my_ip:42993 #41463 (98 connections now open)
2017-10-02T22:15:25.335+0000 I NETWORK  [thread1] connection accepted from my_ip:41655 #41464 (99 connections now open)
2017-10-02T22:15:25.343+0000 I NETWORK  [thread1] connection accepted from my_ip:39273 #41465 (100 connections now open)
2017-10-02T22:15:25.401+0000 I ACCESS   [conn41464] Successfully authenticated as principal db_user on db
2017-10-02T22:15:25.402+0000 I -        [conn41464] end connection my_ip:41655 (100 connections now open)
2017-10-02T22:15:25.402+0000 I -        [conn41463] end connection my_ip:42993 (99 connections now open)

ankhers commented 7 years ago

This sounds like one of the members set up in the replica set is not actually running, so when the monitor process attempts to connect to it, it is unable to. I should add better error messages to the mongodb package that will allow you to know which server(s) the driver is unable to connect to.

For the time being, you should be able to add an IO.inspect call inside the Mongo.Topology module in order to look at the state and match up the PID to the server. Or you can use Observer to look at the state of the process.
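
As a rough illustration of matching a PID from the log to a process state without editing the library: the sketch below rebuilds a pid from its printed form and reads its state with :sys.get_state/1. It assumes the target process is OTP-compliant (handles system messages) and lives on the local node. The Agent and its state are stand-ins, not the actual shape of the Mongo.Topology state:

```elixir
# Stand-in for the process named in the log line (hypothetical state shape).
{:ok, agent} = Agent.start_link(fn -> %{host: "localhost:27017"} end)

# A log line prints the PID as text such as "#PID<0.606.0>".
logged = inspect(agent)

# Rebuild a pid from that text; this only works for processes on the local node.
pid =
  logged
  |> String.trim_leading("#PID")
  |> String.to_charlist()
  |> :erlang.list_to_pid()

# :sys.get_state/1 returns the current state of an OTP-compliant process.
state = :sys.get_state(pid)
IO.inspect(state, label: "state of #{logged}")
```

In IEx the same pid can be built with the pid/3 helper, for example pid(0, 606, 0), and :observer.start() opens the Observer GUI mentioned above.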

justmark commented 7 years ago

I've added this into the Topology module as you suggested. Looks as though it is trying to connect to localhost:27017 even though I have servers defined. I'll keep you posted.

justmark commented 7 years ago

OK, got it solved. Adding additional logging to Mongo.Topology showed the problem very quickly. I originally had the IP address in the hostname field, but ran into issues, so I attempted to use the url/mongodb string instead. This somehow got translated to localhost (unsure why). When I switched back to the IP address of the primary, 5 hosts appeared (I only have 3). Two were duplicates of each other, and of those, one was using the internal hostname for the third replica set member, which wasn't defined on this new system. Adding that hostname to the hosts file resolved the issue.

What's odd is that 5 showed up, and of the 3 replicas, only 1 was translated to its internally defined hostname, even though all 3 have these defined in their hosts files.
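
This is consistent with how replica set clients generally behave: members are discovered under the names the servers advertise, so every advertised name has to resolve on the client machine. A small sketch for checking a list of members, assuming they are written as "host:port" strings. The module name MemberCheck and the member names are hypothetical, and the lookup is IPv4 only:

```elixir
defmodule MemberCheck do
  # Hypothetical helper: returns the "host:port" entries whose host part
  # cannot be resolved (IPv4 only) from the machine running this code.
  def unresolvable(members) do
    Enum.reject(members, fn member ->
      [host | _] = String.split(member, ":")
      match?({:ok, _}, :inet.getaddr(String.to_charlist(host), :inet))
    end)
  end
end

# Placeholder names; replace with the members printed by your logging.
members = ["127.0.0.1:27017", "mongo-3.internal.invalid:27017"]
IO.inspect(MemberCheck.unresolvable(members), label: "unresolvable")
```

Any name this returns needs a DNS record or a hosts file entry on the new system before the driver can reach that member.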

nitinjain1105 commented 5 years ago

I have a similar issue where I see Mongo is disconnecting quite often. I don't even have replica sets, I have just one Mongo instance running on my machine at port 27017.

I see the errors below; please help:

"Mongo.Protocol", 32, 40, "#PID<0.547.0>", ") failed to connect: " | "** (Mongo.Error) tcp recv: unknown POSIX error - :closed"

or

"Mongo.Protocol", 32, 40, "#PID<0.606.0>", ") disconnected: " | "** (Mongo.Error) tcp send: unknown POSIX error - :closed"

ankhers commented 5 years ago

@nitinjain1105 That sounds like a completely different issue. Please create a new ticket.