eclipse / mosquitto

Eclipse Mosquitto - An open source MQTT broker
https://mosquitto.org
Other
8.92k stars 2.37k forks source link

Ocasionally when trying to connect the broker gives "Socket error on client <unknown>, disconnecting." #1247

Open sogartar opened 5 years ago

sogartar commented 5 years ago

There seems to be several related problems (#523, #1023, #7, #1197) but I am not sure, so I am posting a new issue.

I have a simple test where

  1. mosquitto broker is started.
  2. client connects.
  3. client disconnects.

Once every few 100 attempts I get from the broker

Socket error on client <unknown>, disconnecting.

I am using v1.6 and the C++ interface in mosquittopp. Below is the test code.

BOOST_AUTO_TEST_CASE(MosquittoSocketErrTest) {
  // start broker
  int port = 14587;
  std::string mosquitto_cmd =
      (boost::format("mosquitto --verbose --port %d") % port).str();
  int mosquitto_pid = popen(mosquitto_cmd.c_str(), nullptr, nullptr, nullptr);
  BOOST_REQUIRE_GE(mosquitto_pid, 0);

  // wait for broker to start
  sleep(1);

  // Kill broker in destructor
  auto finalizer = MakeFinalzer([&mosquitto_pid]() {
    kill(mosquitto_pid, SIGTERM);
    siginfo_t infop;
    int waitid_status = waitid(P_PID, mosquitto_pid, &infop, WEXITED);
    BOOST_REQUIRE_EQUAL(waitid_status, 0);
  });

  BOOST_REQUIRE_EQUAL(mosqpp::strerror(mosqpp::lib_init()),
                      mosqpp::strerror(MOSQ_ERR_SUCCESS));

  MqttClient client;
  client.loop_start();
  int subscriber_connect_status = client.connect("localhost", port);
  BOOST_REQUIRE_EQUAL(mosqpp::strerror(subscriber_connect_status),
                      mosqpp::strerror(MOSQ_ERR_SUCCESS));

  // Fails even with this sleep
  //  sleep(1);

  client.disconnect();
  client.loop_stop();
}

The console output is.

1556542153: mosquitto version 1.6.0 starting
1556542153: Config loaded from /tmp/mosquitto-BmwhCm.cfg.
1556542153: Opening ipv4 listen socket on port 14587.
1556542153: Opening ipv6 listen socket on port 14587.
1556542155: New connection from 127.0.0.1 on port 14587.
2019-04-29T15:49:15.838 default debug [] Client mosq/egmSyyfyxRx8Z5Tk7n sending CONNECT
2019-04-29T15:49:15.839 default debug [] Client mosq/egmSyyfyxRx8Z5Tk7n sending DISCONNECT
1556542156: Socket error on client <unknown>, disconnecting.
1556542156: mosquitto version 1.6.0 terminating

I thought that the problem may be that disconnect is being sent too quickly, but that is not it. The output then is.

1556544100: mosquitto version 1.6.0 starting
1556544100: Config loaded from /tmp/mosquitto-C38s3u.cfg.
1556544100: Opening ipv4 listen socket on port 14587.
1556544100: Opening ipv6 listen socket on port 14587.
1556544101: New connection from 127.0.0.1 on port 14587.
2019-04-29T16:21:41.659 default debug [] Client mosq/cUp7MjwO4KfHzSRARJ sending CONNECT
1556544102: Socket error on client <unknown>, disconnecting.
1556544102: New connection from 127.0.0.1 on port 14587.
2019-04-29T16:21:42.658 default debug [] Client mosq/cUp7MjwO4KfHzSRARJ sending CONNECT
1556544102: New client connected from 127.0.0.1 as mosq/cUp7MjwO4KfHzSRARJ (p2, c1, k60).
1556544102: No will message specified.
1556544102: Sending CONNACK to mosq/cUp7MjwO4KfHzSRARJ (0, 0)
2019-04-29T16:21:42.659 default debug [] Client mosq/cUp7MjwO4KfHzSRARJ received CONNACK (0)
2019-04-29T16:21:42.659 default debug [] Client mosq/cUp7MjwO4KfHzSRARJ sending DISCONNECT
1556544102: Received DISCONNECT from mosq/cUp7MjwO4KfHzSRARJ
1556544102: Client mosq/cUp7MjwO4KfHzSRARJ disconnected.
1556544102: mosquitto version 1.6.0 terminating

As you can see the client attempts the connection again if it fails.


I have another more complex test where publishing and subscribing are involved also. There I get the error more often.

ralight commented 5 years ago

Do you think you could you try and capture wireshark trace of when this happens?

4nte commented 4 years ago

My Mosquitto broker instance has been running in the cluster for 178 days straight and worked fine. Today, suddenly it decided to repeatedly reject any client connections.

1574339562: Socket error on client pipeline-22, disconnecting.
1574339562: Socket error on client 0b605f7a-c010-4647-bf47-7265c0be84ac1574338670367, disconnecting.
1574339562: Socket error on client ESP32_cc2808, disconnecting.
1574339562: Socket error on client scheduler-service-46, disconnecting.
1574339562: Socket error on client ESP32_cc31CC, disconnecting.
1574339562: Socket error on client 417512ea-18a4-4494-819c-d61032c3ddb51574333357250, disconnecting.
1574339562: Socket error on client locator-processor13-21, disconnecting.
1574339562: Socket error on client locator-api3-31, disconnecting.

It started behaving this way relatively after I made an unrelated change in my kubernetes configuration: I exposed some unrelated TCP service through my Nginx Ingress. The only way that unrelated change might have affected the mqtt broker is that it caused the nginx ingress to restart, which caused all mqtt client connections to close. Soon after nginx booted up, clients started re-connecting but this time unsuccessfully with broker throwing Socket error on client X, disconnecting.

Edit: I managed to get the broker not to disconnect a client by supplying unique username/password on the client. This frightens me since credentials should not play any role when authentication is disabled.

stangerm2 commented 4 years ago

Hi, I'm seeing something very similar using mosquittopp 1.6.4 on ARM, localhost connections:

this->loop_start();
this->connect();

while (x5) {this->publish (...);}

1582069468: Sending CONNACK to auth_manager (0, 0) 1582069468: Received PUBLISH from auth_manager (d0, q1, r1, m1, 'device/auth', ... (495 bytes)) 1582069468: Sending PUBACK to auth_manager (m1, rc0) 1582069468: Sending PUBLISH to mosq/zD3DRWvoOKp2e4oeSb (d0, q0, r0, m0, 'device/auth', ... (495 bytes)) 1582069468: Received DISCONNECT from mosq/zD3DRWvoOKp2e4oeSb

The big concern for me is I have on_disconnect set and it does not fire with an error. The first message always fires, the next 4 usually do. When it fail's the last 4 all fail to pub, but the first message always get's pub'd.

I'd say this is more like 1/20 for me.

kotu-pl commented 3 years ago

From time to time, I have the same Socket error on client <SOME_NAME>, disconnecting. issue for some of my devices (anonymous connections, all devices have status IPs)

As all of my devices are exactly the same (hardware and software except devise's name passed to MQTT) I can observe that the issue occurs more often on devices that stand further from my router (but I cannot replicate it when taking one of them with a power bank power supply of the range of the router).

From my perspective, another, noticeable, issue is that after the Socket error on client(...) last will message is not triggered (it would allow me to force restart the device).

Any ideas if it's possible to fetch "last will" on such kind of errors?

ralight commented 3 years ago

@kotu-pl What version of Mosquitto are you using? Are you absolutely sure that a Will message is being set on your client? If you are able to provide a wireshark/tcpdump trace of the broker we may be able to figure out what is happening.

sadamchandra commented 2 months ago

the same issue was observed in my setup also I am using mosquitto-1.5.5 getting below log continuously, Socket error on client disconnecting

thanks, Chandra