eclipse / paho.mqtt.rust

Creating more than 1000 clients causes a segmentation fault #143

Closed gensmusic closed 1 year ago

gensmusic commented 2 years ago

I tried to create more than 1000 clients and got a segmentation fault.

How to reproduce?

Use the following code.

Cargo.toml

[dependencies]
anyhow = "1"
thiserror = "1"
tokio = { version = "1", features = ["full"] }

paho-mqtt = {version = "0.9" }
futures = "0.3"

main.rs

use anyhow::{Context, Result};
use paho_mqtt as mqtt;
use paho_mqtt::{AsyncClient, PersistenceType};

#[tokio::main]
async fn main() -> Result<()> {
    let mqtt_address = "127.0.0.1:1883";
    let mqtt_username = "";
    let mqtt_password = "";

    let mut handles = vec![];
    for i in 0..1100 {
        let client_id = format!("client_id_{}", i);
        let task = tokio::spawn(async move {
            let res = create_client(mqtt_address, &client_id, mqtt_username, mqtt_password).await?;
            Ok::<AsyncClient, anyhow::Error>(res)
        });
        handles.push(task);
    }
    let mut clients = vec![];
    for v in handles {
        match v.await.unwrap() {
            Ok(client) => {
                clients.push(client);
            }
            Err(err) => {
                println!("create client got err: {:?}", err);
            }
        }
    }

    Ok(())
}

async fn create_client(
    mqtt_address: &str,
    client_id: &str,
    username: &str,
    password: &str,
) -> Result<AsyncClient> {
    let opts = mqtt::CreateOptionsBuilder::new()
        .persistence(PersistenceType::None)
        .server_uri(mqtt_address)
        .client_id(client_id)
        .finalize();
    let client = mqtt::AsyncClient::new(opts).context("create mqtt client err")?;

    let opts = mqtt::ConnectOptionsBuilder::new()
        .user_name(username)
        .password(password)
        .finalize();
    let _res = client.connect(opts).await.context("mqtt connect err")?;

    let msg = mqtt::Message::new("/a/b/c", "hello!", 1);

    client.publish(msg).await?;

    Ok(client)
}

fpagliughi commented 2 years ago

Thanks for reporting this. I will take a look and see if I can reproduce it.

fpagliughi commented 2 years ago

BTW... thanks for the full code to test! Very helpful.

At first I couldn't reproduce the error, then I realized that I was getting some connect failures because of my 1024 socket limit on Linux. I doubled that limit and got the segfault.
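
For anyone trying to reproduce this: the 1024 figure above is typically the per-process soft limit on open file descriptors. Below is a minimal sketch for checking it from Rust with only the standard library, assuming a Linux system where `/proc/self/limits` is readable. The helper name `parse_max_open_files` is hypothetical and not part of paho-mqtt.

```rust
use std::fs;

/// Parse the soft and hard "Max open files" limits from the text of
/// `/proc/self/limits`. Returns `None` if the line is missing or a value
/// is not a plain number (for example "unlimited").
fn parse_max_open_files(limits: &str) -> Option<(u64, u64)> {
    let line = limits.lines().find(|l| l.starts_with("Max open files"))?;
    // Skip the label, then read the soft and hard columns.
    let mut fields = line["Max open files".len()..].split_whitespace();
    let soft = fields.next()?.parse().ok()?;
    let hard = fields.next()?.parse().ok()?;
    Some((soft, hard))
}

fn main() {
    // Assumption: Linux only; other systems do not provide this file.
    match fs::read_to_string("/proc/self/limits") {
        Ok(text) => match parse_max_open_files(&text) {
            Some((soft, hard)) => println!("open files: soft={soft}, hard={hard}"),
            None => println!("could not parse the open-files limit"),
        },
        Err(err) => println!("could not read /proc/self/limits: {err}"),
    }
}
```

In most shells the soft limit can be raised for the current session with `ulimit -n 2048` (up to the hard limit) before running the reproduction.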

I ran the code through valgrind but didn't see any memory leaks (which was my first guess).

Then I ran it through a debugger, and it appears to be crashing in the underlying Paho C client library on a call to a socket close function. Odd. I don't see any hard table limit of 1000 clients in the C code, but it could be something like that.

Unfortunately, we can't throw an async/tokio Rust app at the C guys to debug, so I probably need to write a similar test of >1000 clients in pure C and see if I can cause the same segfault and post an issue to that project.

I'm sure the first question will be... "Who needs 1000 separate clients in an app?!?!"

fpagliughi commented 2 years ago

Oh, I just saw this... https://github.com/eclipse/paho.mqtt.c/issues/1033

So I'm assuming that >1000 clients will specifically not be supported by the C client, but certainly it shouldn't segfault.

fpagliughi commented 1 year ago

With the upgrade to Paho C v1.3.12, this no longer crashes. Depending on the system and its pre-set limits, you may not be able to open that many connections, but the segfault is gone.
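
Since the number of connections is still bounded by system limits, an application could do a rough pre-flight check before creating its clients. This is only a sketch: it assumes each client needs one file descriptor for its socket (the library may use more), and the `client_budget` helper and the numbers in `main` are hypothetical.

```rust
/// Estimate how many clients fit under a file-descriptor soft limit.
/// `reserved` covers descriptors the process needs for other things
/// (stdin, stdout, stderr, log files, and so on).
/// Assumption: one descriptor per client connection.
fn client_budget(soft_limit: u64, reserved: u64) -> u64 {
    soft_limit.saturating_sub(reserved)
}

fn main() {
    let wanted: u64 = 1100;
    // Assumed value for illustration; query the real limit on your system.
    let soft_limit: u64 = 1024;
    let budget = client_budget(soft_limit, 16);
    if wanted > budget {
        println!("requested {wanted} clients, but only about {budget} fit under the limit");
    } else {
        println!("{wanted} clients should fit under the limit");
    }
}
```

With these example numbers the check reports that the 1100 clients from the reproduction would not fit, which matches the connect failures seen before the limit was raised.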