awslabs / aws-crt-nodejs

NodeJS bindings for the AWS Common Runtime.
Apache License 2.0
38 stars 24 forks source link

Unsubscribing from MQTT topic quickly may cause SEGV #405

Closed samuelplexcelon closed 1 year ago

samuelplexcelon commented 1 year ago

It seems that subscribing and unsubscribing too quickly causes a SEGV to happen.

I have a server handling MQTT subscriptions based on whether a websocket is open or closed. The front end app is responsible for opening a websocket chain that eventually connects to this server. If a user keeps on opening and closing the sockets, eventually one of the attempts will cause the unsubscribe method MqttConnection to segfault.

The opening and closing of the websockets is not done through a script, I can replicate this by hand in a browser.

(I will provide the log of the debug build later, but if remember correctly I believe it was line 497 of aws-c-mqtt/source/topic_tree.c)

Production log:

Jan 25 20:21:07: Web socket server connection
Jan 25 20:21:07: We connected to the central server!
Jan 25 20:21:07: releasing port: 1234
Jan 25 20:21:07: Forcing port release
Jan 25 20:21:07: Central connection socket once message
Jan 25 20:21:07: [SN: [SN: a5f40260-8ee3-41d5-a275-737f96e4e38f]] CENTRAL SERVER: {"deviceName":"CD-118"}
Jan 25 20:21:07: Subscribing to shadow client
Jan 25 20:21:07: Subscribed to mqtt Not yet unsubscribed
Jan 25 20:21:07: [SN: CD-118] Trying with IP 1.2.3.4 and Port 1234
Jan 25 20:21:07: Subscription message in!
Jan 25 20:21:07: [SN: CD-118] Disconnected from central server. Stopping active script.
Jan 25 20:21:07: Disconnect info:
Jan 25 20:21:07: Code: 4300
Jan 25 20:21:07: Reason: 'Client requested close'
Jan 25 20:21:07: [SN: CD-118] Stopping everything!
Jan 25 20:21:07: [SN: CD-118] Stopping active script
Jan 25 20:21:07: Unable to stop everything: TypeError: Cannot read properties of null (reading 'StopActiveScript')
Jan 25 20:21:07: at /var/app/current/index.js:441:19
Jan 25 20:21:07: at new Promise (<anonymous>)
Jan 25 20:21:07: at stopActiveScript (/var/app/current/index.js:440:9)
Jan 25 20:21:07: at /var/app/current/index.js:458:3
Jan 25 20:21:07: at new Promise (<anonymous>)
Jan 25 20:21:07: at stopEverything (/var/app/current/index.js:457:9)
Jan 25 20:21:07: at WebSocket.<anonymous> (/var/app/current/index.js:282:4)
Jan 25 20:21:07: at WebSocket.emit (node:events:525:35)
Jan 25 20:21:07: at WebSocket.emitClose (/var/app/current/node_modules/ws/lib/websocket.js:258:10)
Jan 25 20:21:07: at Socket.socketOnClose (/var/app/current/node_modules/ws/lib/websocket.js:1260:15)
Jan 25 20:21:07: [SN: CD-118] Failed to stop everything
Jan 25 20:21:07: 1
Jan 25 20:21:07: 2 CD-118
Jan 25 20:21:07: [SN: CD-118] Trying to unsubscribe. Subscribed
Jan 25 20:21:07: /tmp/start-dae05cf3.sh: line 1: 19571 Segmentation fault      node index.js

server.js:

webSocketServer.on('connection', async (centralConnectionSocket) => {
    let isSubscribed = false;

    console.log('Web socket server connection');

    const UUID = uuid();
    let logTag = `[SN: ${UUID}]`;
    let deviceName = '';

/** Then we subscribe, set deviceName, and other things **/

        centralConnectionSocket.on('close', (code, reason = '[empty]') => {
            console.log(logTag, `Disconnected from central server. Stopping active script.\nDisconnect info:\nCode: ${code}\nReason: '${reason}'`);
            // If we disconnect from the central server, end the test or any active scripts/instructions
            stopEverything()
            .catch(() => {
                console.log(logTag, 'Failed to stop everything');
            })
            .finally(async ()=>{
                try {
                    try {
                        console.log(1);
                        const shadowClient = await AwsShadowClient.factory();
                        console.log(2, deviceName);

                        // Unsub to clear the previous disconnect and error handlers
                        console.log(logTag, 'Trying to unsubscribe.', isSubscribed ? 'Subscribed' : 'Not yet subscribed');
/* This is where we fail =>*/  await shadowClient.unsubscribe(deviceName);
                        console.log(logTag, 'MQTT UNSUB');

                        await shadowClient.publish(deviceName, {
                            state: {
                                desired: {
                                    uuid: UUID,
                                }
                            }
                        });

                    }
                    catch (error) {
                        console.log(logTag, 'Failed in MQTT workflow. Forcing port release anyway.\n', error.message);
                    }
                    finally {
                        releasePort().catch(error => {
                            console.log('Failed to release port', error);
                        });
                    }
                }
                finally {
                        // Do some more clean up
                    }
                }
            });
        });
})

aws-shadow-client.js:

/* ... Setup credentials and decoder */

 class AwsShadowClient {
    static #mqttClient = mqttMqttClient();
    static #building = false;
    static #runningClient = null;

    #pubClient = null;
    #subClient = null;

    #QoS = 1;

    static async factory() {
        try {
            // We only want to make these connections once. We cache these and return them if we have a running client already
            if (AwsShadowClient.#runningClient !== null) return AwsShadowClient.#runningClient;

            // This is set to allow the constructor to tell when we're calling it from the factory method
            // Also, if JS was multi-threaded, this would not be safe
            AwsShadowClient.#building = true;

            // We try this 3 times in case of network wonkiness. The number 3 is arbitrary
            for (let tries = 0; tries < 3; tries++) {
                try {
                    const
                        pubClient = AwsShadowClient.#mqttClient.new_connection(pubConfig),
                        subClient = AwsShadowClient.#mqttClient.new_connection(subConfig);

                    await pubClient.connect();
                    await subClient.connect();
                    AwsShadowClient.#runningClient = new AwsShadowClient(pubClient, subClient);

                } catch (er) {
                    console.error('AWS SHADOW CLIENT ERROR:\n', er);
                }
                finally {
                    AwsShadowClient.#building = false;
                    // This return overrides all other returns
                    // The One Return to rule them all
                    return AwsShadowClient.#runningClient;
                }
            }
        }
        catch (error) {
            console.log('Could not run AwsShadowClient factory', error);
            throw error;
        }
    }

    constructor(pubClient, subClient) {
        if (!AwsShadowClient.#building)
            throw new Error('Do not intantiate this class directly. User static method `factory`.');

        this.#pubClient = pubClient;
        this.#subClient = subClient;

    }

    subscribe(serialNumber, callback, qos = this.#QoS) {
        try {
            return this.#subClient.subscribe(
                `$aws/things/${serialNumber}/shadow/update/accepted`,
                qos,
                (topic, message, dup, qos, retain)=> {
                    message = JSON.parse(textDecoder.decode(message));
                    callback(message, topic);
                }
            );
        } catch(error) {
            console.log('Could not set up AwsShadowClient subscribe.', error);
            throw error;
        }
    }

    publish(serialNumber, payload, qos = this.#QoS) {
        try {
            return this.#pubClient.publish(
                `$aws/things/${serialNumber}/shadow/update`,
                payload,
                qos
            );
        } catch(error) {
            console.log('Could not publish AwsShadowClient', error);
            throw error;
        }
    }

    unsubscribe(serialNumber) {
        try {
            return this.#subClient.unsubscribe(`$aws/things/${serialNumber}/shadow/update`);
        }
        catch (error) {
            console.log('Could not unsubscribe AwsShadowClient', error);
            throw error;
        }
    }
};
TwistedTwigleg commented 1 year ago

Thanks for making this issue! I just wanted to quickly write that I am taking a look into this to try and figure out what is going on.

samuelplexcelon commented 1 year ago

I've been running a local test server on a raspi 3, and got this just now:

[SN: CD-118] Failed to stop everything
1
2 CD-118
[SN: CD-118] Trying to unsubscribe. Subscribed
Fatal error condition occurred in /aws-crt-nodejs/crt/aws-c-common/source/allocator.c:210: allocator->mem_release != ((void *)0)
Exiting Application
No call stack information available

I'm not sure if it's the same error I get in the EC2, but I figured it might be related.

TwistedTwigleg commented 1 year ago

So far I have been unable to reproduce this issue, but there's a few things I'm unsure on:


With the allocator issue, that code triggers when the function to free the memory in the allocator is not defined. My hunch is that something is being freed more than once, causing the issue.

If you are testing on the desktop and not the browser, then enabling logging may help with figuring out what is going on. You can enable logging using (reference):

 io.enable_logging(io.LogLevel["TRACE"])
samuelplexcelon commented 1 year ago

I apologize for the radio silence.

I'm establishing credentials with websocket like this:

const { mqtt, iot, auth } = require('aws-crt');

const credentials_provider = auth.AwsCredentialsProvider.newStatic(AWS_ACCESS_KEY_ID_MQTT, AWS_SECRET_ACCESS_KEY_MQTT);

const pubConfig = iot.AwsIotMqttConnectionConfigBuilder.new_with_websockets({
        region: AWS_REGION,
        credentials_provider,
    })
        .with_client_id(`SOME-RANDOM-STRING`)
        .with_endpoint(AWS_IOT_ENDPOINT)
        .with_clean_session(true)
        .build();

I got lucky replicating this first try with a pretty clean log on the debug build, but I'm not sure how helpful it is:


Mar 30 22:27:49 ip-172-31-93-174 web: Web socket server connection
Mar 30 22:27:49 ip-172-31-93-174 web: We connected to the central server!
Mar 30 22:27:49 ip-172-31-93-174 web: Forcing port release
Mar 30 22:27:49 ip-172-31-93-174 web: Central connection socket once message
Mar 30 22:27:49 ip-172-31-93-174 web: [SN: [SN: 36fad161-9e10-468b-8772-b699ce69aa66]] CENTRAL SERVER: {"deviceName":"##-###"}
Mar 30 22:27:49 ip-172-31-93-174 web: Subscribing to shadow client
Mar 30 22:27:49 ip-172-31-93-174 web: [DEBUG] [2023-03-30T22:27:49Z] [00007f77a8a7e700] [task-scheduler] - id=0x7f778402baf0: Scheduling schedule_cross_thread_tasks task for immediate execution
Mar 30 22:27:49 ip-172-31-93-174 web: [DEBUG] [2023-03-30T22:27:49Z] [00007f77a8a7e700] [task-scheduler] - id=0x7f778402baf0: Running schedule_cross_thread_tasks task with <Running> status
Mar 30 22:27:49 ip-172-31-93-174 web: [DEBUG] [2023-03-30T22:27:49Z] [00007f77a8a7e700] [mqtt-topic-tree] - tree=0x5a045f8: Inserting topic filter $aws/things/##-###/shadow/update/accepted into topic tree
Mar 30 22:27:49 ip-172-31-93-174 web: [DEBUG] [2023-03-30T22:27:49Z] [00007f77a8a7e700] [websocket] - id=0x7f778402da70: Enqueuing outgoing frame with opcode=2(binary) length=48 fin=T
Mar 30 22:27:49 ip-172-31-93-174 web: [DEBUG] [2023-03-30T22:27:49Z] [00007f77a8a7e700] [task-scheduler] - id=0x7f778402dae8: Scheduling websocket_move_synced_data_to_thread task for immediate execution
Mar 30 22:27:49 ip-172-31-93-174 web: [DEBUG] [2023-03-30T22:27:49Z] [00007f77a8a7e700] [task-scheduler] - id=0x7f778402dae8: Running websocket_move_synced_data_to_thread task with <Running> status
Mar 30 22:27:49 ip-172-31-93-174 web: [DEBUG] [2023-03-30T22:27:49Z] [00007f77a8a7e700] [task-scheduler] - id=0x7f778402b7f0: Scheduling socket_written_task task for immediate execution
Mar 30 22:27:49 ip-172-31-93-174 web: [DEBUG] [2023-03-30T22:27:49Z] [00007f77a8a7e700] [task-scheduler] - id=0x7f778402b7f0: Running socket_written_task task with <Running> status
Mar 30 22:27:49 ip-172-31-93-174 web: [DEBUG] [2023-03-30T22:27:49Z] [00007f77ad9cd7c0] [mqtt-client] - id=0x5a04380: Starting subscribe 9 on topic $aws/things/##-###/shadow/update/accepted
Mar 30 22:27:49 ip-172-31-93-174 web: stdout:
Mar 30 22:27:49 ip-172-31-93-174 web: stderr:
Mar 30 22:27:49 ip-172-31-93-174 web: [DEBUG] [2023-03-30T22:27:49Z] [00007f77a8a7e700] [mqtt-client] - id=0x5a04380: Subscribe 9 completed with error code 0
Mar 30 22:27:49 ip-172-31-93-174 web: Subscribed to mqtt Not yet unsubscribed
Mar 30 22:27:49 ip-172-31-93-174 web: [SN: ##-###] Trying with IP 1.2.3.4 and Port 1234
Mar 30 22:27:49 ip-172-31-93-174 web: [DEBUG] [2023-03-30T22:27:49Z] [00007f77a8a7e700] [task-scheduler] - id=0x7f7784001bb0: Scheduling schedule_cross_thread_tasks task for immediate execution
Mar 30 22:27:49 ip-172-31-93-174 web: [DEBUG] [2023-03-30T22:27:49Z] [00007f77a8a7e700] [task-scheduler] - id=0x7f7784001bb0: Running schedule_cross_thread_tasks task with <Running> status
Mar 30 22:27:49 ip-172-31-93-174 web: [DEBUG] [2023-03-30T22:27:49Z] [00007f77a8a7e700] [websocket] - id=0x7f7784016e20: Enqueuing outgoing frame with opcode=2(binary) length=188 fin=T
Mar 30 22:27:49 ip-172-31-93-174 web: [DEBUG] [2023-03-30T22:27:49Z] [00007f77a8a7e700] [task-scheduler] - id=0x7f7784016e98: Scheduling websocket_move_synced_data_to_thread task for immediate execution
Mar 30 22:27:49 ip-172-31-93-174 web: [DEBUG] [2023-03-30T22:27:49Z] [00007f77a8a7e700] [task-scheduler] - id=0x7f7784016e98: Running websocket_move_synced_data_to_thread task with <Running> status
Mar 30 22:27:49 ip-172-31-93-174 web: [DEBUG] [2023-03-30T22:27:49Z] [00007f77a8a7e700] [task-scheduler] - id=0x7f7784000f30: Scheduling socket_written_task task for immediate execution
Mar 30 22:27:49 ip-172-31-93-174 web: [DEBUG] [2023-03-30T22:27:49Z] [00007f77a8a7e700] [task-scheduler] - id=0x7f7784000f30: Running socket_written_task task with <Running> status
Mar 30 22:27:49 ip-172-31-93-174 web: [DEBUG] [2023-03-30T22:27:49Z] [00007f77ad9cd7c0] [mqtt-client] - id=0x5a01bb0: Starting publish 9 to topic $aws/things/##-###/shadow/update
Mar 30 22:27:49 ip-172-31-93-174 web: [DEBUG] [2023-03-30T22:27:49Z] [00007f77a8a7e700] [mqtt-client] - id=0x5a01bb0: Publish 9 complete
Mar 30 22:27:49 ip-172-31-93-174 web: Subscription message in!
Mar 30 22:27:49 ip-172-31-93-174 web: [DEBUG] [2023-03-30T22:27:49Z] [00007f77a8a7e700] [websocket] - id=0x7f778402da70: Enqueuing outgoing frame with opcode=2(binary) length=4 fin=T
Mar 30 22:27:49 ip-172-31-93-174 web: [DEBUG] [2023-03-30T22:27:49Z] [00007f77a8a7e700] [task-scheduler] - id=0x7f778402dae8: Scheduling websocket_move_synced_data_to_thread task for immediate execution
Mar 30 22:27:49 ip-172-31-93-174 web: [DEBUG] [2023-03-30T22:27:49Z] [00007f77a8a7e700] [task-scheduler] - id=0x7f778402dae8: Running websocket_move_synced_data_to_thread task with <Running> status
Mar 30 22:27:49 ip-172-31-93-174 web: [DEBUG] [2023-03-30T22:27:49Z] [00007f77a8a7e700] [task-scheduler] - id=0x7f778402b7f0: Scheduling socket_written_task task for immediate execution
Mar 30 22:27:49 ip-172-31-93-174 web: [DEBUG] [2023-03-30T22:27:49Z] [00007f77a8a7e700] [task-scheduler] - id=0x7f778402b7f0: Running socket_written_task task with <Running> status
Mar 30 22:27:49 ip-172-31-93-174 web: [SN: ##-###] Disconnected from central server. Stopping active script.
Mar 30 22:27:49 ip-172-31-93-174 web: Disconnect info:
Mar 30 22:27:49 ip-172-31-93-174 web: Code: 4300
Mar 30 22:27:49 ip-172-31-93-174 web: Reason: 'Client requested close'
Mar 30 22:27:49 ip-172-31-93-174 web: [SN: ##-###] Stopping everything!
Mar 30 22:27:49 ip-172-31-93-174 web: [SN: ##-###] Stopping active script
Mar 30 22:27:49 ip-172-31-93-174 web: Unable to stop everything: TypeError: Cannot read properties of null (reading 'StopActiveScript')
Mar 30 22:27:49 ip-172-31-93-174 web: at /var/app/current/websocket-server.js:421:19
Mar 30 22:27:49 ip-172-31-93-174 web: at new Promise (<anonymous>)
Mar 30 22:27:49 ip-172-31-93-174 web: at stopActiveScript (/var/app/current/websocket-server.js:420:9)
Mar 30 22:27:49 ip-172-31-93-174 web: at /var/app/current/websocket-server.js:456:3
Mar 30 22:27:49 ip-172-31-93-174 web: at new Promise (<anonymous>)
Mar 30 22:27:49 ip-172-31-93-174 web: at stopEverything (/var/app/current/websocket-server.js:455:9)
Mar 30 22:27:49 ip-172-31-93-174 web: at WebSocket.<anonymous> (/var/app/current/websocket-server.js:262:4)
Mar 30 22:27:49 ip-172-31-93-174 web: at WebSocket.emit (node:events:525:35)
Mar 30 22:27:49 ip-172-31-93-174 web: at WebSocket.emitClose (/var/app/current/node_modules/ws/lib/websocket.js:258:10)
Mar 30 22:27:49 ip-172-31-93-174 web: at Socket.socketOnClose (/var/app/current/node_modules/ws/lib/websocket.js:1260:15)
Mar 30 22:27:49 ip-172-31-93-174 web: [SN: ##-###] Failed to stop everything
Mar 30 22:27:49 ip-172-31-93-174 web: 1
Mar 30 22:27:49 ip-172-31-93-174 web: 2 ##-###
Mar 30 22:27:49 ip-172-31-93-174 web: [SN: ##-###] Trying to unsubscribe. Subscribed
Mar 30 22:27:49 ip-172-31-93-174 web: [DEBUG] [2023-03-30T22:27:49Z] [00007f77ad9cd7c0] [mqtt-client] - id=0x5a04380: Starting unsubscribe 10
Mar 30 22:27:49 ip-172-31-93-174 web: /tmp/start-d0af7a53.sh: line 1: 30833 Segmentation fault      node index.js

I will post the log of TRACE level in a bit

samuelplexcelon commented 1 year ago

Here is a trace level log:

Mar 30 22:48:12 ip-172-31-93-174 web: [DEBUG] [2023-03-30T22:48:12Z] [00007fe965189700] [task-scheduler] - id=0x7fe95402dae8: Scheduling websocket_move_synced_data_to_thread task for immediate execution
Mar 30 22:48:12 ip-172-31-93-174 web: [DEBUG] [2023-03-30T22:48:12Z] [00007fe965189700] [task-scheduler] - id=0x7fe95402dae8: Running websocket_move_synced_data_to_thread task with <Running> status
Mar 30 22:48:12 ip-172-31-93-174 web: [DEBUG] [2023-03-30T22:48:12Z] [00007fe965189700] [task-scheduler] - id=0x7fe95402b7f0: Scheduling socket_written_task task for immediate execution
Mar 30 22:48:12 ip-172-31-93-174 web: [DEBUG] [2023-03-30T22:48:12Z] [00007fe965189700] [task-scheduler] - id=0x7fe95402b7f0: Running socket_written_task task with <Running> status
Mar 30 22:48:19 ip-172-31-93-174 web: 1.2.3.4
Mar 30 22:48:19 ip-172-31-93-174 web: ::ffff:127.0.0.1 - - [30/Mar/2023:22:48:19 +0000] "GET / HTTP/1.1" 200 -
Mar 30 22:48:19 ip-172-31-93-174 web: 1.2.3.4
Mar 30 22:48:19 ip-172-31-93-174 web: ::ffff:127.0.0.1 - - [30/Mar/2023:22:48:19 +0000] "GET / HTTP/1.1" 200 -
Mar 30 22:48:20 ip-172-31-93-174 web: 1.2.3.4
Mar 30 22:48:20 ip-172-31-93-174 web: ::ffff:127.0.0.1 - - [30/Mar/2023:22:48:20 +0000] "GET / HTTP/1.1" 200 -
Mar 30 22:48:34 ip-172-31-93-174 web: 1.2.3.4
Mar 30 22:48:34 ip-172-31-93-174 web: ::ffff:127.0.0.1 - - [30/Mar/2023:22:48:34 +0000] "GET / HTTP/1.1" 200 -
Mar 30 22:48:34 ip-172-31-93-174 web: 1.2.3.4
Mar 30 22:48:34 ip-172-31-93-174 web: ::ffff:127.0.0.1 - - [30/Mar/2023:22:48:34 +0000] "GET / HTTP/1.1" 200 -
Mar 30 22:48:37 ip-172-31-93-174 web: Web socket server connection
Mar 30 22:48:37 ip-172-31-93-174 web: We connected to the central server!
Mar 30 22:48:37 ip-172-31-93-174 web: releasing port: 1234
Mar 30 22:48:37 ip-172-31-93-174 web: Forcing port release
Mar 30 22:48:37 ip-172-31-93-174 web: Central connection socket once message
Mar 30 22:48:37 ip-172-31-93-174 web: [SN: [SN: c36f91a4-8318-484f-971a-c3e77fbe0dc0]] CENTRAL SERVER: {"deviceName":"##-###"}
Mar 30 22:48:37 ip-172-31-93-174 web: Subscribing to shadow client
Mar 30 22:48:37 ip-172-31-93-174 web: [DEBUG] [2023-03-30T22:48:37Z] [00007fe965189700] [task-scheduler] - id=0x7fe95402baf0: Scheduling schedule_cross_thread_tasks task for immediate execution
Mar 30 22:48:37 ip-172-31-93-174 web: [DEBUG] [2023-03-30T22:48:37Z] [00007fe965189700] [task-scheduler] - id=0x7fe95402baf0: Running schedule_cross_thread_tasks task with <Running> status
Mar 30 22:48:37 ip-172-31-93-174 web: [DEBUG] [2023-03-30T22:48:37Z] [00007fe965189700] [mqtt-topic-tree] - tree=0x6c8f218: Inserting topic filter $aws/things/##-###/shadow/update/accepted into topic tree
Mar 30 22:48:37 ip-172-31-93-174 web: [DEBUG] [2023-03-30T22:48:37Z] [00007fe965189700] [websocket] - id=0x7fe95402da70: Enqueuing outgoing frame with opcode=2(binary) length=48 fin=T
Mar 30 22:48:37 ip-172-31-93-174 web: [DEBUG] [2023-03-30T22:48:37Z] [00007fe965189700] [task-scheduler] - id=0x7fe95402dae8: Scheduling websocket_move_synced_data_to_thread task for immediate execution
Mar 30 22:48:37 ip-172-31-93-174 web: [DEBUG] [2023-03-30T22:48:37Z] [00007fe965189700] [task-scheduler] - id=0x7fe95402dae8: Running websocket_move_synced_data_to_thread task with <Running> status
Mar 30 22:48:37 ip-172-31-93-174 web: [DEBUG] [2023-03-30T22:48:37Z] [00007fe965189700] [task-scheduler] - id=0x7fe95402b7f0: Scheduling socket_written_task task for immediate execution
Mar 30 22:48:37 ip-172-31-93-174 web: [DEBUG] [2023-03-30T22:48:37Z] [00007fe965189700] [task-scheduler] - id=0x7fe95402b7f0: Running socket_written_task task with <Running> status
Mar 30 22:48:37 ip-172-31-93-174 web: [DEBUG] [2023-03-30T22:48:37Z] [00007fe97645d7c0] [mqtt-client] - id=0x6c8efa0: Starting subscribe 3 on topic $aws/things/##-###/shadow/update/accepted
Mar 30 22:48:37 ip-172-31-93-174 web: stdout:
Mar 30 22:48:37 ip-172-31-93-174 web: stderr:
Mar 30 22:48:37 ip-172-31-93-174 web: [DEBUG] [2023-03-30T22:48:37Z] [00007fe965189700] [mqtt-client] - id=0x6c8efa0: Subscribe 3 completed with error code 0
Mar 30 22:48:37 ip-172-31-93-174 web: Subscribed to mqtt Not yet unsubscribed
Mar 30 22:48:37 ip-172-31-93-174 web: [SN: ##-###] Trying with IP 1.2.3.4 and Port 1234
Mar 30 22:48:37 ip-172-31-93-174 web: [DEBUG] [2023-03-30T22:48:37Z] [00007fe97645d7c0] [mqtt-client] - id=0x6c8c760: Starting publish 3 to topic $aws/things/##-###/shadow/update
Mar 30 22:48:37 ip-172-31-93-174 web: [DEBUG] [2023-03-30T22:48:37Z] [00007fe965189700] [task-scheduler] - id=0x7fe954001bb0: Scheduling schedule_cross_thread_tasks task for immediate execution
Mar 30 22:48:37 ip-172-31-93-174 web: [DEBUG] [2023-03-30T22:48:37Z] [00007fe965189700] [task-scheduler] - id=0x7fe954001bb0: Running schedule_cross_thread_tasks task with <Running> status
Mar 30 22:48:37 ip-172-31-93-174 web: [DEBUG] [2023-03-30T22:48:37Z] [00007fe965189700] [websocket] - id=0x7fe954016e20: Enqueuing outgoing frame with opcode=2(binary) length=188 fin=T
Mar 30 22:48:37 ip-172-31-93-174 web: [DEBUG] [2023-03-30T22:48:37Z] [00007fe965189700] [task-scheduler] - id=0x7fe954016e98: Scheduling websocket_move_synced_data_to_thread task for immediate execution
Mar 30 22:48:37 ip-172-31-93-174 web: [DEBUG] [2023-03-30T22:48:37Z] [00007fe965189700] [task-scheduler] - id=0x7fe954016e98: Running websocket_move_synced_data_to_thread task with <Running> status
Mar 30 22:48:37 ip-172-31-93-174 web: [DEBUG] [2023-03-30T22:48:37Z] [00007fe965189700] [task-scheduler] - id=0x7fe954000f30: Scheduling socket_written_task task for immediate execution
Mar 30 22:48:37 ip-172-31-93-174 web: [DEBUG] [2023-03-30T22:48:37Z] [00007fe965189700] [task-scheduler] - id=0x7fe954000f30: Running socket_written_task task with <Running> status
Mar 30 22:48:37 ip-172-31-93-174 web: [DEBUG] [2023-03-30T22:48:37Z] [00007fe965189700] [mqtt-client] - id=0x6c8c760: Publish 3 complete
Mar 30 22:48:37 ip-172-31-93-174 web: [DEBUG] [2023-03-30T22:48:37Z] [00007fe965189700] [websocket] - id=0x7fe95402da70: Enqueuing outgoing frame with opcode=2(binary) length=4 fin=T
Mar 30 22:48:37 ip-172-31-93-174 web: [DEBUG] [2023-03-30T22:48:37Z] [00007fe965189700] [task-scheduler] - id=0x7fe95402dae8: Scheduling websocket_move_synced_data_to_thread task for immediate execution
Mar 30 22:48:37 ip-172-31-93-174 web: [DEBUG] [2023-03-30T22:48:37Z] [00007fe965189700] [task-scheduler] - id=0x7fe95402dae8: Running websocket_move_synced_data_to_thread task with <Running> status
Mar 30 22:48:37 ip-172-31-93-174 web: [DEBUG] [2023-03-30T22:48:37Z] [00007fe965189700] [task-scheduler] - id=0x7fe95402b7f0: Scheduling socket_written_task task for immediate execution
Mar 30 22:48:37 ip-172-31-93-174 web: [DEBUG] [2023-03-30T22:48:37Z] [00007fe965189700] [task-scheduler] - id=0x7fe95402b7f0: Running socket_written_task task with <Running> status
Mar 30 22:48:37 ip-172-31-93-174 web: Subscription message in!
Mar 30 22:48:38 ip-172-31-93-174 web: [SN: ##-###] Disconnected from central server. Stopping active script.
Mar 30 22:48:38 ip-172-31-93-174 web: Disconnect info:
Mar 30 22:48:38 ip-172-31-93-174 web: Code: 4300
Mar 30 22:48:38 ip-172-31-93-174 web: Reason: 'Client requested close'
Mar 30 22:48:38 ip-172-31-93-174 web: [SN: ##-###] Stopping everything!
Mar 30 22:48:38 ip-172-31-93-174 web: [SN: ##-###] Stopping active script
Mar 30 22:48:38 ip-172-31-93-174 web: Unable to stop everything: TypeError: Cannot read properties of null (reading 'StopActiveScript')
Mar 30 22:48:38 ip-172-31-93-174 web: at /var/app/current/websocket-server.js:421:19
Mar 30 22:48:38 ip-172-31-93-174 web: at new Promise (<anonymous>)
Mar 30 22:48:38 ip-172-31-93-174 web: at stopActiveScript (/var/app/current/websocket-server.js:420:9)
Mar 30 22:48:38 ip-172-31-93-174 web: at /var/app/current/websocket-server.js:456:3
Mar 30 22:48:38 ip-172-31-93-174 web: at new Promise (<anonymous>)
Mar 30 22:48:38 ip-172-31-93-174 web: at stopEverything (/var/app/current/websocket-server.js:455:9)
Mar 30 22:48:38 ip-172-31-93-174 web: at WebSocket.<anonymous> (/var/app/current/websocket-server.js:262:4)
Mar 30 22:48:38 ip-172-31-93-174 web: at WebSocket.emit (node:events:525:35)
Mar 30 22:48:38 ip-172-31-93-174 web: at WebSocket.emitClose (/var/app/current/node_modules/ws/lib/websocket.js:258:10)
Mar 30 22:48:38 ip-172-31-93-174 web: at Socket.socketOnClose (/var/app/current/node_modules/ws/lib/websocket.js:1260:15)
Mar 30 22:48:38 ip-172-31-93-174 web: [SN: ##-###] Failed to stop everything
Mar 30 22:48:38 ip-172-31-93-174 web: 1
Mar 30 22:48:38 ip-172-31-93-174 web: 2 ##-###
Mar 30 22:48:38 ip-172-31-93-174 web: [SN: ##-###] Trying to unsubscribe. Subscribed
Mar 30 22:48:38 ip-172-31-93-174 web: [DEBUG] [2023-03-30T22:48:38Z] [00007fe97645d7c0] [mqtt-client] - id=0x6c8efa0: Starting unsubscribe 4
Mar 30 22:48:38 ip-172-31-93-174 web: /tmp/start-7e35a7e6.sh: line 1: 31627 Segmentation fault      node index.js
samuelplexcelon commented 1 year ago

@jmklix Has there been any update on this? I'm looking to get this stable next week if possible. I'll switch over to a different lib in the meantime if we can't get it resolved

TwistedTwigleg commented 1 year ago

Hey, apologizes for the long delay! I have tried this week to reproduce this issue again but was unsuccessful in getting a segfault on the subscribe or unsubscribe, even when doing many back-to-back and without using await. Looking at the logs provided, unfortunately I do not really see anything that gives too much of a potential lead to work with. Nothing in the code provided looks like it would be causing this issue from what I can tell.

If at all possible, can you provide a minimal project that exhibits the segfault that you are seeing? Despite my best efforts I haven't been able to reproduce the issue with subscribes/unsubscribes, so if you are able to provide a minimal, complete code sample/example project that shows the issue, that would be extremely helpful in making it where we can diagnose the segfault and fix it.

jthoward64 commented 1 year ago

I am taking over this issue on our side, looks like some extra synchronization (queueing unsubscribe behind subscribe) solved our initial problem (although it is possible that the deeper issue remains). However I am still running into this error:

I've been running a local test server on a raspi 3, and got this just now:

[SN: CD-118] Failed to stop everything
1
2 CD-118
[SN: CD-118] Trying to unsubscribe. Subscribed
Fatal error condition occurred in /aws-crt-nodejs/crt/aws-c-common/source/allocator.c:210: allocator->mem_release != ((void *)0)
Exiting Application
No call stack information available

I'm not sure if it's the same error I get in the EC2, but I figured it might be related.

Based on GDB, it looks like this is the call stack that is crashing

Thread 8 "AwsEventLoop 1" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0xfffff4def0a0 (LWP 10579)]
0x0000fffff5046dac in aws_mem_release () from /home/ubuntu/project-name/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node
(gdb) where
#0  0x0000fffff5046dac in aws_mem_release () from /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node
#1  0x0000fffff4ec5d64 in aws_mqtt_topic_tree_transaction_commit () from /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node
#2  0x0000fffff4ebac24 in ?? () from /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node
#3  0x0000fffff4ec0734 in ?? () from /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node
#4  0x0000fffff4ef3d54 in ?? () from /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node
#5  0x0000fffff5055e54 in ?? () from /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node
#6  0x0000fffff4efd998 in ?? () from /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node
#7  0x0000fffff50541e8 in ?? () from /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node
#8  0x0000fffff7b4d5c8 in start_thread (arg=0x0) at ./nptl/pthread_create.c:442
#9  0x0000fffff7bb5d1c in thread_start () at ../sysdeps/unix/sysv/linux/aarch64/clone.S:79

Also note that this is on the same Pi 3 that generated that error before

$ uname -a
Linux ubuntu 5.15.0-1029-raspi #31-Ubuntu SMP PREEMPT Sat Apr 22 12:26:40 UTC 2023 aarch64 aarch64 aarch64 GNU/Linux

Seems like the condition AWS_FATAL_PRECONDITION(allocator->mem_release != NULL); here is failing, any ideas as to where my next target should be as far as investigating this issue? I can't find the source for aws_mqtt_topic_tree_transaction_commit and I'm not sure where it is called from. Thanks.

TwistedTwigleg commented 1 year ago

Thanks for the reply @jthoward64

Based on the new information, it seems like it might be something with the topic tree and the memory within the topic tree that is the issue then. I think it could be that some part of memory is being freed twice, leading to the allocator mem_release being set to null. Regardless, this new info gives a rough location to start looking at for clues as to what could be going on.

The aws_mqtt_topic_tree_transaction_commit function is located here in aws-c-mqtt: https://github.com/awslabs/aws-c-mqtt/blob/e6ea3cc88d1654515519701b060a4c21098fcc8c/source/topic_tree.c#L788. I haven't found where in that function it would be calling into aws_mem_release yet though, but my guess is that it's somewhere in s_topic_tree_action_commit. I am looking through the code to see if I can gleam what might be going on.

Edit: As to where the function is being called from, my guess would be this line in client.c in aws-c-mqtt: https://github.com/awslabs/aws-c-mqtt/blob/e6ea3cc88d1654515519701b060a4c21098fcc8c/source/client.c#L2843

jthoward64 commented 1 year ago

Thanks for the reply @jthoward64

Based on the new information, it seems like it might be something with the topic tree and the memory within the topic tree that is the issue then. I think it could be that some part of memory is being freed twice, leading to the allocator mem_release being set to null. Regardless, this new info gives a rough location to start looking at for clues as to what could be going on.

The aws_mqtt_topic_tree_transaction_commit function is located here in aws-c-mqtt: https://github.com/awslabs/aws-c-mqtt/blob/e6ea3cc88d1654515519701b060a4c21098fcc8c/source/topic_tree.c#L788. I haven't found where in that function it would be calling into aws_mem_release yet though, but my guess is that it's somewhere in s_topic_tree_action_commit. I am looking through the code to see if I can gleam what might be going on.

Edit: As to where the function is being called from, my guess would be this line in client.c in aws-c-mqtt: https://github.com/awslabs/aws-c-mqtt/blob/e6ea3cc88d1654515519701b060a4c21098fcc8c/source/client.c#L2843

I think MAYBE it is being called as part of aws_string_destory (if GDB is right) based on this decomplication:

...
   0x0000fffff4ec5d18 <+1400>:  ldr     x6, [x1]
   0x0000fffff4ec5d1c <+1404>:  mov     w2, #0x1402                     // #5122
   0x0000fffff4ec5d20 <+1408>:  mov     w1, #0x6                        // #6
   0x0000fffff4ec5d24 <+1412>:  blr     x6
   0x0000fffff4ec5d28 <+1416>:  ldr     x0, [sp, #208]
   0x0000fffff4ec5d2c <+1420>:  cbz     x0, 0xfffff4ec5d44 <aws_mqtt_topic_tree_transaction_commit+1444>
   0x0000fffff4ec5d30 <+1424>:  add     x1, x0, #0x10
   0x0000fffff4ec5d34 <+1428>:  str     x0, [x28, #24]
   0x0000fffff4ec5d38 <+1432>:  add     x1, x1, x21
   0x0000fffff4ec5d3c <+1436>:  str     x1, [x28, #8]
   0x0000fffff4ec5d40 <+1440>:  b       0xfffff4ec5c98 <aws_mqtt_topic_tree_transaction_commit+1272>
   0x0000fffff4ec5d44 <+1444>:  ldr     x1, [sp, #184]
   0x0000fffff4ec5d48 <+1448>:  add     x2, sp, #0xd0
   0x0000fffff4ec5d4c <+1452>:  add     x0, x28, #0x10
   0x0000fffff4ec5d50 <+1456>:  str     x25, [sp, #208]
   0x0000fffff4ec5d54 <+1460>:  bl      0xfffff4e76760 <aws_hash_table_foreach@plt>
   0x0000fffff4ec5d58 <+1464>:  ldr     x27, [sp, #152]
   0x0000fffff4ec5d5c <+1468>:  ldr     x0, [x27, #24]
   0x0000fffff4ec5d60 <+1472>:  bl      0xfffff4e75770 <aws_string_destroy@plt>
=> 0x0000fffff4ec5d64 <+1476>:  strb    wzr, [x27, #32]
   0x0000fffff4ec5d68 <+1480>:  ldr     x0, [sp, #208]
   0x0000fffff4ec5d6c <+1484>:  b       0xfffff4ec5d30 <aws_mqtt_topic_tree_transaction_commit+1424>
   0x0000fffff4ec5d70 <+1488>:  bl      0xfffff4ec4570
   0x0000fffff4ec5d74 <+1492>:  adrp    x1, 0xfffff505e000
   0x0000fffff4ec5d78 <+1496>:  adrp    x0, 0xfffff505e000
   0x0000fffff4ec5d7c <+1500>:  add     x1, x1, #0xf0
   0x0000fffff4ec5d80 <+1504>:  add     x0, x0, #0x138
   0x0000fffff4ec5d84 <+1508>:  mov     w2, #0x152                      // #338
   0x0000fffff4ec5d88 <+1512>:  bl      0xfffff4e75f00 <aws_fatal_assert@plt>
jthoward64 commented 1 year ago
Web socket server connection
We connected to the central server!
Central connection socket once message
Subscribing to shadow client
subscribing to $aws/things/0000-0000/shadow/update/accepted

Thread 8 "AwsEventLoop 1" hit Breakpoint 1, 0x0000fffff4ec57b0 in aws_mqtt_topic_tree_transaction_commit () from /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node
(gdb) where
#0  0x0000fffff4ec57b0 in aws_mqtt_topic_tree_transaction_commit () from /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node
#1  0x0000fffff4ebb96c in ?? () from /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node
#2  0x0000fffff4ec0734 in ?? () from /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node
#3  0x0000fffff4ef3d54 in ?? () from /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node
#4  0x0000fffff5055e54 in ?? () from /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node
#5  0x0000fffff4efd998 in ?? () from /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node
#6  0x0000fffff50541e8 in ?? () from /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node
#7  0x0000fffff7b4d5c8 in start_thread (arg=0x0) at ./nptl/pthread_create.c:442
#8  0x0000fffff7bb5d1c in thread_start () at ../sysdeps/unix/sysv/linux/aarch64/clone.S:79
(gdb) c
Continuing.
Subscribed to mqtt Not yet unsubscribed
*T* publishing to $aws/things/0000-0000/shadow/update
Subscription message in!
Subscription message in!
[SN: 0000-0000] incoming mqtt message:
 ...messages and unrelated logs...
[SN: 0000-0000] Trying to unsubscribe. Subscribed
*T* unsubscribing from $aws/things/0000-0000/shadow/update
*T* has subscribed, unsubscribing

Thread 8 "AwsEventLoop 1" hit Breakpoint 1, 0x0000fffff4ec57b0 in aws_mqtt_topic_tree_transaction_commit () from /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node
(gdb) where
#0  0x0000fffff4ec57b0 in aws_mqtt_topic_tree_transaction_commit () from /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node
#1  0x0000fffff4ebac24 in ?? () from /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node
#2  0x0000fffff4ec0734 in ?? () from /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node
#3  0x0000fffff4ef3d54 in ?? () from /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node
#4  0x0000fffff5055e54 in ?? () from /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node
#5  0x0000fffff4efd998 in ?? () from /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node
#6  0x0000fffff50541e8 in ?? () from /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node
#7  0x0000fffff7b4d5c8 in start_thread (arg=0x0) at ./nptl/pthread_create.c:442
#8  0x0000fffff7bb5d1c in thread_start () at ../sysdeps/unix/sysv/linux/aarch64/clone.S:79
(gdb) c
Continuing.
[SN: 0000-0000] MQTT UNSUB
*T* publishing to $aws/things/0000-0000/shadow/update

Alright, it looks like aws_mqtt_topic_tree_transaction_commit is being called twice, once in our connection handler and once in our disconnection handler. Is it possible that them running back-to-back is corrupting the application state somehow?

TwistedTwigleg commented 1 year ago

I think MAYBE it is being called as part of aws_string_destory (if GDB is right) based on this decomplication:

Hmm, based on the de-compilation then, my guess is the issue is here: https://github.com/awslabs/aws-c-mqtt/blob/e6ea3cc88d1654515519701b060a4c21098fcc8c/source/topic_tree.c#L500

It's in s_topic_tree_action_commit, it's right after aws_hash_table_foreach, and it would be triggered upon an unsubscribe as part of AWS_MQTT_TOPIC_TREE_REMOVE (removing the subscription from the topic tree).


Alright, it looks like aws_mqtt_topic_tree_transaction_commit is being called twice, once in our connection handler and once in our disconnection handler. Is it possible that them running back-to-back is corrupting the application state somehow?

Could be. I don't think its supposed to remove the same topic from the tree twice, which might be part of the problem.

How are you disconnecting? Are you calling disconnect() on the MQTT connection? Edit: Read through the issue again and remembered it was due to a socket disconnect. Please disregard the above question.

jthoward64 commented 1 year ago

It's in s_topic_tree_action_commit, it's right after aws_hash_table_foreach, and it would be triggered upon an unsubscribe as part of AWS_MQTT_TOPIC_TREE_REMOVE (removing the subscription from the topic tree).

Makes Sense


How are you disconnecting? Are you calling disconnect() on the MQTT connection?

I am just taking over this project, but I don't think so. All I see is this.#subClient.unsubscribe('$aws/things/${aString}/shadow/update');. While there is some more stuff in our code now, the general flow is the same as the code in the initial comment

TwistedTwigleg commented 1 year ago

If you can and do not mind, can you try seeing if making the change in this PR fixes the segfault? https://github.com/awslabs/aws-c-mqtt/pull/289

I still haven't been able to reproduce the issue so far, but I am trying to see what I can do on my end. Now that we have a better idea, I'm hoping to be able to be able to reproduce it in some capacity on my end. I am still working on trying to repro the issue and I'll let you know if I am able to reproduce it!

jthoward64 commented 1 year ago

I'm having an issue building aws-crt, that I am trying to figure out, pardon the delay

For reference:

/home/ubuntu/aws-crt-nodejs/crt/aws-c-mqtt/include/aws/mqtt/mqtt.h:14:1: error: unknown type name 'AWS_PUSH_SANE_WARNING_LEVEL'
AWS_PUSH_SANE_WARNING_LEVEL
^
TwistedTwigleg commented 1 year ago

Not a worry - take as long as you need!

It might be because it requires a newer aws-c-common in addition to aws-c-mqtt. What might work better is use the v0.8.11 release (https://github.com/awslabs/aws-c-mqtt/tree/v0.8.11) and then apply the 3 line change manually and see if that works.

That way you should be able to avoid the header changes recently made across all the CRTs and not need to update aws-c-common and the like. Sorry, I forgot that we made those changes recently, as our CI always pulls the latest but the Javascript CRT doesn't point to the latest just yet.

jthoward64 commented 1 year ago

That does seem to be the problem, but that release did not fix the issue. Even after outright deleting those lines I got clang: error: no such file or directory: '/home/ubuntu/aws-crt-nodejs/crt/aws-c-mqtt/source/mqtt3_to_mqtt5_adapter.c'. I don't think the current states of the two repositories are directly compatible for building locally (at least outside of aws labs)

TwistedTwigleg commented 1 year ago

Hmm, that is strange. The mqtt311 to mqtt5 adapter stuff shouldn't be in the v0.8.11 release. Is it possible you have older build files that are pointing to the PR still rather than v0.8.11?

TwistedTwigleg commented 1 year ago

Just wanted to give an update: I'm still trying to reproduce this issue but have been unsuccessful, even knowing that it has to do directly with aws_mqtt_topic_tree_transaction_commit. I'm trying to modify this test and see if I can get a segfault, but so far I have been unsuccessful in my attempts to reproduce what you are seeing. I am still trying though, as I think we're on the right track.

I also realized that it might be easier if I also make a quick PR with the changes in aws-c-mqtt and all the other submodules directly in aws-crt-nodejs, so hopefully you can just use the branch there in aws-c-mqtt instead of having to sort out the submodules yourself. Here's the PR which, I believe you should be able to just clone and then build the CRT like normal: https://github.com/awslabs/aws-crt-nodejs/pull/461

jthoward64 commented 1 year ago

Sounds good, thanks; I'll give that a shot tomorrow morning

jthoward64 commented 1 year ago

Just wanted to give an update: I'm still trying to reproduce this issue but have been unsuccessful, even knowing that it has to do directly with aws_mqtt_topic_tree_transaction_commit. I'm trying to modify this test and see if I can get a segfault, but so far I have been unsuccessful in my attempts to reproduce what you are seeing. I am still trying though, as I think we're on the right track.

I also realized that it might be easier if I also make a quick PR with the changes in aws-c-mqtt and all the other submodules directly in aws-crt-nodejs, so hopefully you can just use the branch there in aws-c-mqtt instead of having to sort out the submodules yourself. Here's the PR which, I believe you should be able to just clone and then build the CRT like normal: #461

Also after this I'm going to try and file down the repo I'm getting the error in into a minimal reproduceable example, if successful I'll send it here.

TwistedTwigleg commented 1 year ago

Great, that would be awesome. Thanks so much for the help and debugging - I think we're getting close to figuring it out and finding a fix.

jthoward64 commented 1 year ago

We are now thinking that it might be a pi-specific (whether it be ARM, the node build, or whatever else) issue (just a guess for now), will let you know as we learn more

jthoward64 commented 1 year ago
Fatal error condition occurred in /aws-crt-nodejs/crt/aws-c-common/source/allocator.c:210: allocator->mem_release != ((void *)0)
Exiting Application
No call stack information available

Thread 8 "AwsEventLoop 1" received signal SIGABRT, Aborted.
[Switching to Thread 0xfffff4def0a0 (LWP 88875)]
__pthread_kill_implementation (threadid=281474789994656, signo=signo@entry=6, no_tid=no_tid@entry=0) at ./nptl/pthread_kill.c:44
44      ./nptl/pthread_kill.c: No such file or directory.
(gdb) where
#0  __pthread_kill_implementation (threadid=281474789994656, signo=signo@entry=6, no_tid=no_tid@entry=0) at ./nptl/pthread_kill.c:44
#1  0x0000fffff7b4f254 in __pthread_kill_internal (signo=6, threadid=<optimized out>) at ./nptl/pthread_kill.c:78
#2  0x0000fffff7b0a67c in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#3  0x0000fffff7af7130 in __GI_abort () at ./stdlib/abort.c:79
#4  0x0000fffff5047594 in aws_fatal_assert () from /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node
#5  0x0000fffff5046de4 in aws_mem_release () from /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node
#6  0x0000fffff4eba098 in ?? () from /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node
#7  0x0000fffff5055360 in aws_ref_count_release () from /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node
#8  0x0000fffff4ec589c in aws_mqtt_topic_tree_transaction_commit () from /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node
#9  0x0000fffff4ebb96c in ?? () from /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node
#10 0x0000fffff4ec0734 in ?? () from /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node
#11 0x0000fffff4ef3d54 in ?? () from /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node
#12 0x0000fffff5055e54 in ?? () from /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node
#13 0x0000fffff4efd998 in ?? () from /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node
#14 0x0000fffff50541e8 in ?? () from /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node
#15 0x0000fffff7b4d5c8 in start_thread (arg=0x0) at ./nptl/pthread_create.c:442
#16 0x0000fffff7bb5d1c in thread_start () at ../sysdeps/unix/sysv/linux/aarch64/clone.S:79

Different gdb trace, seems like same issue

TwistedTwigleg commented 1 year ago

Thanks for the update! I agree that it looks like it's the same issue based on the newest trace.

I'm trying to run the aws-c-mqtt tests on a Raspberry PI and will see if I can reproduce the issue there, since I've so far been unsuccessful on my Mac. I'm hoping that, if it's a Raspberry PI specific issue, I'll be able to see it on my Pi.

jthoward64 commented 1 year ago

I've got a puppeteer script that I am using to more frequently reproduce errors, but I have yet to successfully work out a script to reproduce the error outside of our codebase. However, on Tuesday we are going to move this code over to our dev server (x86 EC2 instance) so we can see for sure if it is RPi/Linux+ARM specific

jthoward64 commented 1 year ago

Looks like we're gonna hold off on pushing to dev, but I did get our server running on my windows machine under WSL, and while I do still get the original error, I have yet to see the allocator->mem_release != ((void *)0) crash.

Also, I've started using the segfault handler npm package and this is it's output

################################################################################
Stack trace:
################################################################################
/home/tagho/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-x64/aws-crt-nodejs.node(aws_backtrace_print+0x41) [0x7f1cbe3dfd31]
/home/tagho/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-x64/aws-crt-nodejs.node(+0x80a33) [0x7f1cbe080a33]
/usr/lib/libc.so.6(+0x39ab0) [0x7f1cc73a3ab0]
/home/tagho/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-x64/aws-crt-nodejs.node(aws_mem_release+0x9) [0x7f1cbe3d28c9]
/home/tagho/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-x64/aws-crt-nodejs.node(aws_mqtt_topic_tree_transaction_commit+0x6d9) [0x7f1cbe0ca699]
/home/tagho/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-x64/aws-crt-nodejs.node(+0xbdc82) [0x7f1cbe0bdc82]
/home/tagho/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-x64/aws-crt-nodejs.node(+0xc4a5c) [0x7f1cbe0c4a5c]
/home/tagho/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-x64/aws-crt-nodejs.node(+0x10f766) [0x7f1cbe10f766]
/home/tagho/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-x64/aws-crt-nodejs.node(+0x3e262e) [0x7f1cbe3e262e]
/home/tagho/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-x64/aws-crt-nodejs.node(+0x11ad38) [0x7f1cbe11ad38]
/home/tagho/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-x64/aws-crt-nodejs.node(+0x3e0695) [0x7f1cbe3e0695]
/usr/lib/libc.so.6(+0x8744b) [0x7f1cc73f144b]
/usr/lib/libc.so.6(+0x10ae40) [0x7f1cc7474e40]
P.S. here is a TON of stack traces from the pi, again just for reference ```log [UUID: fd29b245-a960-45c0-81df-bc6100306e13] CENTRAL SERVER: {"deviceName":"0000-0000","testId":1134} [SN: 0000-0000] Trying with IP 192.168.1.181 and Port 11000 [SN: 0000-0000] Stopping everything! [... irrelevant logs] [SN: 0000-0000] Trying to unsubscribe. We stopped everything! [SN: 0000-0000] MQTT UNSUB Thread 8 "AwsEventLoop 1" received signal SIGSEGV, Segmentation fault. [Switching to Thread 0xfffff4def0a0 (LWP 80725)] 0x0000fffff5046dac in aws_mem_release () from /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node (gdb) where #0 0x0000fffff5046dac in aws_mem_release () from /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node #1 0x0000fffff4eba3a0 in ?? () from /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node #2 0x0000fffff4ec1cbc in mqtt_request_complete () from /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node #3 0x0000fffff4ec2000 in ?? () from /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node #4 0x0000fffff4ec0408 in ?? () from /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node #5 0x0000fffff4ef0578 in ?? () from /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node #6 0x0000fffff4ef2b4c in ?? () from /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node #7 0x0000fffff4ef3188 in aws_websocket_decoder_process () from /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node #8 0x0000fffff4eeec7c in ?? () from /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node #9 0x0000fffff4e9b4ec in aws_h1_connection_try_process_read_messages () from /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node #10 0x0000fffff4e9bb74 in ?? () from /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node #11 0x0000fffff4f0697c in ?? () from /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node #12 0x0000fffff4f088cc in ?? () from /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node #13 0x0000fffff4f09048 in ?? () from /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node #14 0x0000fffff4f04d38 in ?? () from /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node #15 0x0000fffff4efd7c0 in ?? () from /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node #16 0x0000fffff50541e8 in ?? () from /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nod--Type for more, q to quit, c to continue without paging--c ejs.node #17 0x0000fffff7b4d5c8 in start_thread (arg=0x0) at ./nptl/pthread_create.c:442 #18 0x0000fffff7bb5d1c in thread_start () at ../sysdeps/unix/sysv/linux/aarch64/clone.S:79 ``` ```log Thread 8 "AwsEventLoop 1" received signal SIGSEGV, Segmentation fault. [Switching to Thread 0xfffff4def0a0 (LWP 88113)] 0x0000fffff5046dac in aws_mem_release () from /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node (gdb) where #0 0x0000fffff5046dac in aws_mem_release () from /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node #1 0x0000fffff4eba3a0 in ?? () from /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node #2 0x0000fffff4ec1cbc in mqtt_request_complete () from /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node #3 0x0000fffff4ec2000 in ?? () from /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node #4 0x0000fffff4ec0408 in ?? () from /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node #5 0x0000fffff4ef0578 in ?? () from /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node #6 0x0000fffff4ef2b4c in ?? () from /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node #7 0x0000fffff4ef3188 in aws_websocket_decoder_process () from /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node #8 0x0000fffff4eeec7c in ?? () from /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node #9 0x0000fffff4e9b4ec in aws_h1_connection_try_process_read_messages () from /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node #10 0x0000fffff4e9bb74 in ?? () from /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node #11 0x0000fffff4f0697c in ?? () from /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node #12 0x0000fffff4f088cc in ?? () from /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node #13 0x0000fffff4f09048 in ?? () from /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node #14 0x0000fffff4f04d38 in ?? () from /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node #15 0x0000fffff4efd7c0 in ?? () from /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node #16 0x0000fffff50541e8 in ?? () from /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node #17 0x0000fffff7b4d5c8 in start_thread (arg=0x0) at ./nptl/pthread_create.c:442 #18 0x0000fffff7bb5d1c in thread_start () at ../sysdeps/unix/sysv/linux/aarch64/clone.S:79 ``` ```log Thread 8 "AwsEventLoop 1" received signal SIGSEGV, Segmentation fault. [Switching to Thread 0xfffff4def0a0 (LWP 90777)] 0x0000440000000000 in ?? () (gdb) where #0 0x0000440000000000 in ?? () #1 0x0000fffff4eba3a0 in ?? () from /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node #2 0x0000fffff4ec1cbc in mqtt_request_complete () from /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node #3 0x0000fffff4ec2000 in ?? () from /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node #4 0x0000fffff4ec0408 in ?? () from /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node #5 0x0000fffff4ef0578 in ?? () from /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node #6 0x0000fffff4ef2b4c in ?? () from /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node #7 0x0000fffff4ef3188 in aws_websocket_decoder_process () from /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node #8 0x0000fffff4eeec7c in ?? () from /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node #9 0x0000fffff4e9b4ec in aws_h1_connection_try_process_read_messages () from /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node #10 0x0000fffff4e9bb74 in ?? () from /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node #11 0x0000fffff4f0697c in ?? () from /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node #12 0x0000fffff4f088cc in ?? () from /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node #13 0x0000fffff4f09048 in ?? () from /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node #14 0x0000fffff4f04d38 in ?? () from /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node #15 0x0000fffff4efd7c0 in ?? () from /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node #16 0x0000fffff50541e8 in ?? () from /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node #17 0x0000fffff7b4d5c8 in start_thread (arg=0x0) at ./nptl/pthread_create.c:442 #18 0x0000fffff7bb5d1c in thread_start () at ../sysdeps/unix/sysv/linux/aarch64/clone.S:79 ``` ```log About to enter native code for: mqtt.MqttClientConnection.unsubscribe Trace at #dispatchUnsubscribe (/home/ubuntu/companion-instrument-server/aws-shadow-client/index.js:251:15) at #dispatchAction (/home/ubuntu/companion-instrument-server/aws-shadow-client/index.js:297:51) at Timeout.dispatch [as _onTimeout] (/home/ubuntu/companion-instrument-server/aws-shadow-client/index.js:331:37) at listOnTimeout (node:internal/timers:564:17) at process.processTimers (node:internal/timers:507:7) Thread 8 "AwsEventLoop 1" received signal SIGBUS, Bus error. [Switching to Thread 0xfffff4def0a0 (LWP 112455)] 0x0000000000000051 in ?? () (gdb) where #0 0x0000000000000051 in ?? () #1 0x0000fffff4ec5d64 in aws_mqtt_topic_tree_transaction_commit () from /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node #2 0x0000fffff4ebac24 in ?? () from /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node #3 0x0000fffff4ec0734 in ?? () from /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node #4 0x0000fffff4ef3d54 in ?? () from /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node #5 0x0000fffff5055e54 in ?? () from /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node #6 0x0000fffff4efd998 in ?? () from /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node #7 0x0000fffff50541e8 in ?? () from /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node #8 0x0000fffff7b4d5c8 in start_thread (arg=0x0) at ./nptl/pthread_create.c:442 #9 0x0000fffff7bb5d1c in thread_start () at ../sysdeps/unix/sysv/linux/aarch64/clone.S:79 ``` ```log About to enter native code for: mqtt.MqttClientConnection.unsubscribe Trace at #dispatchUnsubscribe (/home/ubuntu/companion-instrument-server/aws-shadow-client/index.js:251:15) at #dispatchAction (/home/ubuntu/companion-instrument-server/aws-shadow-client/index.js:297:51) at Timeout.dispatch [as _onTimeout] (/home/ubuntu/companion-instrument-server/aws-shadow-client/index.js:331:37) at listOnTimeout (node:internal/timers:564:17) at process.processTimers (node:internal/timers:507:7) Fatal error condition occurred in /aws-crt-nodejs/crt/aws-c-common/source/allocator.c:210: allocator->mem_release != ((void *)0) Exiting Application Exited native code for: mqtt.MqttClientConnection.unsubscribe { packet_id: 8 } [SN: 0000-0000] MQTT UNSUB About to enter native code for: mqtt.MqttClientConnection.publish Trace at #dispatchPublish (/home/ubuntu/companion-instrument-server/aws-shadow-client/index.js:269:15) at #dispatchAction (/home/ubuntu/companion-instrument-server/aws-shadow-client/index.js:300:47) at Timeout.dispatch [as _onTimeout] (/home/ubuntu/companion-instrument-server/aws-shadow-client/index.js:331:37) at listOnTimeout (node:internal/timers:564:17) at process.processTimers (node:internal/timers:507:7) ################################################################################ Stack trace: ################################################################################ /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(aws_backtrace_print+0x4c) [0xffffa4437150] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(aws_fatal_assert+0x50) [0xffffa442a870] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(aws_mem_release+0x44) [0xffffa442a0b4] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(+0xb6160) [0xffffa4246160] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(mqtt_request_complete+0xcc) [0xffffa424e2ec] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(+0xbe630) [0xffffa424e630] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(+0xbc998) [0xffffa424c998] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(+0xf62a8) [0xffffa42862a8] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(+0xf887c) [0xffffa428887c] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(aws_websocket_decoder_process+0x38) [0xffffa4288eb8] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(+0xf49ac) [0xffffa42849ac] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(aws_h1_connection_try_process_read_messages+0x3b8) [0xffffa4226c68] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(+0x97304) [0xffffa4227304] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(+0x110f4c) [0xffffa42a0f4c] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(+0x112edc) [0xffffa42a2edc] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(+0x113658) [0xffffa42a3658] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(+0x10f308) [0xffffa429f308] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(+0x107d80) [0xffffa4297d80] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(+0x2a7b78) [0xffffa4437b78] /lib/aarch64-linux-gnu/libc.so.6(+0x7d5c8) [0xffffa5f2d5c8] /lib/aarch64-linux-gnu/libc.so.6(+0xe5d1c) [0xffffa5f95d1c] Signal received: 1852270931, errno: 1914727521 ################################################################################ Stack trace: ################################################################################ /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(aws_backtrace_print+0x4c) [0xffffa4437150] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(+0x7f1c8) [0xffffa420f1c8] linux-vdso.so.1(__kernel_rt_sigreturn+0) [0xffffa63d47dc] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(aws_backtrace_print+0x28) [0xffffa443712c] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(+0x7f1c8) [0xffffa420f1c8] linux-vdso.so.1(__kernel_rt_sigreturn+0) [0xffffa63d47dc] /lib/aarch64-linux-gnu/libc.so.6(+0x7f200) [0xffffa5f2f200] /lib/aarch64-linux-gnu/libc.so.6(raise+0x1c) [0xffffa5eea67c] /lib/aarch64-linux-gnu/libc.so.6(abort+0xe4) [0xffffa5ed7130] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(+0x29a874) [0xffffa442a874] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(aws_mem_release+0x44) [0xffffa442a0b4] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(+0xb6160) [0xffffa4246160] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(mqtt_request_complete+0xcc) [0xffffa424e2ec] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(+0xbe630) [0xffffa424e630] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(+0xbc998) [0xffffa424c998] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(+0xf62a8) [0xffffa42862a8] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(+0xf887c) [0xffffa428887c] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(aws_websocket_decoder_process+0x38) [0xffffa4288eb8] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(+0xf49ac) [0xffffa42849ac] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(aws_h1_connection_try_process_read_messages+0x3b8) [0xffffa4226c68] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(+0x97304) [0xffffa4227304] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(+0x110f4c) [0xffffa42a0f4c] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(+0x112edc) [0xffffa42a2edc] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(+0x113658) [0xffffa42a3658] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(+0x10f308) [0xffffa429f308] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(+0x107d80) [0xffffa4297d80] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(+0x2a7b78) [0xffffa4437b78] /lib/aarch64-linux-gnu/libc.so.6(+0x7d5c8) [0xffffa5f2d5c8] /lib/aarch64-linux-gnu/libc.so.6(+0xe5d1c) [0xffffa5f95d1c] ``` ```log Web socket server connection We connected to the central server! releasing port: 11000 PORT 11000 RELEASED Central connection socket once message [UUID: 571804be-2256-4fe9-b08e-bd805216fcd2] CENTRAL SERVER: {"deviceName":"0000-0000","testId":2203} About to enter native code for: mqtt.MqttClientConnection.subscribe Trace at #dispatchSubscribe (/home/ubuntu/companion-instrument-server/aws-shadow-client/index.js:228:15) at #dispatchAction (/home/ubuntu/companion-instrument-server/aws-shadow-client/index.js:296:49) at Timeout.dispatch [as _onTimeout] (/home/ubuntu/companion-instrument-server/aws-shadow-client/index.js:333:37) at listOnTimeout (node:internal/timers:564:17) at process.processTimers (node:internal/timers:507:7) Exited native code for: mqtt.MqttClientConnection.subscribe [SN: 0000-0000] Trying with IP 192.168.1.181 and Port 11000 About to enter native code for: mqtt.MqttClientConnection.publish Trace at #dispatchPublish (/home/ubuntu/companion-instrument-server/aws-shadow-client/index.js:271:15) at #dispatchAction (/home/ubuntu/companion-instrument-server/aws-shadow-client/index.js:302:47) at Timeout.dispatch [as _onTimeout] (/home/ubuntu/companion-instrument-server/aws-shadow-client/index.js:333:37) at listOnTimeout (node:internal/timers:564:17) at process.processTimers (node:internal/timers:507:7) Exited native code for: mqtt.MqttClientConnection.publish [SN: 0000-0000] Disconnected from central server. Stopping active script. Disconnect info: Code: 4300 Reason: 'Client requested close' [SN: 0000-0000] Stopping everything! [SN: 0000-0000] gcodeGrpcClient is nullish, skipping stopEverything [SN: 0000-0000] Stopping active script [SN: 0000-0000] gcodeGrpcClient is nullish, skipping stopActiveScript [SN: 0000-0000] Sending instruction 'A0 B0;...' to instrument. [SN: 0000-0000] cadSystemGcodeClient is nullish, skipping stopEverything [SN: 0000-0000] Sending instruction 'A0 V0;...' to instrument. [SN: 0000-0000] cadSystemGcodeClient is nullish, skipping stopEverything [SN: 0000-0000] Display Led Progress 100 [SN: 0000-0000] cadSystemClient is nullish, skipping stopEverything [SN: 0000-0000] Trying to unsubscribe. We stopped everything! About to enter native code for: mqtt.MqttClientConnection.unsubscribe Trace at #dispatchUnsubscribe (/home/ubuntu/companion-instrument-server/aws-shadow-client/index.js:251:15) at #dispatchAction (/home/ubuntu/companion-instrument-server/aws-shadow-client/index.js:299:51) at Timeout.dispatch [as _onTimeout] (/home/ubuntu/companion-instrument-server/aws-shadow-client/index.js:333:37) at listOnTimeout (node:internal/timers:564:17) at process.processTimers (node:internal/timers:507:7) Signal received: 1610765385, errno: 65535 Exited native code for: mqtt.MqttClientConnection.unsubscribe Signal received: 1852270931, errno: 1914727521 { packet_id: 10 } Signal received: 1852270931, errno: 1914727521 [SN: 0000-0000] MQTT UNSUB Signal received: 1852270931, errno: 1914727521 Signal received: 1852270931, errno: 1914727521 Signal received: 1852270931, errno: 1914727521 Signal received: 1852270931, errno: 1914727521 Signal received: 1852270931, errno: 1914727521 Signal received: 1852270931, errno: 1914727521 Signal received: 1852270931, errno: 1914727521 Signal received: 1852270931, errno: 1914727521 Signal received: 1852270931, errno: 1914727521 Signal received: 1852270931, errno: 1914727521 Signal received: 1852270931, errno: 1914727521 About to enter native code for: mqtt.MqttClientConnection.publish Trace at #dispatchPublish (/home/ubuntu/companion-instrument-server/aws-shadow-client/index.js:271:15) at #dispatchAction (/home/ubuntu/companion-instrument-server/aws-shadow-client/index.js:302:47) at Timeout.dispatch [as _onTimeout] (/home/ubuntu/companion-instrument-server/aws-shadow-client/index.js:333:37) at listOnTimeout (node:internal/timers:564:17) at process.processTimers (node:internal/timers:507:7) ################################################################################ Stack trace: ################################################################################ /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(aws_backtrace_print+0x4c) [0xffff8c357150] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(+0x7f1c8) [0xffff8c12f1c8] linux-vdso.so.1(__kernel_rt_sigreturn+0) [0xffff8e2fb7dc] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(aws_backtrace_print+0x28) [0xffff8c35712c] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(+0x7f1c8) [0xffff8c12f1c8] linux-vdso.so.1(__kernel_rt_sigreturn+0) [0xffff8e2fb7dc] /lib/aarch64-linux-gnu/libgcc_s.so.1(+0xd4d8) [0xffff8dfad4d8] /lib/aarch64-linux-gnu/libgcc_s.so.1(_Unwind_Backtrace+0x78) [0xffff8dfaec28] /lib/aarch64-linux-gnu/libc.so.6(__backtrace+0x68) [0xffff8dec30d8] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(aws_backtrace+0x18) [0xffff8c356f18] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(aws_backtrace_print+0x4c) [0xffff8c357150] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(+0x7f1c8) [0xffff8c12f1c8] linux-vdso.so.1(__kernel_rt_sigreturn+0) [0xffff8e2fb7dc] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(aws_backtrace_print+0x28) [0xffff8c35712c] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(+0x7f1c8) [0xffff8c12f1c8] linux-vdso.so.1(__kernel_rt_sigreturn+0) [0xffff8e2fb7dc] /lib/aarch64-linux-gnu/libgcc_s.so.1(+0xd4d8) [0xffff8dfad4d8] /lib/aarch64-linux-gnu/libgcc_s.so.1(_Unwind_Backtrace+0x78) [0xffff8dfaec28] /lib/aarch64-linux-gnu/libc.so.6(__backtrace+0x68) [0xffff8dec30d8] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(aws_backtrace+0x18) [0xffff8c356f18] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(aws_backtrace_print+0x4c) [0xffff8c357150] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(+0x7f1c8) [0xffff8c12f1c8] linux-vdso.so.1(__kernel_rt_sigreturn+0) [0xffff8e2fb7dc] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(aws_backtrace_print+0x28) [0xffff8c35712c] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(+0x7f1c8) [0xffff8c12f1c8] linux-vdso.so.1(__kernel_rt_sigreturn+0) [0xffff8e2fb7dc] /lib/aarch64-linux-gnu/libgcc_s.so.1(+0xd4d8) [0xffff8dfad4d8] /lib/aarch64-linux-gnu/libgcc_s.so.1(_Unwind_Backtrace+0x78) [0xffff8dfaec28] /lib/aarch64-linux-gnu/libc.so.6(__backtrace+0x68) [0xffff8dec30d8] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(aws_backtrace+0x18) [0xffff8c356f18] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(aws_backtrace_print+0x4c) [0xffff8c357150] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(+0x7f1c8) [0xffff8c12f1c8] linux-vdso.so.1(__kernel_rt_sigreturn+0) [0xffff8e2fb7dc] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(aws_backtrace_print+0x28) [0xffff8c35712c] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(+0x7f1c8) [0xffff8c12f1c8] linux-vdso.so.1(__kernel_rt_sigreturn+0) [0xffff8e2fb7dc] /lib/aarch64-linux-gnu/libgcc_s.so.1(+0xd4d8) [0xffff8dfad4d8] /lib/aarch64-linux-gnu/libgcc_s.so.1(_Unwind_Backtrace+0x78) [0xffff8dfaec28] /lib/aarch64-linux-gnu/libc.so.6(__backtrace+0x68) [0xffff8dec30d8] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(aws_backtrace+0x18) [0xffff8c356f18] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(aws_backtrace_print+0x4c) [0xffff8c357150] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(+0x7f1c8) [0xffff8c12f1c8] linux-vdso.so.1(__kernel_rt_sigreturn+0) [0xffff8e2fb7dc] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(aws_backtrace_print+0x28) [0xffff8c35712c] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(+0x7f1c8) [0xffff8c12f1c8] linux-vdso.so.1(__kernel_rt_sigreturn+0) [0xffff8e2fb7dc] /lib/aarch64-linux-gnu/libgcc_s.so.1(+0xd4d8) [0xffff8dfad4d8] /lib/aarch64-linux-gnu/libgcc_s.so.1(_Unwind_Backtrace+0x78) [0xffff8dfaec28] /lib/aarch64-linux-gnu/libc.so.6(__backtrace+0x68) [0xffff8dec30d8] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(aws_backtrace+0x18) [0xffff8c356f18] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(aws_backtrace_print+0x4c) [0xffff8c357150] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(+0x7f1c8) [0xffff8c12f1c8] linux-vdso.so.1(__kernel_rt_sigreturn+0) [0xffff8e2fb7dc] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(aws_backtrace_print+0x28) [0xffff8c35712c] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(+0x7f1c8) [0xffff8c12f1c8] linux-vdso.so.1(__kernel_rt_sigreturn+0) [0xffff8e2fb7dc] /lib/aarch64-linux-gnu/libgcc_s.so.1(+0xd4d8) [0xffff8dfad4d8] /lib/aarch64-linux-gnu/libgcc_s.so.1(_Unwind_Backtrace+0x78) [0xffff8dfaec28] /lib/aarch64-linux-gnu/libc.so.6(__backtrace+0x68) [0xffff8dec30d8] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(aws_backtrace+0x18) [0xffff8c356f18] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(aws_backtrace_print+0x4c) [0xffff8c357150] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(+0x7f1c8) [0xffff8c12f1c8] linux-vdso.so.1(__kernel_rt_sigreturn+0) [0xffff8e2fb7dc] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(aws_backtrace_print+0x28) [0xffff8c35712c] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(+0x7f1c8) [0xffff8c12f1c8] linux-vdso.so.1(__kernel_rt_sigreturn+0) [0xffff8e2fb7dc] /lib/aarch64-linux-gnu/libgcc_s.so.1(+0xd4d8) [0xffff8dfad4d8] /lib/aarch64-linux-gnu/libgcc_s.so.1(_Unwind_Backtrace+0x78) [0xffff8dfaec28] /lib/aarch64-linux-gnu/libc.so.6(__backtrace+0x68) [0xffff8dec30d8] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(aws_backtrace+0x18) [0xffff8c356f18] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(aws_backtrace_print+0x4c) [0xffff8c357150] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(+0x7f1c8) [0xffff8c12f1c8] linux-vdso.so.1(__kernel_rt_sigreturn+0) [0xffff8e2fb7dc] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(aws_backtrace_print+0x28) [0xffff8c35712c] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(+0x7f1c8) [0xffff8c12f1c8] linux-vdso.so.1(__kernel_rt_sigreturn+0) [0xffff8e2fb7dc] /lib/aarch64-linux-gnu/libgcc_s.so.1(+0xd4d8) [0xffff8dfad4d8] /lib/aarch64-linux-gnu/libgcc_s.so.1(_Unwind_Backtrace+0x78) [0xffff8dfaec28] /lib/aarch64-linux-gnu/libc.so.6(__backtrace+0x68) [0xffff8dec30d8] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(aws_backtrace+0x18) [0xffff8c356f18] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(aws_backtrace_print+0x4c) [0xffff8c357150] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(+0x7f1c8) [0xffff8c12f1c8] linux-vdso.so.1(__kernel_rt_sigreturn+0) [0xffff8e2fb7dc] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(aws_backtrace_print+0x28) [0xffff8c35712c] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(+0x7f1c8) [0xffff8c12f1c8] linux-vdso.so.1(__kernel_rt_sigreturn+0) [0xffff8e2fb7dc] /lib/aarch64-linux-gnu/libgcc_s.so.1(+0xd4d8) [0xffff8dfad4d8] /lib/aarch64-linux-gnu/libgcc_s.so.1(_Unwind_Backtrace+0x78) [0xffff8dfaec28] /lib/aarch64-linux-gnu/libc.so.6(__backtrace+0x68) [0xffff8dec30d8] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(aws_backtrace+0x18) [0xffff8c356f18] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(aws_backtrace_print+0x4c) [0xffff8c357150] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(+0x7f1c8) [0xffff8c12f1c8] linux-vdso.so.1(__kernel_rt_sigreturn+0) [0xffff8e2fb7dc] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(aws_backtrace_print+0x28) [0xffff8c35712c] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(+0x7f1c8) [0xffff8c12f1c8] linux-vdso.so.1(__kernel_rt_sigreturn+0) [0xffff8e2fb7dc] /lib/aarch64-linux-gnu/libgcc_s.so.1(+0xd4d8) [0xffff8dfad4d8] /lib/aarch64-linux-gnu/libgcc_s.so.1(_Unwind_Backtrace+0x78) [0xffff8dfaec28] /lib/aarch64-linux-gnu/libc.so.6(__backtrace+0x68) [0xffff8dec30d8] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(aws_backtrace+0x18) [0xffff8c356f18] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(aws_backtrace_print+0x4c) [0xffff8c357150] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(+0x7f1c8) [0xffff8c12f1c8] linux-vdso.so.1(__kernel_rt_sigreturn+0) [0xffff8e2fb7dc] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(aws_backtrace_print+0x28) [0xffff8c35712c] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(+0x7f1c8) [0xffff8c12f1c8] linux-vdso.so.1(__kernel_rt_sigreturn+0) [0xffff8e2fb7dc] /lib/aarch64-linux-gnu/libgcc_s.so.1(+0xd4d8) [0xffff8dfad4d8] /lib/aarch64-linux-gnu/libgcc_s.so.1(_Unwind_Backtrace+0x78) [0xffff8dfaec28] /lib/aarch64-linux-gnu/libc.so.6(__backtrace+0x68) [0xffff8dec30d8] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(aws_backtrace+0x18) [0xffff8c356f18] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(aws_backtrace_print+0x4c) [0xffff8c357150] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(+0x7f1c8) [0xffff8c12f1c8] linux-vdso.so.1(__kernel_rt_sigreturn+0) [0xffff8e2fb7dc] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(aws_backtrace_print+0x28) [0xffff8c35712c] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(+0x7f1c8) [0xffff8c12f1c8] linux-vdso.so.1(__kernel_rt_sigreturn+0) [0xffff8e2fb7dc] /lib/aarch64-linux-gnu/libgcc_s.so.1(+0xd4d8) [0xffff8dfad4d8] /lib/aarch64-linux-gnu/libgcc_s.so.1(_Unwind_Backtrace+0x78) [0xffff8dfaec28] /lib/aarch64-linux-gnu/libc.so.6(__backtrace+0x68) [0xffff8dec30d8] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(aws_backtrace+0x18) [0xffff8c356f18] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(aws_backtrace_print+0x4c) [0xffff8c357150] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(+0x7f1c8) [0xffff8c12f1c8] linux-vdso.so.1(__kernel_rt_sigreturn+0) [0xffff8e2fb7dc] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(aws_backtrace_print+0x28) [0xffff8c35712c] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(+0x7f1c8) [0xffff8c12f1c8] linux-vdso.so.1(__kernel_rt_sigreturn+0) [0xffff8e2fb7dc] /lib/aarch64-linux-gnu/libgcc_s.so.1(+0xd4d8) [0xffff8dfad4d8] ``` ```log Web socket server connection We connected to the central server! releasing port: 11000 PORT 11000 RELEASED Central connection socket once message [UUID: 043e3c60-8894-4a2e-8510-5fdb17ac9294] CENTRAL SERVER: {"deviceName":"0000-0000","testId":2249} [SN: 0000-0000] Trying with IP 192.168.1.181 and Port 11000 [SN: 0000-0000] Disconnected from central server. Stopping active script. Disconnect info: Code: 4300 Reason: 'Client requested close' [SN: 0000-0000] Stopping everything! [SN: 0000-0000] gcodeGrpcClient is nullish, skipping stopEverything [SN: 0000-0000] Stopping active script [SN: 0000-0000] gcodeGrpcClient is nullish, skipping stopActiveScript [SN: 0000-0000] Sending instruction 'A0 B0;...' to instrument. [SN: 0000-0000] cadSystemGcodeClient is nullish, skipping stopEverything [SN: 0000-0000] Sending instruction 'A0 V0;...' to instrument. [SN: 0000-0000] cadSystemGcodeClient is nullish, skipping stopEverything [SN: 0000-0000] Display Led Progress 100 [SN: 0000-0000] cadSystemClient is nullish, skipping stopEverything [SN: 0000-0000] Trying to unsubscribe. Fatal error condition occurred in /aws-crt-nodejs/crt/aws-c-common/source/allocator.c:210: allocator->mem_release != ((void *)0) Exiting Application We stopped everything! ################################################################################ Stack trace: ################################################################################ /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(aws_backtrace_print+0x4c) [0xffff98737150] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(aws_fatal_assert+0x50) [0xffff9872a870] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(aws_mem_release+0x44) [0xffff9872a0b4] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(aws_mqtt_topic_tree_transaction_commit+0x5c4) [0xffff98552554] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(+0xb7348) [0xffff98547348] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(+0xbcf24) [0xffff9854cf24] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(+0xfe334) [0xffff9858e334] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(+0x2a98b4) [0xffff987398b4] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(+0x107f58) [0xffff98597f58] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(+0x2a7b78) [0xffff98737b78] /lib/aarch64-linux-gnu/libc.so.6(+0x7d5c8) [0xffff9b24d5c8] /lib/aarch64-linux-gnu/libc.so.6(+0xe5d1c) [0xffff9b2b5d1c] Signal received: 1852270931, errno: 1914727521 ################################################################################ Stack trace: ################################################################################ /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(aws_backtrace_print+0x4c) [0xffff98737150] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(+0x7f1c8) [0xffff9850f1c8] linux-vdso.so.1(__kernel_rt_sigreturn+0) [0xffff9b6f17dc] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(aws_backtrace_print+0x28) [0xffff9873712c] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(+0x7f1c8) [0xffff9850f1c8] linux-vdso.so.1(__kernel_rt_sigreturn+0) [0xffff9b6f17dc] /lib/aarch64-linux-gnu/libc.so.6(+0x7f200) [0xffff9b24f200] /lib/aarch64-linux-gnu/libc.so.6(raise+0x1c) [0xffff9b20a67c] /lib/aarch64-linux-gnu/libc.so.6(abort+0xe4) [0xffff9b1f7130] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(+0x29a874) [0xffff9872a874] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(aws_mem_release+0x44) [0xffff9872a0b4] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(aws_mqtt_topic_tree_transaction_commit+0x5c4) [0xffff98552554] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(+0xb7348) [0xffff98547348] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(+0xbcf24) [0xffff9854cf24] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(+0xfe334) [0xffff9858e334] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(+0x2a98b4) [0xffff987398b4] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(+0x107f58) [0xffff98597f58] /home/ubuntu/companion-instrument-server/node_modules/aws-crt/dist/bin/linux-arm64/aws-crt-nodejs.node(+0x2a7b78) [0xffff98737b78] /lib/aarch64-linux-gnu/libc.so.6(+0x7d5c8) [0xffff9b24d5c8] /lib/aarch64-linux-gnu/libc.so.6(+0xe5d1c) [0xffff9b2b5d1c] ```

P.P.S. Since I'm on my machine now I can get debug symbols working in GDB, here is a symbolocated trace

Thread 8 "AwsEventLoop 1" received signal SIGABRT, Aborted.
0x00007ffff6f6a26c in ?? () from /usr/lib/libc.so.6
(gdb) where
#0  0x00007ffff6f6a26c in ?? () from /usr/lib/libc.so.6
#1  0x00007ffff6f1aa08 in raise () from /usr/lib/libc.so.6
#2  0x00007ffff6f03538 in abort () from /usr/lib/libc.so.6
#3  0x00007fffee1ec1f1 in aws_fatal_assert (cond_str=0x7fffee258de8 "new_topic_filter != old_topic_filter", 
    file=0x7fffee258a80 "/home/tagho/aws-crt-nodejs/crt/aws-c-mqtt/source/topic_tree.c", line=497) at /home/tagho/aws-crt-nodejs/crt/aws-c-common/source/assert.c:17
#4  0x00007fffede02b73 in s_topic_tree_action_commit (action=0x7fffedd044d0, tree=0x55555852f758) at /home/tagho/aws-crt-nodejs/crt/aws-c-mqtt/source/topic_tree.c:497
#5  0x00007fffede038a6 in aws_mqtt_topic_tree_transaction_commit (tree=0x55555852f758, transaction=0x7fffedd045c0) at /home/tagho/aws-crt-nodejs/crt/aws-c-mqtt/source/topic_tree.c:796
#6  0x00007fffeddf6cb8 in s_unsubscribe_send (packet_id=2, is_first_attempt=true, userdata=0x555558a0e220) at /home/tagho/aws-crt-nodejs/crt/aws-c-mqtt/source/client.c:2785
#7  0x00007fffeddfbc23 in s_request_outgoing_task (task=0x555558606790, arg=0x555558606770, status=AWS_TASK_STATUS_RUN_READY)
    at /home/tagho/aws-crt-nodejs/crt/aws-c-mqtt/source/client_channel_handler.c:793
#8  0x00007fffede69314 in s_schedule_cross_thread_tasks (task=0x7fffdc016af0, arg=0x7fffdc016930, status=AWS_TASK_STATUS_RUN_READY)
    at /home/tagho/aws-crt-nodejs/crt/aws-c-io/source/channel.c:522
#9  0x00007fffee20d9fb in aws_task_run (task=0x7fffdc016af0, status=AWS_TASK_STATUS_RUN_READY) at /home/tagho/aws-crt-nodejs/crt/aws-c-common/source/task_scheduler.c:44
#10 0x00007fffee20e294 in s_run_all (scheduler=0x5555582e0ae0, current_time=12031077945719, status=AWS_TASK_STATUS_RUN_READY)
    at /home/tagho/aws-crt-nodejs/crt/aws-c-common/source/task_scheduler.c:249
#11 0x00007fffee20e0b7 in aws_task_scheduler_run_all (scheduler=0x5555582e0ae0, current_time=12031077945719) at /home/tagho/aws-crt-nodejs/crt/aws-c-common/source/task_scheduler.c:188
#12 0x00007fffede7b22f in aws_event_loop_thread (args=0x5555581311d0) at /home/tagho/aws-crt-nodejs/crt/aws-c-io/source/linux/epoll_event_loop.c:650
#13 0x00007fffee20736b in thread_fn (arg=0x5555582e0c90) at /home/tagho/aws-crt-nodejs/crt/aws-c-common/source/posix/thread.c:177
#14 0x00007ffff6f6844b in ?? () from /usr/lib/libc.so.6
#15 0x00007ffff6febe40 in ?? () from /usr/lib/libc.so.6
(gdb) up 3
#3  0x00007fffee1ec1f1 in aws_fatal_assert (cond_str=0x7fffee258de8 "new_topic_filter != old_topic_filter", 
    file=0x7fffee258a80 "/home/tagho/aws-crt-nodejs/crt/aws-c-mqtt/source/topic_tree.c", line=497) at /home/tagho/aws-crt-nodejs/crt/aws-c-common/source/assert.c:17
17          abort();
(gdb) info locals
No locals.
(gdb) up
#4  0x00007fffede02b73 in s_topic_tree_action_commit (action=0x7fffedd044d0, tree=0x55555852f758) at /home/tagho/aws-crt-nodejs/crt/aws-c-mqtt/source/topic_tree.c:497
497                                     AWS_ASSERT(new_topic_filter != old_topic_filter);
(gdb) info locals
i = 5
new_topic_filter = 0x7fffdc0404e0
old_topic_filter = 0x7fffdc0404e0
parent = 0x7fffdc03ea30
topic_offset = 29
destroy_current = false
nodes_left = 5
current = 0x7fffdc03ea30
sub_parts_len = 5

# Here I jump past the debug assertion as if the code were not compiled in debug mode and I get the same segfault as before

(gdb) j /home/tagho/aws-crt-nodejs/crt/aws-c-mqtt/source/topic_tree.c:500
Continuing at 0x7fffede02b73.
[Thread 0x7fffd4ffe6c0 (LWP 520) exited]

Thread 8 "AwsEventLoop 1" received signal SIGSEGV, Segmentation fault.
0x00007fffee20bd7d in aws_string_is_valid (str=0xe9c3470f00000000) at /home/tagho/aws-crt-nodejs/crt/aws-c-common/include/aws/common/string.inl:35
35          return str && AWS_MEM_IS_READABLE(&str->bytes[0], str->len + 1) && str->bytes[str->len] == 0;
(gdb) where
#0  0x00007fffee20bd7d in aws_string_is_valid (str=0xe9c3470f00000000) at /home/tagho/aws-crt-nodejs/crt/aws-c-common/include/aws/common/string.inl:35
#1  0x00007fffee20c0ca in aws_string_destroy (str=0xe9c3470f00000000) at /home/tagho/aws-crt-nodejs/crt/aws-c-common/source/string.c:227
#2  0x00007fffede02b83 in s_topic_tree_action_commit (action=0x0, tree=0x0) at /home/tagho/aws-crt-nodejs/crt/aws-c-mqtt/source/topic_tree.c:500
#3  0x00007fffee1ec1a9 in aws_fatal_assert (cond_str=0x7fffee258de8 "new_topic_filter != old_topic_filter", 
    file=0x7fffee258a80 "/home/tagho/aws-crt-nodejs/crt/aws-c-mqtt/source/topic_tree.c", line=497) at /home/tagho/aws-crt-nodejs/crt/aws-c-common/source/assert.c:14
#4  0x00007fffede02b73 in s_topic_tree_action_commit (action=0x7fffedd044d0, tree=0x55555853e1e8) at /home/tagho/aws-crt-nodejs/crt/aws-c-mqtt/source/topic_tree.c:497
#5  0x00007fffede038a6 in aws_mqtt_topic_tree_transaction_commit (tree=0x55555853e1e8, transaction=0x7fffedd045c0) at /home/tagho/aws-crt-nodejs/crt/aws-c-mqtt/source/topic_tree.c:796
#6  0x00007fffeddf6cb8 in s_unsubscribe_send (packet_id=2, is_first_attempt=true, userdata=0x555557b43500) at /home/tagho/aws-crt-nodejs/crt/aws-c-mqtt/source/client.c:2785
#7  0x00007fffeddfbc23 in s_request_outgoing_task (task=0x55555853c9f0, arg=0x55555853c9d0, status=AWS_TASK_STATUS_RUN_READY)
    at /home/tagho/aws-crt-nodejs/crt/aws-c-mqtt/source/client_channel_handler.c:793
#8  0x00007fffede69314 in s_schedule_cross_thread_tasks (task=0x7fffdc016af0, arg=0x7fffdc016930, status=AWS_TASK_STATUS_RUN_READY)
    at /home/tagho/aws-crt-nodejs/crt/aws-c-io/source/channel.c:522
#9  0x00007fffee20d9fb in aws_task_run (task=0x7fffdc016af0, status=AWS_TASK_STATUS_RUN_READY) at /home/tagho/aws-crt-nodejs/crt/aws-c-common/source/task_scheduler.c:44
#10 0x00007fffee20e294 in s_run_all (scheduler=0x5555582f0150, current_time=91698147246, status=AWS_TASK_STATUS_RUN_READY)
    at /home/tagho/aws-crt-nodejs/crt/aws-c-common/source/task_scheduler.c:249
#11 0x00007fffee20e0b7 in aws_task_scheduler_run_all (scheduler=0x5555582f0150, current_time=91698147246) at /home/tagho/aws-crt-nodejs/crt/aws-c-common/source/task_scheduler.c:188
#12 0x00007fffede7b22f in aws_event_loop_thread (args=0x555558141520) at /home/tagho/aws-crt-nodejs/crt/aws-c-io/source/linux/epoll_event_loop.c:650
#13 0x00007fffee20736b in thread_fn (arg=0x5555582f0300) at /home/tagho/aws-crt-nodejs/crt/aws-c-common/source/posix/thread.c:177
#14 0x00007ffff6f6844b in ?? () from /usr/lib/libc.so.6
#15 0x00007ffff6febe40 in ?? () from /usr/lib/libc.so.6

Which means it looks like our problem is indeed here:

if (!new_topic_filter) {
    /* Set new_tf to old_tf so it's easier to check against the existing node.
     * Basically, it's an INOUT param. */
    new_topic_filter = old_topic_filter;

    /* Search all subtopics until we find one that isn't current. */
    aws_hash_table_foreach(
        &parent->subtopics, s_topic_node_string_finder, (void *)&new_topic_filter);

    /* This would only happen if there is only one topic in subtopics (current's) and
     * it has no children (in which case it should have been removed above). */
    AWS_ASSERT(new_topic_filter != old_topic_filter); // ***** THIS ASSERTION IS FAILING

    /* Now that the new string has been found, the old one can be destroyed. */
    aws_string_destroy((void *)current->topic_filter);
    current->owns_topic_filter = false;
}
jthoward64 commented 1 year ago

That extra info should give me a good place to focus my investigation on this, I'm thinking it might have something to do with this being the only topic in memory at a particular point? Could it have something to do with our client config?

  subscribeClientConfig =
    iot.AwsIotMqttConnectionConfigBuilder.new_with_websockets({
      region: AWS_REGION,
      credentials_provider,
    })
      .with_client_id(`ec2-test-sub-${Date.now() + 1}`)
      .with_endpoint(AWS_IOT_ENDPOINT)
      .with_clean_session(true)
      // .with_will(new mqtt.MqttWill(`$aws/things/${client_id}/shadow/update`, 0, ''))
      // .with_keep_alive_seconds(30)
      .build();

Or is this more likely a race condition/obscure error somewhere?

jthoward64 commented 1 year ago

Seems like we found the root cause in our code. The crash happens when we subscribe to $aws/things/${serialNumber}/shadow/update/accepted and unsubscribe from $aws/things/${serialNumber}/shadow/update. When we unsub from $aws/things/${serialNumber}/shadow/update/+ instead it seems to fix the issue. I think this stems from an assumption the underlying code makes about parent/child topic relations but I'm not sure. Probably the best fixes would be either better error messaging for it or adding some extra checks.

xiazhvera commented 1 year ago

Seems like we found the root cause in our code. The crash happens when we subscribe to $aws/things/${serialNumber}/shadow/update/accepted and unsubscribe from $aws/things/${serialNumber}/shadow/update. When we unsub from $aws/things/${serialNumber}/shadow/update/+ instead it seems to fix the issue. I think this stems from an assumption the underlying code makes about parent/child topic relations but I'm not sure. Probably the best fixes would be either better error messaging for it or adding some extra checks.

Thank you so much for diving into the root cause!! It seems that a better error handling should solve the issue. I'm currently looking into it on SDK side. I will update any further progress here.

xiazhvera commented 1 year ago

Hi @jthoward64, again thank you so much for helping us find the issue! The fix is in code review and will take couple days to release. You can checkout the fix at https://github.com/awslabs/aws-c-mqtt/pull/297 and test locally if you are interested.

Meanwhile, the issue happens when we tries to unsubscribe from a non-subscribed parent topic. Before the release is out, you can avoid the issue by only unsubscribe the topics you have subscribed. As an example:

Subscribe("test/parent/child")
Subscribe("test/parent/#")
Unsubscribe("test/parent/child")    // valid
Unsubscribe("test/parent/#")         // valid

Unsubscribe("test/parent")           // <= seg fault here

Besides, unfortunately, unsubscribe is doing "character-by-character" string match on service side (Mqtt311 spec [MQTT-3.10.4-1] ), which means you could not unsubscribe all child-topics at once by using the wildcard. You can only use wildcard to unsubscribe if you have subscribed to a topic with wildcard.

An Invalid unsubscribe Example with wildcard

// ExampleA 
Subscribe("test/a") 
Subscribe("test/b") 
Unsubscribe("test/#").               // Though the unsubscribe will succeed, the server will do nothing.
                                     // You will NOT unsubscribe from either "test/a" or "test/b". 
                                     // Please note the fix we updated will only fix the segfault here, 
                                     // it will not change the server behavior

A valid unsubscribe example with wildcard

Subscribe("test/#")                    // You should be able to receive publish on topic "test/a" or "test/b" or any child topic of "test/"
Subscribe("test/a")
Unsubscribe("test/#)"                  // You will no longer receive publish on other "test/***" but you will still get message for "test/a"

I will update here when the fix is released. Feel free to let me know if you have any further questions.

bretambrose commented 1 year ago

I believe this has been addressed in v1.15.18 of the CRT and/or v1.13.2 of the v2 SDK.