ewpa / LibSSH-ESP32

Libssh SSH client & server port to ESP32 Arduino library
https://www.ewan.cc/node/157

A stack overflow in task loopTask has been detected #26

Closed: AleksandrBraun closed 1 year ago

AleksandrBraun commented 1 year ago

Hi all. I'm trying to learn this library for my ESP32 project, in which I want to send a command to a Debian server to shut it down remotely. So far I am stuck at the stage of connecting to the server. I have created two functions. The first one, taken from the author's tutorial, creates a new session, and that stage goes well. The second one performs the actual connection: I can see the process start in the terminal, but then it crashes and the controller reboots with the error ***ERROR*** A stack overflow in task loopTask has been detected. The line rc = ssh_connect( my_ssh_session ); is what causes the crash. Has anyone had the same problem, and how can it be fixed?

Here is the complete log, from entering the command to the failure.

SERIAL INPUT:   ssh         // This is my terminal command to start session and connect
Create session success.     // This is my comment in the terminal about the start of the session
[1970/01/01 00:00:05.460373, 2] ssh_connect:  libssh 0.10.4 (c) 2003-2022 Aris Adamantiadis, Andreas Schneider and libssh contributors. Distributed under the LGPL, please refer to COPYING file for information about your rights, using threading threads_noop
[1970/01/01 00:00:05.475210, 2] ssh_socket_connect:  Nonblocking connection socket: 54
[1970/01/01 00:00:05.479886, 2] ssh_connect:  Socket connecting, now waiting for the callbacks to work
[1970/01/01 00:00:05.500125, 1] socket_callback_connected:  Socket connection callback: 1 (0)
[1970/01/01 00:00:05.528570, 2] ssh_client_connection_callback:  SSH server banner: SSH-2.0-OpenSSH_7.9p1 Debian-10+deb10u2
[1970/01/01 00:00:05.528825, 2] ssh_analyze_banner:  Analyzing banner: SSH-2.0-OpenSSH_7.9p1 Debian-10+deb10u2
[1970/01/01 00:00:05.538287, 2] ssh_analyze_banner:  We are talking to an OpenSSH server version: 7.9 (70900)
***ERROR*** A stack overflow in task loopTask has been detected.
abort() was called at PC 0x40088a38 on core 1

ELF file SHA256: 0000000000000000

Backtrace: 0x400887a4:0x3ffaf740 0x40088a21:0x3ffaf760 0x40088a38:0x3ffaf780 0x4008a61b:0x3ffaf7a0 0x4008c1e4:0x3ffaf7c0 0x4008c19a:0x00000000

Rebooting...

Thank you for your answer.

P.S. I am using an ESP32 WROOM DevKit v1 with 4 MB (32 Mbit) of flash, if it matters.

playmiel commented 1 year ago

"***ERROR*** A stack overflow in task loopTask has been detected" is usually caused by overloading void loop() or by blocking a core so that the loop cannot run. I need more details about your code to help you.

AleksandrBraun commented 1 year ago

There is no code in my main loop other than a check for incoming data on the UART. Perhaps I do not understand something about the process of connecting to the server. At the moment I connect to it with PuTTY: I enter the IP address and port 22, a window opens and prompts for a login, and then it asks for a password. After that the SSH prompt is displayed and I can enter commands. How do connection and authorization work in this library? It doesn't even get as far as authorization; everything breaks on connection.

ewpa commented 1 year ago

Firstly please retry with libssh verbosity turned off. All the debug printfs use a lot of stack space. The next step would be to either change the loop task stack size or -- more simply -- put your code into a new task with a larger stack. Let me know how you get on without the debug output.
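
The second suggestion, moving the SSH work into its own task with a larger stack, could be sketched roughly as follows. This is a minimal sketch and not the maintainer's code: sshTask, startSshTask and kSshTaskStackBytes are hypothetical names, createConnection() is the function from the sketch under discussion, and the 50 kB stack size is simply the figure reported to work later in this thread. On ESP32 the stack size passed to xTaskCreatePinnedToCore is given in bytes.

```cpp
#include <cstdint>

// Stack budget for the SSH task, in bytes.
// Assumption: 50 kB, the figure reported to work later in this thread.
constexpr std::uint32_t kSshTaskStackBytes = 50 * 1024;

#if defined(ARDUINO)
#include <Arduino.h>  // on ESP32 this also pulls in the FreeRTOS task API

void createConnection(void);  // defined elsewhere in the sketch

// Task body: run the SSH work once, then delete this task.
static void sshTask(void *arg) {
    (void)arg;
    createConnection();
    vTaskDelete(NULL);  // a FreeRTOS task function must never return
}

// Hypothetical entry point: call this instead of createConnection().
bool startSshTask() {
    BaseType_t ok = xTaskCreatePinnedToCore(
        sshTask,               // task function
        "ssh",                 // task name (shown in crash messages)
        kSshTaskStackBytes,    // stack size in bytes
        NULL,                  // no parameters
        tskIDLE_PRIORITY + 1,  // low priority, above idle
        NULL,                  // task handle not needed
        1);                    // core 1, where loopTask ran in the log above
    return ok == pdPASS;
}
#endif
```

In serialParse(), the call to createConnection() would then be replaced by startSshTask(), so that loopTask only parses the serial input and the SSH handshake runs on the new task's stack.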

AleksandrBraun commented 1 year ago

Thank you, ewpa, for your answer. Thank you for your work; I can imagine how much time was spent on such a product... :) I created a minimal project to test the library step by step. I always do this when learning something new: I go from simple to complex, and each stage prints the result of the previous stage to the terminal. Below are the code snippets I have already worked through, and the one I am stuck on.

This is my standard debug block, which I insert into all my projects. It gives me shorthand for printing data to the terminal, and I can turn it off with the preprocessor. The DPRINT, DPRINTLN and DPRINTF calls in the code below come from it.

/////// Debug construction ///////
#define DEBUG   1
#if DEBUG
#define DPRINT(...) Serial.print(__VA_ARGS__)
#define DPRINTLN(...)   Serial.println(__VA_ARGS__)
#define DPRINTF(...)    Serial.printf(__VA_ARGS__)
#else
#define DPRINT(...)
#define DPRINTLN(...)
#define DPRINTF(...)
#endif //DEBUG
/////// END Debug construction ///////

Next come the library includes, including yours, and your variables (which I have temporarily made global).

#include <WiFi.h>
#include "SPIFFS.h"
/////// SSH //////
#include "libssh_esp32.h"
#include <libssh/libssh.h>
ssh_session my_ssh_session;
int verbosity = SSH_LOG_PROTOCOL;
int port = 22, rc = -1;
/////// SSH //////

The setup block, nothing more:

void setup() {

#if DEBUG
    Serial.begin( 115200 );
#endif
    WiFi.mode( WIFI_STA );
    WiFi.begin( "ssid_name", "net_password");
    while ( WiFi.status() != WL_CONNECTED ) {
        DPRINT( F( "." ) );
        delay( 250 );
    }
    DPRINTLN();

    // Initialize SPIFFS
    if ( !SPIFFS.begin() ) {
        DPRINTLN( "An Error has occurred while mounting SPIFFS" );
        SPIFFS.format();
    }

    DPRINT( "IP Address: " );
    DPRINTLN( WiFi.localIP() );
}

In the loop, I listen on the serial port for the "ssh" command and, once I receive it, start the connection process.

void loop() {

#if DEBUG
    if ( Serial.available() > 0 ) {
        serialParse();
    }
#endif

    delay( 10 );
}

void serialParse( void ) {

    String str = "";
    char c;
    while ( Serial.available() > 0 ) {
        c = char( Serial.read() );
        if ( ( c != '\n' ) && ( c != '\r' ) ) {
            str += c;
            delay( 2 );
        }
        if ( c == '\n' ) Serial.read();
        if ( c == '\r' ) Serial.read();
    }

    if ( str == "" ) return;
    DPRINTF( "SERIAL INPUT:\t%s\r\n", str.c_str() );

    if ( str == "ssh" ) {
        createConnection();
        return;
    }

    DPRINTLN( F( "INCORRECT INPUT" ) );

}

Next are your functions from the examples on GitHub.

void createConnection( void ) {

    my_ssh_session = ssh_new();
    if ( my_ssh_session == NULL ) {
        DPRINTLN( "No session" );
        exit( -1 );
    }

    ssh_options_set( my_ssh_session, SSH_OPTIONS_HOST, "192.168.0.222" );
    ssh_options_set( my_ssh_session, SSH_OPTIONS_LOG_VERBOSITY, &verbosity );
    ssh_options_set( my_ssh_session, SSH_OPTIONS_PORT, &port );

    DPRINTLN( "Create session success" );
    DPRINTLN( "Try connect to server" );
    connectToServer( my_ssh_session );
    DPRINTLN( "Connect to server success" );

    DPRINTLN( "Try verify host" );
    verify_knownhost( my_ssh_session );
    DPRINTLN( "Verify host success" );

    ssh_disconnect( my_ssh_session );
    DPRINTLN( "Disconnect from server success" );
    ssh_free( my_ssh_session );
    DPRINTLN( "Free resource" );

}

void connectToServer( ssh_session my_ssh_session ) {

    rc = ssh_connect( my_ssh_session );     // <- the controller reboot happens here.
    if ( rc != SSH_OK ) {
        fprintf( stderr, "Error connecting to %s: %s\n",
            "192.168.0.222",
            ssh_get_error( my_ssh_session ) );
        DPRINTLN( "Connect fault." );
        ssh_free( my_ssh_session );
        exit( -1 );
    }

}

int verify_knownhost( ssh_session session ) {

    enum ssh_known_hosts_e state;
    unsigned char* hash = NULL;
    ssh_key srv_pubkey = NULL;
    size_t hlen;

    DPRINTLN( "Get server pub key" );
    rc = ssh_get_server_publickey( session, &srv_pubkey );
    if ( rc < 0 ) {
        return -1;
    }

    DPRINTLN( "Get server pub hash" );
    rc = ssh_get_publickey_hash( srv_pubkey,
                 SSH_PUBLICKEY_HASH_SHA1,
                 &hash,
                 &hlen );
    ssh_key_free( srv_pubkey );

    DPRINTLN( "SSH key free" );
    // NOTE: srv_pubkey was already freed a few lines above; this second call frees it again.
    ssh_key_free( srv_pubkey );
    if ( rc < 0 ) {
        return -1;
    }

    state = ssh_session_is_known_server( session );

    DPRINTF( "State:\t%d\n", state );

    ssh_clean_pubkey_hash( &hash );
    return 0;
}

I'm probably asking a lot, but I would be very grateful for help with this library. Thank you very much.

P.S. Yes, I disabled verbose logging. It did not change the behavior of the program, but now I can no longer see at which stage the crash occurs. It just reboots.

ewpa commented 1 year ago

Hi Alex,

I have increased the stack size to 50kB and the code moves past that error now. Please refer to branch github-issue-26 where I have pushed an example named overflow.ino with the fix.

It does error much later in the code, so over to you now.


Create session success
Try connect to server
[1970/01/01 00:00:16.982635, 2] ssh_connect:  libssh 0.10.4 (c) 2003-2022 Aris Adamantiadis, Andreas Schneider and libssh contributors. Distributed under the LGPL, please refer to COPYING file for information about your rights, using threading threads_noop
[1970/01/01 00:00:16.996927, 2] ssh_socket_connect:  Nonblocking connection socket: 48
[1970/01/01 00:00:17.002126, 2] ssh_connect:  Socket connecting, now waiting for the callbacks to work
[1970/01/01 00:00:17.011995, 1] socket_callback_connected:  Socket connection callback: 1 (0)
[1970/01/01 00:00:17.021390, 2] ssh_client_connection_callback:  SSH server banner: SSH-2.0-OpenSSH_8.4p1 Debian-5+deb11u1
[1970/01/01 00:00:17.030103, 2] ssh_analyze_banner:  Analyzing banner: SSH-2.0-OpenSSH_8.4p1 Debian-5+deb11u1
[1970/01/01 00:00:17.039775, 2] ssh_analyze_banner:  We are talking to an OpenSSH server version: 8.4 (80400)
[1970/01/01 00:00:17.051768, 1] ssh_known_hosts_read_entries:  Failed to open the known_hosts file '/etc/ssh/ssh_known_hosts': No such file or directory
[1970/01/01 00:00:17.068227, 2] ssh_kex_select_methods:  Negotiated curve25519-sha256,ssh-ed25519,aes256-gcm@openssh.com,aes256-gcm@openssh.com,aead-gcm,aead-gcm,none,none,,
[1970/01/01 00:00:17.141083, 2] ssh_init_rekey_state:  Set rekey after 4294967296 blocks
[1970/01/01 00:00:17.141632, 2] ssh_init_rekey_state:  Set rekey after 4294967296 blocks
[1970/01/01 00:00:17.146284, 2] ssh_packet_client_curve25519_reply:  SSH_MSG_NEWKEYS sent
[1970/01/01 00:00:17.154017, 2] ssh_packet_newkeys:  Received SSH_MSG_NEWKEYS
[1970/01/01 00:00:17.180666, 2] ssh_packet_newkeys:  Signature verified and valid
Connect to server success
Try verify host
Get server pub key
Get server pub hash
SSH key free
CORRUPT HEAP: Bad head at 0x3ffd4c24. Expected 0xabba1234 got 0x3ffcb1cc

assert failed: multi_heap_free multi_heap_poisoning.c:253 (head != NULL)

Backtrace: 0x40083795:0x3fff85f0 0x4008d2d1:0x3fff8610 0x400928e1:0x3fff8630 0x40092553:0x3fff8760 0x40083c05:0x3fff8780 0x40092911:0x3fff87a0 0x400dbb6e:0x3fff87c0 0x400d30e6:0x3fff87e0 0x400d316b:0x3fff8810 0x400d3241:0x3fff8830 0x400d328f:0x3fff8870

Thanks,
Ewan.
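
A possible reading of the heap corruption in the log above: the posted verify_knownhost() calls ssh_key_free( srv_pubkey ) twice, once right after ssh_get_publickey_hash() and again after printing "SSH key free", and the crash follows that second call. Freeing the same key twice is a double free, which would be consistent with the CORRUPT HEAP message, although the thread does not confirm this as the cause. Below is a minimal sketch of a version that frees the key only once; freeOnce and verifyKnownHostOnce are hypothetical names, not part of libssh.

```cpp
// Hypothetical helper: release a handle once and null it, so that an
// accidental second call becomes a no-op instead of a double free.
template <typename T, typename FreeFn>
void freeOnce(T *&handle, FreeFn freeFn) {
    if (handle != nullptr) {
        freeFn(handle);
        handle = nullptr;
    }
}

#if defined(ARDUINO)
#include <Arduino.h>
#include "libssh_esp32.h"
#include <libssh/libssh.h>

// Same steps as the posted verify_knownhost(), with a single key release.
int verifyKnownHostOnce(ssh_session session) {
    ssh_key srv_pubkey = NULL;
    unsigned char *hash = NULL;
    size_t hlen = 0;

    if (ssh_get_server_publickey(session, &srv_pubkey) < 0) {
        return -1;
    }

    int rc = ssh_get_publickey_hash(srv_pubkey, SSH_PUBLICKEY_HASH_SHA1,
                                    &hash, &hlen);
    freeOnce(srv_pubkey, ssh_key_free);  // the key is released exactly once
    if (rc < 0) {
        return -1;
    }

    enum ssh_known_hosts_e state = ssh_session_is_known_server(session);
    Serial.printf("State:\t%d\n", state);

    ssh_clean_pubkey_hash(&hash);
    return 0;
}
#endif
```

Nulling the pointer after the free is a design choice: it turns a repeated release into a harmless no-op, which is easier to live with on a device that reboots on heap corruption.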

AleksandrBraun commented 1 year ago

Thanks a lot. I guessed that the connection process should take place in another thread, as a separate task. You are awesome. It's late today, I'll definitely try your version tomorrow morning.

AleksandrBraun commented 1 year ago

Hi ewpa, thank you again. Everything worked as soon as I realized that the process should run in a separate, parallel task. Connecting, authorizing and sending commands for execution were all successful.

The issue can be closed.