RIOT-OS / RIOT

RIOT - The friendly OS for IoT
https://riot-os.org
GNU Lesser General Public License v2.1

TCP lwIP Error Connecting Sock Problem on ESP32 #11910

Closed mmaxus35 closed 5 years ago

mmaxus35 commented 5 years ago

@gschorcht @PeterKietzmann I switched to lwIP and modified the official TCP example; the code is below. I am running it on an ESP32 802.11 board.

I have a TCP server written in Java; the code below is the TCP client. I also set my AP credentials in esp_wifi_params.h, just in case. I am sure that the IPv6 address is valid because the UDP example runs fine with the same address. However, the error is reported here:

if (sock_tcp_connect(&sock, &remote, 0, 0) < 0) { puts("Error connecting sock"); return 1; }

The board does connect to my access point; I am sure of that because I can ping the ESP32.

What could be the reason for this?

#include "net/af.h"
#include "net/ipv6/addr.h"
#include "net/sock/tcp.h"
#include "xtimer.h"
uint8_t buf[128];

/* import "ifconfig" shell command, used for printing addresses */
extern int _gnrc_netif_config(int argc, char **argv);

int main(void)
{

    /* print network addresses */
    puts("Configured network interfaces2:");
    _gnrc_netif_config(0, NULL);

    xtimer_sleep(5);
    sock_tcp_t sock;    
    printf("****TCP CLIENT*3*3*3*");
    int res;
    sock_tcp_ep_t remote = SOCK_IPV6_EP_ANY;
    remote.port = 12345;
    ipv6_addr_from_str((ipv6_addr_t *)&remote.addr,
                       "fe80::4c2c:2f99:9ae8:73cb");
    if (sock_tcp_connect(&sock, &remote, 0, 0) < 0) {
        puts("Error connecting sock");
        return 1;
    }
    puts("Sending \"Hello!\"");
    if ((res = sock_tcp_write(&sock, "Hello!", sizeof("Hello!"))) < 0) {
        puts("Errored on write");
    }
    else {
        if ((res = sock_tcp_read(&sock, &buf, sizeof(buf),
                                 SOCK_NO_TIMEOUT)) < 0) {
            puts("Disconnected");
        }
        printf("Read: \"");
        for (int i = 0; i < res; i++) {
            printf("%c", buf[i]);
        }
        puts("\"");
    }
    sock_tcp_disconnect(&sock);
    return res;
}
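
A side note on narrowing this down, as a minimal sketch: according to the sock API documentation, sock_tcp_connect() returns 0 on success and a negative errno value (for example -ECONNREFUSED, -ENETUNREACH or -ETIMEDOUT) on failure, so printing that value says more than a fixed message. The helper names sock_strerror() and format_sock_error() below are hypothetical and not part of RIOT; the code is plain C so it can be tried on a PC first.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of RIOT): turn the negative errno value
 * returned by a sock_* call into a readable string. */
static const char *sock_strerror(int res)
{
    if (res >= 0) {
        return "success";
    }
    return strerror(-res);
}

/* Format "<what> failed: <code> (<text>)" into `out`.
 * Returns the value reported by snprintf(). */
static int format_sock_error(char *out, size_t len, const char *what, int res)
{
    return snprintf(out, len, "%s failed: %d (%s)", what, res,
                    sock_strerror(res));
}
```

In the client above, the result of sock_tcp_connect(&sock, &remote, 0, 0) could be stored in an int and, when negative, passed to format_sock_error() before printing the buffer with puts().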

I have also added the necessary modules:

# name of your application
APPLICATION = tcp-simple-server
# If no BOARD is found in the environment, use this default:
BOARD ?= esp32-wroom-32

# This has to be the absolute path to the RIOT base directory:
RIOTBASE ?= $(CURDIR)/../..

BOARD_INSUFFICIENT_MEMORY := arduino-duemilanove arduino-leonardo \
                             arduino-mega2560 arduino-nano \
                             arduino-uno blackpill bluepill calliope-mini \
                             chronos hifive1 i-nucleo-lrwan1 mega-xplained \
                             microbit msb-430 msb-430h \
                             nucleo-f031k6 nucleo-f042k6 nucleo-f303k8 \
                             nucleo-l031k6 nucleo-f030r8 nucleo-f070rb \
                             nucleo-f072rb nucleo-f103rb nucleo-f302r8 \
                             nucleo-f334r8 nucleo-l053r8 saml10-xpro \
                             saml11-xpro spark-core stm32f0discovery \
                             stm32l0538-disco telosb \
                             waspmote-pro wsn430-v1_3b wsn430-v1_4 z1

# Add also the shell, some shell commands
USEMODULE += auto_init_gnrc_netif 
USEMODULE += gnrc_netdev_default

USEMODULE += gnrc_icmpv6_echo

USEMODULE += lwip lwip_ipv6_autoconfig lwip_sock_ip lwip_netdev
USEMODULE += lwip_udp lwip_sock_udp
USEMODULE += lwip_tcp lwip_sock_tcp
USEMODULE += shell
USEMODULE += shell_commands
USEMODULE += ps
USEMODULE += netstats_l2
USEMODULE += netstats_ipv6
USEMODULE += netstats_rpl

USEMODULE += ipv6_addr
USEMODULE += od
USEMODULE += netdev_default
USEMODULE += esp_wifi 
DEVELHELP ?= 1
# Change this to 0 show compiler invocation lines by default:
QUIET ?= 1

include $(RIOTBASE)/Makefile.include

# Set a custom channel if needed
ifneq (,$(filter cc110x,$(USEMODULE)))          # radio is cc110x sub-GHz
  DEFAULT_CHANNEL ?= 0
  CFLAGS += -DCC110X_DEFAULT_CHANNEL=$(DEFAULT_CHANNEL)
else
  ifneq (,$(filter at86rf212b,$(USEMODULE)))    # radio is IEEE 802.15.4 sub-GHz
    DEFAULT_CHANNEL ?= 5
    CFLAGS += -DIEEE802154_DEFAULT_SUBGHZ_CHANNEL=$(DEFAULT_CHANNEL)
  else                                          # radio is IEEE 802.15.4 2.4 GHz
    DEFAULT_CHANNEL ?= 26
    CFLAGS += -DIEEE802154_DEFAULT_CHANNEL=$(DEFAULT_CHANNEL)
  endif
endif

miri64 commented 5 years ago

I remember @gschorcht mentioning something about problems with ESP32 + the lwIP pkg, as the ESP SDK also uses lwIP for some internal stuff. I summon him for details ;-)

miri64 commented 5 years ago

(I took the freedom to edit OP for better readability)

gschorcht commented 5 years ago

@miri64

> I took the freedom to edit OP for better readability

Thanks, @mmaxus35 still has problems with the markup language :wink:

Yes, the current ESP8266 port has problems with the lwIP package. That's why all ESP8266 boards are blacklisted. BTW, with the complete reimplementation of the ESP8266 port in PR #11108, which is waiting for review, this will no longer be a problem.

However, there shouldn't be any problem with ESP32 and lwIP. The problem does not seem to be related to the ESP32 board. According to @mmaxus35, the board has WiFi connectivity: he can ping the board and the UDP lwIP socket example works.

@mmaxus35 Are you sure that your TCP server is running on fe80::4c2c:2f99:9ae8:73cb and is listening on port 12345? Can you see a corresponding entry when you run the command

netstat -tulpen

on this machine?

mmaxus35 commented 5 years ago

Thanks for your help.

Of course. The server at fe80::4c2c:2f99:9ae8:73cb is a Java program running on my Windows PC, and I can send packets to it with other programs such as "Packet Sender"; they are received and the messages are displayed on screen. Here is the Java code that I use:

import java.io.*;
import java.net.*;

class TCPServer
{
    String capitalizedSentence;

    public static void main(String argv[]) throws Exception
    {
        String clientSentence;
        String capitalizedSentence;
        ServerSocket welcomeSocket = new ServerSocket(12345);

        while (true)
        {
            Socket connectionSocket = welcomeSocket.accept();
            BufferedReader inFromClient =
                new BufferedReader(new InputStreamReader(connectionSocket.getInputStream()));
            DataOutputStream outToClient = new DataOutputStream(connectionSocket.getOutputStream());
            clientSentence = inFromClient.readLine();
            System.out.println("Received: " + clientSentence);
            capitalizedSentence = clientSentence.toUpperCase() + '\n';
            outToClient.writeBytes(capitalizedSentence);
        }
    }
}
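
One detail that may be worth checking with this server, independent of the connect error: BufferedReader.readLine() only returns once it has seen a line terminator or the connection is closed, while the client above sends "Hello!" plus a NUL byte (sizeof counts the terminator) and no newline, and then waits for a reply. The following is a minimal sketch in plain C of building a newline-terminated payload; build_line() is a hypothetical helper, not part of RIOT.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy `text` into `out` and append '\n' so that a
 * line-oriented peer can detect the end of the message. Returns the number
 * of payload bytes to send (without the trailing NUL), or 0 if `out` is
 * too small. */
static size_t build_line(char *out, size_t out_len, const char *text)
{
    size_t n = strlen(text);

    if (out_len < n + 2) {      /* text + '\n' + NUL */
        return 0;
    }
    memcpy(out, text, n);
    out[n] = '\n';
    out[n + 1] = '\0';
    return n + 1;
}
```

On the client side, the returned length would then be passed to sock_tcp_write() together with the buffer, instead of sizeof("Hello!").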

By the way, these are the warnings I get when I compile the program:

/data/riotbuild/RIOT-master/examples/tcp-simple-server/bin/esp32-wroom-32/lwip_api.a(tcpip.o): In function `tcpip_try_callback':
tcpip.c:(.text+0x32c): warning: undefined reference to `sys_mbox_trypost_fromisr'
/data/riotbuild/RIOT-master/examples/tcp-simple-server/bin/esp32-wroom-32/lwip_api.a(tcpip.o): In function `tcpip_send_msg_wait_sem':
tcpip.c:(.text+0x35e): warning: undefined reference to `sys_mbox_trypost_fromisr'

@gschorcht @miri64

gschorcht commented 5 years ago

BTW, I was trying to investigate these undefined symbols a year ago, but I don't remember what the problem was.

mmaxus35 commented 5 years ago

Thanks for your reply. In conclusion, I couldn't manage to run a TCP application with either lwIP or GNRC on the ESP32. Regarding the reimplementation you mentioned, will the ESP8266 be able to run a TCP application? I am trying to get the TCP application done in time for my thesis and conference paper.

gschorcht commented 5 years ago

These undefined symbols should not be a problem. sys_mbox_trypost_fromisr doesn't seem to be used; otherwise your application would crash the first time the function is called.

When I investigated this undefined-symbols problem a year ago, I already wondered why the linker complains about them even though they are not required. At that time I found that these symbols are not in the binary on other platforms. On other platforms, the function tcpip_callbackmsg_trycallback_fromisr, which calls sys_mbox_trypost_fromisr, isn't in the binary either. On ESP32, however, it is. I will try to add the -ffunction-sections compiler option so that unused functions can be optimized out.

mmaxus35 commented 5 years ago

So, what should I do now to run a TCP application on the ESP32 or ESP8266? Should I wait, or do you have any suggestions? I am sure that something is preventing the ESP32 from running TCP. If I add this compiler option, will it work?

gschorcht commented 5 years ago

@miri64 I tried @mmaxus35's TCP client with gnrc_sock_tcp according to the documentation of the TCP sock API. But there seems to be no implementation of this API for TCP and GNRC. Am I right or did I miss something?

miri64 commented 5 years ago

Yeah, gnrc_tcp is still missing an implementation for sock_tcp :(. It is on @brummer-simon's TODO list though (after GNRC's TCP API itself was cleaned up AFAIK). And yes, due to our focus on UDP-based communication in the past, I have to admit that TCP+RIOT is in a bad state in general :(. However, just using the plain gnrc_tcp API should work (but I always would recommend lwIP over GNRC for TCP traffic as it is far more mature than GNRC in that regard atm).

mmaxus35 commented 5 years ago

@miri @gschorcht, thanks for your attention and answers. I tried lwIP for TCP and shared the application code and Makefile in which I switched from GNRC to lwIP. The application compiles and flashes to the ESP32, but "Error connecting sock" occurs.

gschorcht commented 5 years ago

@miri64 Unfortunately, I'm not familiar with lwIP. In fact, I have never used it :worried:

I tried tests/lwip with module esp_wifi, but I don't see any network interface. When I looked for the reason, I saw that there are pseudomodules lwip_ethernet and lwip_sixlowpan which control the compilation in pkg/lwip/contrib/netdev.c. So I guess that lwIP uses the network devices directly in this test and GNRC is not involved. In this case we would need to extend this file so that esp_wifi can be used directly by lwIP, right?

So my question is, is there any example or test which demonstrates how to use lwIP over GNRC?

miri64 commented 5 years ago

> I tried tests/lwip with module esp_wifi, but I don't see any network interface. […] In this case we would need to extend this file so that esp_wifi can be used directly by lwIP, right?

This is correct. You need to amend lwip_bootstrap() for esp_wifi first. [edit]As I assume that esp_wifi exposes itself via netdev as a normal ethernet device I assume the rest should be already there in lwIP[/edit].

> So my question is, is there any example or test which demonstrates how to use lwIP over GNRC?

Nope. I'm not sure this even would work, as at some points we assume that there is only one network stack. Maybe somehow we can adapt lwip_netdev.c to use gnrc_netif instead, if we want to go that direction. From the GNRC side at least it would be possible.

gschorcht commented 5 years ago

> This is correct. You need to amend lwip_bootstrap() for esp_wifi first. [edit]As I assume that esp_wifi exposes itself via netdev as a normal ethernet device I assume the rest should be already there in lwIP

@miri64 Thanks. I will try it.

gschorcht commented 5 years ago

@miri64 Finally, I was able to get lwIP working with esp_wifi. However, before I open a PR, I have some questions and I would really appreciate some hints from you.

  1. Since there is only a single esp_wifi_netdev_t, I just added

    +#ifdef MODULE_ESP_WIFI
    +#define LWIP_NETIF_NUMOF        (1)
    +#endif

    and

    +
    +#elif defined(MODULE_ESP_WIFI)
    +    esp_wifi_setup(&_esp_wifi_dev);
    +    if (netif_add(&netif[0], &_esp_wifi_dev, lwip_netdev_init,
    +                  tcpip_input) == NULL) {
    +        DEBUG("Could not add esp_wifi device\n");
    +        return;
    +    }

    in lwip_bootstrap instead of having something like:

    for (unsigned i = 0; i < LWIP_NETIF_NUMOF; i++) {
        esp_wifi_setup(&esp_wifi_devs[i], &esp_wifi_params[i]);
        if (netif_add(&netif[i], &esp_wifi_devs[i], lwip_netdev_init,
                      tcpip_input) == NULL) {
            DEBUG("Could not add socket_zep device\n");
            return;
        }
    }

    Would that be OK for you?

  2. With this kind of initialization, the application can only have exactly one esp_wifi interface. Furthermore, if module esp_wifi is used, only this one esp_wifi interface can be used with lwip.

    What if the board also has an Ethernet port, like the esp32-olimex-evb board? I guess it's a general problem with the way LWIP_NETIF_NUMOF is defined in pkg/lwip/contrib/lwip.c.

    Should I define LWIP_NETIF_NUMOF for esp_wifi depending on whether it is already defined by other netdev drivers? What do you think?
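
One way the second question could be approached, purely as a sketch and not what pkg/lwip/contrib/lwip.c does: let each driver contribute its own count and sum the counts, instead of each #ifdef branch defining LWIP_NETIF_NUMOF outright. The per-driver macro names (ESP_WIFI_NETIF_NUMOF, ESP_ETH_NETIF_NUMOF), the index macros, the helper lwip_netif_count() and the module name MODULE_ESP_ETH are assumptions made for illustration.

```c
/* Normally set by the build system when esp_wifi is in USEMODULE;
 * defined here only so the sketch compiles on its own. */
#ifndef MODULE_ESP_WIFI
#define MODULE_ESP_WIFI
#endif

#ifdef MODULE_ESP_WIFI
#define ESP_WIFI_NETIF_NUMOF    (1)     /* hypothetical name */
#else
#define ESP_WIFI_NETIF_NUMOF    (0)
#endif

#ifdef MODULE_ESP_ETH                   /* assumed module name for Ethernet */
#define ESP_ETH_NETIF_NUMOF     (1)     /* hypothetical name */
#else
#define ESP_ETH_NETIF_NUMOF     (0)
#endif

/* Total number of interfaces; this would size the netif[] array. */
#define LWIP_NETIF_NUMOF    (ESP_WIFI_NETIF_NUMOF + ESP_ETH_NETIF_NUMOF)

/* Slot of each driver inside netif[]; only meaningful if its count is 1. */
#define ESP_WIFI_NETIF_IDX  (0)
#define ESP_ETH_NETIF_IDX   (ESP_WIFI_NETIF_NUMOF)

/* Number of interfaces lwip_bootstrap() would have to register. */
static unsigned lwip_netif_count(void)
{
    return LWIP_NETIF_NUMOF;
}
```

With only esp_wifi enabled this yields one interface at index 0; enabling both modules would give two slots without either driver needing to know about the other.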

gschorcht commented 5 years ago

@mmaxus35 I could get your tcp-simple-server working with the changes described above (I will provide a PR). However, you must not mix GNRC and lwIP in your application. Either you use GNRC or you use lwIP. That is, you must not use gnrc_* modules in your Makefile or gnrc_* functions in your application.

Your application works with the changes described and the following modules:

# module as used in tests/lwip 
USEMODULE += lwip lwip_ipv6_autoconfig lwip_sock_ip lwip_netdev
USEMODULE += lwip_tcp lwip_sock_tcp
USEMODULE += ipv6_addr
USEMODULE += shell
USEMODULE += shell_commands
USEMODULE += ps
USEMODULE += od
USEMODULE += netdev_default

# additional modules for the application
USEMODULE += netstats_l2
USEMODULE += netstats_ipv6
USEMODULE += netstats_rpl
USEMODULE += esp_wifi

I had netcat running as the TCP server on the remote machine, started with the command nc -6l 12345, and could receive the string Hello! from the ESP32.

mmaxus35 commented 5 years ago

@gschorcht Thank you so much. I will be back from vacation in 2 days, try it as soon as possible and then give feedback. Thank you.

When will your PR be accepted, and when can I start using the fixed version?

gschorcht commented 5 years ago

@mmaxus35 You have to wait for the PR to have access to the changes described for pkg/lwip/contrib/lwip.c.

mmaxus35 commented 5 years ago

@gschorcht Thanks. I will wait for the changes. Can it be done in a day?

gschorcht commented 5 years ago

@mmaxus35 Sure, I could, but I would like to wait for the answers to my questions from @miri64 first.

mmaxus35 commented 5 years ago

@gschorcht Thanks to you I can add these features to my thesis and conference paper. You have been a big help to me.

@miri64

gschorcht commented 5 years ago

@miri64 Any comments on my questions in https://github.com/RIOT-OS/RIOT/issues/11910#issuecomment-515695972? I would like to provide a PR.

miri64 commented 5 years ago

Mh... I was sure I already replied to that ... Sorry. Yes, this is how it should be implemented at the current state. Regarding multi-interface I don't really have a solution. It looks like lwIP is with that regard in an even more dire state than GNRC. My hope is that @jia200x's efforts to generalize the network interfaces to a stack-independent concept will get rid of this problem. If it becomes an issue before that, then we should try to find a solution. What do you think?

gschorcht commented 5 years ago

@miri64 Thanks for your answers.

> Yes, this is how it should be implemented at the current state. Regarding multi-interface I don't really have a solution.

OK, I will provide a PR that does it that way.

> It looks like lwIP is with that regard in an even more dire state than GNRC. My hope is that @jia200x's efforts to generalize the network interfaces to a stack-independent concept will get rid of this problem. If it becomes an issue before that, then we should try to find a solution. What do you think?

Especially for ESPs, the multi-interface application problem is not a real problem. It is probably completely sufficient for most ESP applications to use lwIP together with WiFi access. Therefore, in most applications no additional network interface is required. If somebody needs more than one network interface, it is still possible to use GNRC.

gschorcht commented 5 years ago

@mmaxus35 You could try PR #11946. To check it out, use

git fetch upstream pull/11946/head:pr/11946

OritGeron commented 5 years ago

please be advised that you are sending to Miri miri@gizra.com

gschorcht commented 5 years ago

> please be advised that you are sending to Miri miri@gizra.com

@mmaxus35 Please take care to use @miri64 instead of @miri; this happened in https://github.com/RIOT-OS/RIOT/issues/11910#issuecomment-515490710

mmaxus35 commented 5 years ago

@gschorcht, I will return home in 2 days; I will try the example first and then give feedback. Thanks for your help. I will be careful to use the correct @miri64 mention. [Off topic: BTW, I sent you an email.]

mmaxus35 commented 5 years ago

@gschorcht I successfully built the client code, ran the program and saw the Hello! message on the server side. Thanks. However, the following TCP server code, which I took from https://riot-os.org/api/groupnetsock__tcp.html, gives a sta_disconnect problem. I use the same Makefile and build steps as for the client code. On the server side, something might be causing the sta_disconnect event. I share the code below.

@gschorcht UPDATE: For the client code above, I put the code in a while loop and commented out the "return res" to send messages continuously. There might be another issue: even though there is no delay inside the while loop, the TCP messages are only sent periodically, about every 17 seconds.

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include "thread.h"
#include "xtimer.h"
#include "od.h"
#include "net/af.h"
#include "net/sock/tcp.h"
#include "net/ipv6.h"
#include "shell.h"
#include "net/af.h"
#include "net/sock/tcp.h"
#define SOCK_QUEUE_LEN  (1U)
static sock_tcp_t sock_queue;
uint8_t buf[128];

/* import "ifconfig" shell command, used for printing addresses */
//extern int _gnrc_netif_config(int argc, char **argv);

int main(void)
{
    sock_tcp_ep_t local = SOCK_IPV6_EP_ANY;
    sock_tcp_queue_t queue;
    local.port = 12345;
    if (sock_tcp_listen(&queue, &local, &sock_queue, 1, 0) < 0) {
        puts("Error creating listening queue");
        return 1;
    }
    puts("Listening on port 12345");
  /* 

    puts("Configured network interfaces2:");
    _gnrc_netif_config(0, NULL);    

*/
    while (1) {
//   puts("Configured network interfaces:");
//  _gnrc_netif_config(0, NULL);
        sock_tcp_t *sock;
        if (sock_tcp_accept(&queue, &sock, SOCK_NO_TIMEOUT) < 0) {
            puts("Error accepting new sock");
        }
        else {
            int read_res = 0;
            puts("Reading data");
            while (read_res >= 0) {
                read_res = sock_tcp_read(sock, &buf, sizeof(buf),
                                         SOCK_NO_TIMEOUT);
                if (read_res < 0) {
                    puts("Disconnected");
                    break;
                }
                else {
                    int write_res;
                    printf("Read: \"");
                    for (int i = 0; i < read_res; i++) {
                        printf("%c", buf[i]);
                    }
                    puts("\"");
                    if ((write_res = sock_tcp_write(sock, &buf,
                                                    read_res)) < 0) {
                        puts("Errored on write, finished server loop");
                        break;
                    }
                }
            }
            sock_tcp_disconnect(sock);
        }
    }
    sock_tcp_stop_listen(&queue);
    return 0;
}

gschorcht commented 5 years ago

@mmaxus35 Cannot reproduce your problem. Everything works fine with the TCP server:

I (635) [main_trampoline]: main(): This is RIOT! (Version: 2019.10-devel-192-gcd93c-cpu/esp32/lwip_netdev)
Listening on port 12345
I (3048) [      wifi]: n:8 0, o:1 0, ap:255 255, sta:8 0, prof:1
I (4795) [      wifi]: state: init -> auth (b0)
I (4803) [      wifi]: state: auth -> assoc (0)
I (4879) [      wifi]: state: assoc -> run (10)
I (4907) [      wifi]: connected with BSHS1, channel 8
I (4908) [      wifi]: pm start, type: 1

I (4908) [     event]: system_event_sta_connected_handle_default
Reading data
Read: "Hello
"
Disconnected

But if you see something like

I (7939) [     event]: system_event_sta_disconnected_handle_default

instead of

I (4908) [     event]: system_event_sta_connected_handle_default

there is simply a problem with your WiFi connection. Either the credentials are wrong or the WiFi signal is too weak. In any case, the application itself cannot trigger a disconnect from the WiFi AP.

mmaxus35 commented 5 years ago

@gschorcht According to your results, I am sure that the problem is on my side. I will give feedback about this. BTW, what about the long TCP sending time? It takes about 17 seconds to establish the connection and send a message, even though there is no delay in the system. Another question: is there any way to print the IP address of the ESP32 in lwIP?

UPDATE: I successfully ran the server and found a way to print the IP address. But the long message sending time might still be a problem.
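
To see where the roughly 17 seconds go, one option is to time each phase of the client loop (connect, write, read) separately. The following is a minimal sketch under stated assumptions: the timestamps are 32-bit microsecond values such as those returned by xtimer_now_usec(), and the names elapsed_us(), tcp_phase_times_t and slowest_phase() are hypothetical helpers, not part of RIOT or lwIP. Note that the client reads with SOCK_NO_TIMEOUT, so any time spent waiting for the server's reply would show up in the read phase.

```c
#include <stdint.h>

/* Elapsed microseconds between two 32-bit timestamps. Unsigned
 * subtraction stays correct across one wrap-around of the counter. */
static uint32_t elapsed_us(uint32_t start, uint32_t end)
{
    return end - start;
}

/* Hypothetical record of how long each phase of one loop iteration took. */
typedef struct {
    uint32_t connect_us;
    uint32_t write_us;
    uint32_t read_us;
} tcp_phase_times_t;

/* Name of the phase that took longest in this iteration. */
static const char *slowest_phase(const tcp_phase_times_t *t)
{
    const char *name = "connect";
    uint32_t worst = t->connect_us;

    if (t->write_us > worst) {
        name = "write";
        worst = t->write_us;
    }
    if (t->read_us > worst) {
        name = "read";
    }
    return name;
}
```

In the client, a timestamp would be taken before and after each of sock_tcp_connect(), sock_tcp_write() and sock_tcp_read(), and the three differences printed once per iteration.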