espressif / esp-thread-br

Espressif Thread Border Router SDK
Apache License 2.0

Setting default netif to Openthread breaks libraries (TZ-914) #71

Open motters opened 1 month ago

motters commented 1 month ago

Hi,

The following line in the example border router breaks libraries that require network access:

ESP_ERROR_CHECK(esp_netif_set_default_netif(openthread_netif));

https://github.com/espressif/esp-thread-br/blob/48b2e47db318a8e7dfdb0d7afaa2b295139814f5/examples/common/thread_border_router/src/border_router_launch.c#L116

For example, when trying to implement LwM2M using Anjay, the stack can no longer reach the LwM2M server. I'm assuming that's because it's using the OpenThread interface and not WiFi / Ethernet.

Commenting out this line fixes things and also doesn't seem to break the OpenThread BR. I'm also not sure why we'd want OpenThread to be the default netif.

Hence my question: is setting OpenThread as the default netif required for any features?

Cheers!

zwx1995esp commented 1 month ago

Hi @motters. We set the default netif to OpenThread simply to tell lwIP that, if it cannot determine which interface a packet should be forwarded on, it should forward it to the OpenThread interface. So the question may be why lwIP cannot forward the packets for the LwM2M server over the WiFi/Ethernet interface. Could you please share more details on this issue?

motters commented 1 month ago

Hi @zwx1995esp,

Thanks for the help.

Sure no problems, but I'm not sure what else to share.

We've effectively merged Anjay ESP32 LwM2M + ESP-Thread-BR + some custom unrelated firmware.

When ESP_ERROR_CHECK(esp_netif_set_default_netif(openthread_netif)); is included we get the following errors from Anjay:

E (43956) lwm2m: ERROR [avs_net] [./components/anjay-esp-idf/deps/anjay/deps/avs_commons/src/net/compat/posix/avs_net_impl.c:1268]: cannot establish connection to [LWM2M-URL.COM]:5684
E (43970) lwm2m: ERROR [avs_net] [./components/anjay-esp-idf/deps/anjay/deps/avs_commons/src/net/mbedtls/../avs_ssl_common.h:155]: avs_net_socket_connect() on backend socket failed
E (43983) lwm2m: ERROR [anjay] [./components/anjay-esp-idf/deps/anjay/src/core/servers/anjay_connection_ip.c:130]: could not connect to LWM2M-URL.COM:5684 37766.4

When it is removed, we get no errors from Anjay or the OT BR. However, we obviously haven't tested all Thread BR features.

Regarding the errors:

avs_net_impl.c

https://github.com/AVSystem/avs_commons/blob/2885ea16a2662c3b4ea2c245f3eac408c02e9d73/src/net/compat/posix/avs_net_impl.c#L1268

[image]

This seems to eventually call int error = getaddrinfo(host, NULL, &hint, &ctx->results); in a sub-function: https://github.com/AVSystem/avs_commons/blob/2885ea16a2662c3b4ea2c245f3eac408c02e9d73/src/net/compat/posix/avs_compat_addrinfo.c#L239

avs_ssl_common.h

[image]

wnylei commented 4 weeks ago

I have run into the same problem; see #74.
After I comment out the line ESP_ERROR_CHECK(esp_netif_set_default_netif(openthread_netif)); the MQTT connection works normally again.
I suspect it may be a bug?

motters commented 4 weeks ago

> I also encountered the same problem; see #74. When I comment out the line ESP_ERROR_CHECK(esp_netif_set_default_netif(openthread_netif)); the MQTT connection returns to normal. I feel it may be a bug?

I would agree.

However, these libraries really should provide a function that lets you bind to the correct interface, instead of only allowing the default interface to be used.

For example Openthread allows you to do this:

[image]

That said, changing every library isn't possible, so a workaround is needed here.
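To illustrate what binding to an interface could look like for a plain socket, here is a minimal POSIX-style sketch in C. It pins a TCP socket's local (source) address with bind() before any connect() call. The helper name open_tcp_socket_bound_to is hypothetical, and whether the source address actually steers interface selection depends on the IP stack and its configuration, so treat this as an assumption to verify on ESP-IDF/lwIP rather than a confirmed fix.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper (not part of any library discussed in this thread):
 * create a TCP socket whose local address is pinned to `local_ip`, e.g. the
 * IPv4 address currently assigned to the WiFi interface.
 * Returns the socket fd, or -1 on failure. */
static int open_tcp_socket_bound_to(const char *local_ip)
{
    struct sockaddr_in local;
    memset(&local, 0, sizeof(local));
    local.sin_family = AF_INET;
    local.sin_port = htons(0); /* let the stack pick an ephemeral port */

    /* Reject anything that is not a dotted-quad IPv4 address. */
    if (inet_pton(AF_INET, local_ip, &local.sin_addr) != 1) {
        return -1;
    }

    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) {
        return -1;
    }

    /* Pin the source address before connect() is called. */
    if (bind(fd, (struct sockaddr *)&local, sizeof(local)) != 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

A library would call connect() on the returned fd as usual. This only helps if the library exposes a hook for creating or configuring its socket, which is exactly the limitation raised above.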

zwx1995esp commented 4 weeks ago

Hi @motters and @wnylei. Thanks for raising and discussing this issue. Having looked at both of your issues, can I summarize them as follows: when the BR runs both WiFi and Thread and esp_netif_set_default_netif is called, the BR device cannot access the cloud address (perhaps because getaddrinfo cannot return the address from the DNS server)?

So, @motters, am I right? And @wnylei, is your issue the same as this one?

wnylei commented 4 weeks ago

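> Hi @motters and @wnylei. Thanks for raising and discussing this issue. Having looked at both of your issues, can I summarize them as follows: when the BR runs both WiFi and Thread and esp_netif_set_default_netif is called, the BR device cannot access the cloud address (perhaps because getaddrinfo cannot return the address from the DNS server)?
>
> So, am I right? And is your issue the same as this one?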

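I think DNS is working correctly. After I enabled debug logging, the logs show that the host resolves to the correct IP. Some of the relevant logs are below: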

D (44273) esp-tls: [sock=57] Resolved IPv4 address: 82.157.23.88
D (44283) esp-tls: [sock=57] Connecting to server. HOST: gateway-test.afnsmart.com, Port: 9883
E (44283) esp-tls: [sock=57] connect() error: Host is unreachable
E (44293) esp-tls: Failed to open new connection
E (44303) transport_base: Failed to open a new connection
E (44313) mqtt_client: Error transport connect
D (44313) event: running post MQTT_EVENTS:0 with handler 0x400f0ac8 and context 0x3ffe714c on loop 0x3ffe7090
0x400f0ac8: u_mqtt_event_handler at /Users/wanghm/Code/nl/esp/thread-esp-project/projects/normal_br/components/u_common_mqtt/src/u_common_mqtt.c:42
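The log above shows name resolution succeeding and connect() failing afterwards. As a rough way to tell these two stages apart in a test program, a sketch along the following lines could be used. The names probe_tcp and connect_stage_t are hypothetical, and the code uses only standard POSIX socket calls; it has not been run on the border router itself.

```c
#include <netdb.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical result type: which stage of opening a TCP connection failed. */
typedef enum {
    STAGE_OK,
    STAGE_RESOLVE_FAILED,
    STAGE_CONNECT_FAILED
} connect_stage_t;

/* Resolve `host`/`port` over IPv4, then try each returned address in turn. */
static connect_stage_t probe_tcp(const char *host, const char *port)
{
    struct addrinfo hints;
    struct addrinfo *res = NULL;

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_INET;
    hints.ai_socktype = SOCK_STREAM;

    /* Stage 1: name (and service) resolution. */
    if (getaddrinfo(host, port, &hints, &res) != 0 || res == NULL) {
        return STAGE_RESOLVE_FAILED;
    }

    /* Stage 2: routing and connection establishment. */
    connect_stage_t stage = STAGE_CONNECT_FAILED;
    for (struct addrinfo *ai = res; ai != NULL; ai = ai->ai_next) {
        int fd = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
        if (fd < 0) {
            continue;
        }
        int rc = connect(fd, ai->ai_addr, ai->ai_addrlen);
        close(fd);
        if (rc == 0) {
            stage = STAGE_OK;
            break;
        }
    }

    freeaddrinfo(res);
    return stage;
}
```

With the behaviour in the log, a call such as probe_tcp("gateway-test.afnsmart.com", "9883") would be expected to report the connect stage rather than the resolve stage, which points at routing rather than DNS.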

wnylei commented 4 weeks ago

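> > I also encountered the same problem; see #74. When I comment out the line ESP_ERROR_CHECK(esp_netif_set_default_netif(openthread_netif)); the MQTT connection returns to normal. I feel it may be a bug?
>
> I would agree.
>
> However, these libraries really should provide a function that lets you bind to the correct interface, instead of only allowing the default interface to be used.
>
> For example, OpenThread allows you to do this: [image]
>
> That said, changing every library isn't possible, so a workaround is needed here.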

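Yes. For embedded devices, specifying a network interface for every application-layer protocol is neither very realistic nor necessary. I think the choice of interface should be made automatically by the lower network layers according to some policy. For now, though, what I most want to know is the purpose of adding the line esp_netif_set_default_netif(openthread_netif). After commenting it out, simple testing showed that both the Thread network and the WiFi network work well. Of course, I have not tested all features yet.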

zwx1995esp commented 4 weeks ago

Hi @motters, syncing a conclusion from #74. There is a preliminary hypothesis for this issue: the lwIP layer does not know which netif to use to establish the TCP session to the cloud's IPv4 address. If esp_netif_set_default_netif is not called, the default netif is WiFi, and the TCP session to this address is established over WiFi. After adding the esp_netif_set_default_netif call, however, lwIP may try to establish the session on the Thread side and fail. Let me reproduce it and debug it later.
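The hypothesis above can be illustrated with a small, self-contained model of IPv4 route selection. This is a simplified sketch, not lwIP code: the type model_netif_t and the function model_route_ip4 are hypothetical, and the WiFi address 192.168.1.50/24 is an assumed example value. It assumes that a destination outside every interface's subnet falls back to the default netif, and that a default netif without an IPv4 address yields no route.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Build an IPv4 address in host byte order from four octets. */
#define IP4(a, b, c, d) \
    (((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) | ((uint32_t)(c) << 8) | (uint32_t)(d))

/* Hypothetical, simplified stand-in for a network interface. */
typedef struct {
    const char *name;
    uint32_t ip4;     /* 0 means "no IPv4 address assigned" */
    uint32_t netmask;
    bool up;
} model_netif_t;

/* Pick an outgoing interface for `dest`:
 * 1. prefer an interface whose subnet contains the destination;
 * 2. otherwise fall back to the default interface `def`;
 * 3. return NULL (no route) if the default is missing, down, or has no IPv4. */
static const model_netif_t *model_route_ip4(const model_netif_t *ifs, size_t n,
                                            const model_netif_t *def, uint32_t dest)
{
    for (size_t i = 0; i < n; i++) {
        const model_netif_t *nif = &ifs[i];
        if (nif->up && nif->ip4 != 0 &&
            (dest & nif->netmask) == (nif->ip4 & nif->netmask)) {
            return nif;
        }
    }
    if (def == NULL || !def->up || def->ip4 == 0) {
        return NULL;
    }
    return def;
}
```

In this model, with a WiFi interface at the assumed address and an OpenThread interface that has no IPv4 address, the cloud address 82.157.23.88 from the log routes via WiFi when WiFi is the default and has no route when OpenThread is the default. That is consistent with the "Host is unreachable" error reported earlier, though the real cause still needs to be confirmed by debugging.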