obgm / libcoap

A CoAP (RFC 7252) implementation in C

libcoap 4.3.0 build with libcoap #1180

Closed fun-works closed 3 months ago

fun-works commented 1 year ago

libcoap version 4.3.0

I am trying to build libcoap with LwIP + FreeRTOS for an ST device. When I configure and build with USE_CUSTOM_POOL in LwIP, I get the following error: include/lwip/stats.h:271:26: error: 'MEMP_MAX' undeclared here (not in a function)

Following is the include stack:

In file included from /home/projects/playground1/Component/lwip/src/lwip/src/include/lwip/netif.h:50,
                 from /home/projects/playground1/Component/lwip/src/lwip/src/include/lwip/sockets.h:47,
                 from /home/projects/playground1/build/debug/_deps/libcoap-src/include/coap3/net.h:24,
                 from /home/projects/playground1/build/debug/_deps/libcoap-src/include/coap3/async.h:20,
                 from /home/projects/playground1/build/debug/Component/stack/src/stack/_libcoap_EP-prefix/src/_libcoap_EP-build/include/coap3/coap.h:42,
                 from /home/projects/playground1/build/debug/_deps/libcoap-src/include/coap3/coap_internal.h:38,
                 from /home/projects/playground1/Component/lwip/src/porting/user/lwippools.h:14,
                 from /home/projects/playground1/Component/lwip/src/lwip/src/include/lwip/priv/memp_std.h:142,
                 from /home/projects/playground1/Component/lwip/src/lwip/src/include/lwip/memp.h:49,
                 from /home/projects/playground1/Component/lwip/src/lwip/src/api/api_lib.c:63:
/home/projects/playground1/Component/lwip/src/lwip/src/include/lwip/stats.h:271:26: error: 'MEMP_MAX' undeclared here (not in a function)
  271 |   struct stats_mem *memp[MEMP_MAX];

Can you please help me understand how to configure the pool and its options so that it builds and runs successfully?

Do I need a specific commit of LwIP, or am I missing some options?

mrdeep1 commented 1 year ago

As mentioned previously, 4.3.0 is somewhat old and things have moved on since then, including updates to LwIP. I would recommend you try the latest libcoap develop branch to see if there are any issues there.

MEMP_MAX is defined in the LwIP source in src/include/lwip/memp.h.

For 4.3.0, the supported lwip branch is STABLE-2_0_3_RELEASE and the supported lwip-contrib branch is STABLE-2_0_1_RELEASE.

For the latest libcoap code, it is (lwip) STABLE-2_1_3_RELEASE and (lwip-contrib) STABLE-2_1_0_RELEASE.

With the correct lwip branches, the 4.3.0 and develop code build with no errors, but they use their own variant of lwipopts.h and lwippools.h.
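The MEMP_MAX error is easier to follow with a picture of how the pool list is built. The sketch below is a simplified, self-contained model of the X-macro pattern that memp.h applies to memp_std.h and lwippools.h. Every DEMO_ and demo_ name is hypothetical and the real headers differ in detail. It shows that the closing MAX enumerator only exists once the whole list has been expanded, which appears to be why the include stack above fails: lwippools.h pulls in coap_internal.h, and through it lwip/stats.h, while memp.h is still part-way through being processed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical pool list; in lwIP this role is played by memp_std.h,
 * which also pulls in the user's lwippools.h. */
#define DEMO_POOL_LIST(X)  \
  X(PBUF,         16,  64) \
  X(COAP_CONTEXT,  1, 128) \
  X(COAP_SESSION,  2,  96)

/* First expansion: one enumerator per pool, closed by DEMO_MEMP_MAX. */
#define DEMO_AS_ENUM(name, num, size) DEMO_MEMP_##name,
typedef enum {
  DEMO_POOL_LIST(DEMO_AS_ENUM)
  DEMO_MEMP_MAX /* only declared once the whole list has been expanded */
} demo_memp_t;

/* Second expansion: one descriptor per pool, indexed by the enum. */
typedef struct {
  const char *desc;
  size_t num;
  size_t size;
} demo_pool_desc_t;

#define DEMO_AS_DESC(name, num, size) { #name, (num), (size) },
static const demo_pool_desc_t demo_pools[DEMO_MEMP_MAX] = {
  DEMO_POOL_LIST(DEMO_AS_DESC)
};

/* Return the descriptor for a pool, or NULL for an out-of-range index. */
static const demo_pool_desc_t *demo_pool_lookup(int type) {
  if (type < 0 || type >= DEMO_MEMP_MAX) {
    return NULL;
  }
  return &demo_pools[type];
}

/* Element size of a pool; the caller must pass a valid pool index. */
static size_t demo_pool_size(demo_memp_t type) {
  assert(type < DEMO_MEMP_MAX);
  return demo_pools[type].size;
}
```

Any code that sizes an array with the MAX enumerator, as stats.h does with memp[MEMP_MAX], has to be compiled after the enum is complete, so a pools header is safest when it includes only what it needs for sizeof().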

fun-works commented 1 year ago

OK, I have now updated to the latest libcoap, and I am trying to build with FreeRTOS using:

#define NO_SYS                     0
#define SYS_FREERTOS                       1

And I am getting the following error:

/home/projects/playground1/build/debug/_deps/libcoap-src/src/coap_subscribe.c:60:3: warning: implicit declaration of function 'COAP_MUTEX_DEFINE' [-Wimplicit-function-declaration]
   60 |   COAP_MUTEX_DEFINE(e_static_mutex);
      |   ^~~~~~~~~~~~~~~~~
/home/projects/playground1/build/debug/_deps/libcoap-src/src/coap_subscribe.c:60:21: error: 'e_static_mutex' undeclared (first use in this function)
   60 |   COAP_MUTEX_DEFINE(e_static_mutex);

It seems I need to define COAP_MUTEX_DEFINE, but I am not sure where.

fun-works commented 1 year ago

Well, I added the lines:

#define COAP_MUTEX_DEFINE(_name)                        \
  static sys_mutex_t _name

at coap_mutex_internal.h:63 and that works. But I have some other errors now.

But I think mutex is missing on libcoap for now and you need to fix that.

mrdeep1 commented 1 year ago

Thanks for your help troubleshooting here.

> But I think mutex is missing on libcoap for now and you need to fix that.

#1181 raised for this.

> error: 'coap_layer_func_t' has no member named 'lwip_write'

It looks like there is a #define write lwip_write somewhere. I will go through and create a PR so that we do not get name clashes when using things like write.

The conversion warnings still need to be checked, but are unlikely to be causing any issues.
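The name clash can be reproduced in isolation. The sketch below is a minimal illustration, not libcoap's actual code: the struct, the member name l_write, and all demo_ functions are hypothetical. It assumes a compatibility macro that maps write to another function, as described above, and shows that a prefixed member name is left alone by the preprocessor.

```c
#include <assert.h>

/* Hypothetical layer table. If the member were called plain `write`, any
 * later use of `.write` would be rewritten by the macro below and fail
 * with "has no member named 'demo_lwip_write'". */
typedef struct {
  int (*l_write)(const char *data, int len); /* prefixed member name */
} demo_layer_func_t;

/* Stand-in for a transport-specific write routine. */
static int demo_udp_write(const char *data, int len) {
  (void)data;
  return len; /* pretend every byte was sent */
}

/* Hypothetical POSIX-compatibility target. */
static int demo_lwip_write(int fd, const char *data, int len) {
  (void)fd;
  (void)data;
  return len;
}

/* Defined after the struct, as happens when another header is included
 * later in the translation unit. */
#define write demo_lwip_write

/* The prefixed member is untouched by the macro, so this still compiles. */
static int demo_send(const demo_layer_func_t *layer, const char *data, int len) {
  assert(layer != 0 && layer->l_write != 0);
  return layer->l_write(data, len);
}

/* Plain calls are still redirected by the macro as intended. */
static int demo_posix_style_send(const char *data, int len) {
  return write(1, data, len); /* expands to demo_lwip_write(1, data, len) */
}

#undef write /* keep the macro local to this sketch */
```

Renaming the member is one way to avoid the clash; disabling the compatibility macro in the project configuration is another.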

mrdeep1 commented 1 year ago

> I will go through and create a PR so that we do not get name clashes when using things like write.

This should now be fixed in the latest version of the develop branch.

obgm commented 1 year ago

> #1181 raised for this.

Is there a specific reason to use a variable name that starts with an underscore (_)? This is considered bad practice because this naming pattern is reserved for system-internal use.

mrdeep1 commented 1 year ago

Good question. This was following how all of the other COAP_MUTEX_DEFINE() variants have been defined. The actual variable does not end up with a leading _, but all the usage of _name can be changed.
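To illustrate the point about the parameter name: a macro parameter is replaced by its argument during preprocessing, so it never becomes an identifier in the compiled code. The sketch below uses a hypothetical stand-in type, demo_mutex_t, instead of LwIP's sys_mutex_t, and a DEMO_ prefixed macro rather than libcoap's own definition.

```c
#include <assert.h>

/* Hypothetical stand-in for LwIP's sys_mutex_t so the sketch builds alone. */
typedef struct {
  int locked;
} demo_mutex_t;

/* Same shape as the workaround above, with a parameter name that avoids
 * the leading underscore. */
#define DEMO_MUTEX_DEFINE(name) static demo_mutex_t name

/* Without a definition of the macro, the compiler would parse
 * DEMO_MUTEX_DEFINE(x); as a call to an undeclared function with an
 * undeclared variable as argument, which matches the two diagnostics
 * reported earlier. */
static int demo_lock_once(void) {
  /* Expands to: static demo_mutex_t e_static_mutex */
  DEMO_MUTEX_DEFINE(e_static_mutex);

  assert(e_static_mutex.locked == 0 || e_static_mutex.locked == 1);
  if (e_static_mutex.locked) {
    return 0; /* already held */
  }
  e_static_mutex.locked = 1;
  return 1;
}
```

The variable that ends up in the object file is e_static_mutex; the parameter name only matters for readability and for staying clear of reserved identifiers.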

fun-works commented 1 year ago

> Looks like there is a #define write lwip_write somewhere. I will go through and create a PR so that we do not get name clashes when using things like write.

Yes, that define was elsewhere in my project, to provide POSIX-compatible calls. I fixed it by disabling that option, but it can also be avoided in the way you mentioned. Thanks.

fun-works commented 1 year ago

By the way, is there any plan to have a release with all these LwIP changes? And is it possible to provide CMake support for LwIP as well?

mrdeep1 commented 1 year ago

> Btw, is there any plan to have a release with all these lwip changes ?

We are shortly going to be releasing 4.3.2 release candidate 1 (4.3.2rc1), which includes the LwIP changes.

> And is it possible to provide cmake support as well for lwip ?

It just needs someone to create the appropriate CMake files for building LwIP with libcoap and submit a PR. I see that LwIP STABLE-2_1_3_RELEASE contains a CMakeLists.txt.

fun-works commented 1 year ago

I updated to the latest libcoap and to LwIP 2.1.3. It still crashes at coap_net.c:473: c = coap_malloc_type(COAP_CONTEXT, sizeof(coap_context_t));

I am trying to run it using FreeRTOS on an embedded device.

I have already set MEMP_USE_CUSTOM_POOL to 1. Do I need to implement anything for the memory pool? Any hints, please?

mrdeep1 commented 1 year ago

Is c being returned as NULL, or is it crashing somewhere in coap_malloc_type() / memp_malloc()?

In examples/lwip/config/lwipopts.h, the libcoap example for LwIP has #define MEMP_USE_CUSTOM_POOLS 1, and the (libcoap) memory pools are defined in examples/lwip/config/lwippools.h. I guess you need to check which lwippools.h is getting included.

fun-works commented 1 year ago

My memp_pools[] looks like this: [image]

Index 19 is supposed to be for COAP_CONTEXT.

mrdeep1 commented 1 year ago

Interesting - I would be expecting the addresses to be sequentially incrementing, so it looks like it is not picking up LWIP_MEMPOOL(COAP_CONTEXT, MEMP_NUM_COAPCONTEXT, sizeof(coap_context_t), "COAP_CONTEXT") from your final lwippools.h.

Under Linux, I get the following from the client executable built by libcoap:

(gdb) p memp_pools
$1 = {0x4412e0, 0x441320, 0x441360, 0x4413a0, 0x4413e0, 0x441420, 0x441460, 0x4414a0, 0x4414e0, 0x441520, 0x441560, 0x4415a0, 
  0x4415e0, 0x441620, 0x441660, 0x4416a0, 0x4416e0, 0x441720, 0x441760, 0x4417a0}
(gdb) p memp_COAP_CONTEXT
$5 = {desc = 0x441548 "COAP_CONTEXT", stats = 0x64e930, size = 1056, num = 1, base = 0x658e40 "", tab = 0x64e948}
(gdb) p/x &memp_PBUF
$1 = 0x4414e0
(gdb) p/x &memp_PBUF_POOL
$2 = 0x441520
(gdb) p/x &memp_COAP_CONTEXT
$7 = 0x441560

fun-works commented 1 year ago

Yes, maybe I have some pool configuration issues. I will check this on my side. Thanks.

fun-works commented 1 year ago

OK, I checked my code and found that I had an empty lwippools.h. I corrected it to use the one from libcoap. Now it looks like this: [image]

However, as you can see, *tab is 0, which causes the allocation to fail. Am I missing anything further? Could you please help here?

fun-works commented 1 year ago

I think I am trying to initialize libcoap before lwip. Let me fix this first.

mrdeep1 commented 1 year ago

Yes, lwip_init() should be called before coap_startup(), which in turn should be called before coap_new_context().
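The *tab value of 0 seen earlier is consistent with the pool not having been set up yet. The sketch below is a simplified model of a fixed-size pool, not LwIP's implementation, and all demo_ names are hypothetical. It assumes that tab points at the head of a free list which is only built during initialisation, so an allocation attempted before that finds nothing to hand out.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_NUM_ELEMS 2
#define DEMO_ELEM_SIZE 64

/* While an element is free, its storage holds the link to the next one. */
typedef union demo_elem {
  union demo_elem *next;
  unsigned char payload[DEMO_ELEM_SIZE];
} demo_elem_t;

typedef struct {
  demo_elem_t storage[DEMO_NUM_ELEMS]; /* static backing memory */
  demo_elem_t *tab;                    /* free-list head; NULL until init */
} demo_pool_t;

/* Plays the role of the pool setup done during stack initialisation:
 * thread every element onto the free list. */
static void demo_pool_init(demo_pool_t *pool) {
  size_t i;

  assert(pool != NULL);
  pool->tab = NULL;
  for (i = 0; i < DEMO_NUM_ELEMS; i++) {
    pool->storage[i].next = pool->tab;
    pool->tab = &pool->storage[i];
  }
}

/* Pop one element, or return NULL when the free list is empty. */
static void *demo_pool_malloc(demo_pool_t *pool) {
  demo_elem_t *elem = pool->tab;

  if (elem == NULL) {
    return NULL;
  }
  pool->tab = elem->next;
  return elem;
}

/* Push an element back onto the free list. */
static void demo_pool_free(demo_pool_t *pool, void *mem) {
  demo_elem_t *elem = mem;

  elem->next = pool->tab;
  pool->tab = elem;
}
```

In this model, calling demo_pool_malloc() on a zero-initialised pool returns NULL, and the same call succeeds once demo_pool_init() has run, which mirrors the ordering requirement stated above.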

fun-works commented 1 year ago

OK, I have fixed that now. However, the device does not respond: it receives a GET request but sends nothing back. My implementation, written against libcoap 4.3.0, has not changed, and it works on Linux. Also, I can only receive two requests; no request is received after that. I can still ping the device, so it has not crashed. Any hints on this?

The response PDU I build looks like this: [image]

fun-works commented 1 year ago

Update: if I comment out the lock and unlock calls in the UDP send function as below, I can send out the responses: [image]

I think this is a nested lock: the send path tries to take a lock that is already held. I am analysing this.

But I can still only receive two messages.

mrdeep1 commented 1 year ago

It looks like LOCK_TCPIP_CORE() was invoked in coap_io_process(), which then timed out and called coap_io_process_timeout(), which then called coap_io_prepare_io(), which then tried to send out an unsolicited observe response, or async delayed response which called coap_socket_send().

This could have happened after 2 responses. The locking of TCPIP_CORE as a whole needs to be reviewed.
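The call chain described above amounts to taking a non-recursive lock twice on the same path. The sketch below models that with a hypothetical flag-based lock. It is not LwIP's LOCK_TCPIP_CORE() and not the fix that libcoap adopted; all demo_ names are assumptions made for illustration. A try-lock is used so that the nested attempt shows up as a failure code rather than a hang.

```c
#include <assert.h>

/* Hypothetical non-recursive lock: a second acquire by the holder fails. */
typedef struct {
  int held;
} demo_core_lock_t;

/* Returns 1 if the lock was taken, 0 if it was already held. A blocking
 * lock would wait forever in the 0 case when the holder is the caller. */
static int demo_try_lock(demo_core_lock_t *lock) {
  if (lock->held) {
    return 0;
  }
  lock->held = 1;
  return 1;
}

static void demo_unlock(demo_core_lock_t *lock) {
  assert(lock->held);
  lock->held = 0;
}

/* Inner send path that takes the lock itself. */
static int demo_send_locked(demo_core_lock_t *lock) {
  if (!demo_try_lock(lock)) {
    return -1; /* would deadlock with a blocking lock */
  }
  /* ... hand the packet to the stack ... */
  demo_unlock(lock);
  return 0;
}

/* Outer processing path that holds the lock while calling the send path. */
static int demo_process_holding_lock(demo_core_lock_t *lock) {
  int rc;

  if (!demo_try_lock(lock)) {
    return -1;
  }
  rc = demo_send_locked(lock); /* nested acquire fails */
  demo_unlock(lock);
  return rc;
}

/* One possible restructuring: release before calling into the send path. */
static int demo_process_release_first(demo_core_lock_t *lock) {
  if (!demo_try_lock(lock)) {
    return -1;
  }
  /* ... work that needs the lock ... */
  demo_unlock(lock);
  return demo_send_locked(lock);
}
```

Removing the lock from the send path, as in the local patch mentioned earlier, hides the nested acquire but leaves direct callers of the send path unprotected, which is why the locking needs a review rather than a one-line change.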

fun-works commented 1 year ago

Interesting, because I send the responses piggybacked, so there should not be a delay in responding.

I tried to debug the issue of only receiving two requests. I found that the call at coap_io_lwip.c:245, session = coap_endpoint_get_session(ep, packet, now);, returns NULL, and the CoAP session count is 2. Am I supposed to free the CoAP session somewhere, or is that missing in libcoap?

mrdeep1 commented 1 year ago

The example lwip-server code has coap_context_set_max_idle_sessions(main_coap_context, MEMP_NUM_COAPSESSION -1); included to force idle sessions to be cleaned up in coap_endpoint_get_session(), which leaves space for one new incoming session.

An idle session is defined as the reference count == 0 and there is nothing to be sent in the delay queue.
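The effect of the idle-session limit can be seen in a small model. The sketch below is not libcoap's code: the structures, DEMO_NUM_SESSIONS (standing in for MEMP_NUM_COAPSESSION), and demo_get_session() are hypothetical, and it assumes for simplicity that every request comes from a new peer and so needs a new session. It uses the definition of idle given above and evicts the oldest idle session once the configured limit is reached.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_NUM_SESSIONS 2 /* stands in for MEMP_NUM_COAPSESSION */

typedef struct {
  int in_use;
  unsigned ref;            /* reference count */
  unsigned queued;         /* entries waiting in the delay queue */
  unsigned long last_used; /* tick of the last activity */
} demo_session_t;

typedef struct {
  demo_session_t sessions[DEMO_NUM_SESSIONS];
  unsigned max_idle; /* 0 means no limit */
  unsigned long tick;
} demo_endpoint_t;

/* Idle: allocated, no references, nothing pending in the delay queue. */
static int demo_is_idle(const demo_session_t *s) {
  return s->in_use && s->ref == 0 && s->queued == 0;
}

/* Find room for a new session, evicting the oldest idle one if the idle
 * limit has been reached. Returns NULL when the pool is exhausted. */
static demo_session_t *demo_get_session(demo_endpoint_t *ep) {
  demo_session_t *oldest = NULL;
  unsigned num_idle = 0;
  size_t i;

  assert(ep != NULL);
  for (i = 0; i < DEMO_NUM_SESSIONS; i++) {
    demo_session_t *s = &ep->sessions[i];
    if (demo_is_idle(s)) {
      num_idle++;
      if (oldest == NULL || s->last_used < oldest->last_used) {
        oldest = s;
      }
    }
  }
  if (ep->max_idle > 0 && num_idle >= ep->max_idle && oldest != NULL) {
    oldest->in_use = 0; /* give the slot back to the pool */
  }
  for (i = 0; i < DEMO_NUM_SESSIONS; i++) {
    demo_session_t *s = &ep->sessions[i];
    if (!s->in_use) {
      s->in_use = 1;
      s->ref = 0;
      s->queued = 0;
      s->last_used = ++ep->tick;
      return s;
    }
  }
  return NULL;
}
```

With no limit set, the third request in this model finds both slots occupied by idle sessions and gets NULL, matching the symptom of receiving only two requests. With the limit at DEMO_NUM_SESSIONS - 1, one slot is always freed in time.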

mrdeep1 commented 1 year ago

Are you able to get this to work now?

rpati12 commented 12 months ago

Yes, except for the following two things:

  1. I still have the local patch for LOCK_TCPIP_CORE().
  2. The device is not able to send a multicast response. From debugging I can see that libcoap delays it, but I do not know what happens to it after that. I need to debug further to get the details.

mrdeep1 commented 12 months ago

The multicast response is deliberately delayed as per RFC7252 Section 8.2 Request/Response Layer, which uses the async logic - hence the LOCK_TCPIP_CORE() fix you are doing in coap_socket_send().

I will have a look at this.
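For reference, the delay comes from the Leisure rule in RFC 7252 Section 8.2: a server picks a random point within a leisure period, whose lower bound is lb_Leisure = S * G / R. The sketch below only illustrates that arithmetic. The function names are hypothetical, rand() is a placeholder random source, and it does not reflect how libcoap schedules the response internally.

```c
#include <assert.h>
#include <stdlib.h>

/* DEFAULT_LEISURE in RFC 7252 is 5 seconds. */
#define DEMO_DEFAULT_LEISURE_MS 5000ul

/* Lower bound from RFC 7252 Section 8.2: lb_Leisure = S * G / R, with S
 * the estimated response size in bytes, G the estimated group size and R
 * the target data rate in bytes per second. Result in milliseconds. */
static unsigned long demo_lb_leisure_ms(unsigned long s_bytes,
                                        unsigned long group_size,
                                        unsigned long rate_bytes_per_s) {
  assert(rate_bytes_per_s > 0);
  return (s_bytes * group_size * 1000ul) / rate_bytes_per_s;
}

/* Pick a random delay in [0, leisure_ms). */
static unsigned long demo_pick_delay_ms(unsigned long leisure_ms) {
  assert(leisure_ms > 0);
  return (unsigned long)rand() % leisure_ms;
}
```

With the RFC's own example figures (S = 100 bytes, G = 100, R = 1000 bytes per second), demo_lb_leisure_ms(100, 100, 1000) evaluates to 10000 ms.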

rpati12 commented 12 months ago

After being delayed, this response never goes out. I will debug this further.

As for the lock: yes, I have currently commented out the lock/unlock in the socket_send() call.

But we can close this issue for now. If I find any issue with multicast send, I will create another issue. Does that make sense?

mrdeep1 commented 12 months ago

I'm not sure how you are setting up multicast support in the server, as the standard code does not have support for this for LwIP. It would be good to understand what changes you are making here.

mrdeep1 commented 12 months ago

But the multicast sorting out should be on a separate Issue.

rpati12 commented 12 months ago

yes

fun-works commented 3 months ago

For now we can close this issue as we are able to build and do basic communication. Thanks for your support.