obgm / libcoap

A CoAP (RFC 7252) implementation in C
Other
800 stars 424 forks source link

Async GET for Observe resource causes core dump #306

Closed leenowell closed 5 years ago

leenowell commented 5 years ago

Hi All,

I modified the example client and server "Hello World" example to make the async GET resource"Observable". The initial GET works fine but then having set the resource as dirty and called coap_check_notify() it core dumps. It looks like the reason is that handler receives a NULL request pointer and the call to coap_register_async calls coap_transaction_id which ultimately derefences the NULL pointer.

I have tried checking for NULL request and artificially creating a request PDU to send to the coap_register_async call. This prevents the core dump but the client then rejects it (I assume because the id is unknown to it). On the server I get the following message "ALRT got RST for message X" and no errors on the client. I have tried the following and none work

  1. storing the ID of the original request (i.e. request->hdr->id) and setting the ID of temp request PDU
  2. setting the ID of the temp request PDU to 0
  3. creating a new ID of for the temp request PDU Using this technique, the client receives the initial response then 12 updates and nothing else received.

The async handler is (sorry not sure why the code won't render correctly)

`static void async_handler(coap_context_t ctx, struct coap_resource_t resource, const coap_endpoint_t local_interface, coap_address_t peer, coap_pdu_t request, str token, coap_pdu_t *response) { if (request == NULL) { ESP_LOGI(TAG, "Request is NULL so creating one ID[%d]", nRequestID);

    request = coap_new_pdu();
    if (request)
    {
        uint8_t     get_method = 1;
        request->hdr->type = COAP_MESSAGE_CON;
        request->hdr->id   = nRequestID; //coap_new_message_id(ctx);
        request->hdr->code = get_method;
        unsigned char buf[3]; 
        coap_add_option(request, COAP_OPTION_OBSERVE, coap_encode_var_bytes(buf, COAP_OBSERVE_ESTABLISH), buf); 

        //coap_add_option(request, COAP_OPTION_URI_PATH, uri.path.length, uri.path.s);
    }
    else
        return;
}
else
    nRequestID = request->hdr->id;

async = coap_register_async(ctx, peer, request, COAP_ASYNC_SEPARATE | COAP_ASYNC_CONFIRM, (void*)"no data");

} `

Is this a bug or am I doing something wrong?

thanks

Lee.

obgm commented 5 years ago

Handlers for observable resources may be called with the request parameter set to NULL. This tells a handler that there is no actual request but the handler has been called because the observed resource has been marked dirty. As a consequence, when you want to register a request for a separate response, you will need to make a copy of the original request and pass it to coap_register_async(). But as notifications are separate responses per definition, you will never have any reason to do that, anyway.

leenowell commented 5 years ago

Thanks @obgm for getting back to me. I understand your comments about the request being NULL and it makes sense to me. Unfortunately, I didn't quite follow the second part of your answer - sorry. Are you saying that on the initial get of an observed resource I need to store the request such that then if the handler is called with a NULL (i.e. the resource has been updated) I can add the saved request to the call? If I am reading the last sentence correctly, do you mean that I shouldn't call coap_register_async() for the notification anyway? If so, what should I do? I guess maybe the bit I am missing is without the request, how does it know where to send the response to. Sorry if I am being dozy!

Do you have an example of this scenario by any chance?

thanks again

Lee.

mrdeep1 commented 5 years ago

The async code is primarily to test out the asynchronous responses inherent within the CoAP protocol.

It is unclear as to what you are expecting to get sent back from the coap-server with an observe triggered response. When the async process completes (delayed by the defined amount of time), it simply sends back a "done" message. For example for coap-client -v9 coap://127.0.0.1/async?3

Mar 04 12:40:31 DEBG *  127.0.0.1:42090 <-> 127.0.0.1:5683 UDP : sent 12 bytes
v:1 t:CON c:GET i:736a {} [ Uri-Path:async, Uri-Query:3 ]
Mar 04 12:40:31 DEBG *  127.0.0.1:42090 <-> 127.0.0.1:5683 UDP : received 4 bytes
v:1 t:ACK c:0.00 i:736a {} [ ]
Mar 04 12:40:34 DEBG *  127.0.0.1:42090 <-> 127.0.0.1:5683 UDP : received 9 bytes
v:1 t:CON c:2.05 i:736a {} [ ] :: 'done'

With the current hnd_get_async() in coap-server function, if called by an observe trigger (happens once per sec with current code, assuming the resource for hnd_get_async() is set as observeable), then as there is not a request from the client per-se, the request variable will be NULL. However, response will have been set up and you would be expected to populate it at a minimum with the token and a response code before returning from the function. Data may be useful to add in, but this data needs to be defined by you based on the current observe triggered response. For an observeable call to hnd_get_async() you should not set up another coap_register_async() as this async has already been registered.

Then at some point in the future (based on the delay count), the "done" message will get sent.

leenowell commented 5 years ago

Thanks for the explanation. I think between the 2 the responses I think I see the gap in my understanding so thanks both for that. I had incorrectly assumed I needed the request to determine where to send the response to and the token. Also, I had assumed once you started with async all responses had to go the same way. I have now updated my get handler to do the following which seems to work (although for some reason the first 4 calls get a RST on the server). First call does the Asych response and then subsequent ones are updates when the resource is marked dirty.

static void
async_handler(coap_context_t *ctx, struct coap_resource_t *resource,
              const coap_endpoint_t *local_interface, coap_address_t *peer,
              coap_pdu_t *request, str *token, coap_pdu_t *response)
{
    if (request == NULL)
    {
        unsigned char buf[3];
        char response_data[50];
        sprintf(response_data, "Hello Obs [%d]", nNumbUpdates);

        response = coap_pdu_init(COAP_MESSAGE_CON, COAP_RESPONSE_CODE(205), 0, COAP_MAX_PDU_SIZE);
        response->hdr->id = coap_new_message_id(ctx);
        coap_add_token(response, sizeof(token), token);
        coap_add_option(response, COAP_OPTION_OBSERVE, coap_encode_var_bytes(buf, COAP_OBSERVE_ESTABLISH), buf); // LEE ADDED
        coap_add_option(response, COAP_OPTION_CONTENT_TYPE, coap_encode_var_bytes(buf, COAP_MEDIATYPE_TEXT_PLAIN), buf);

        ESP_LOGE(TAG,"Responding with [%s] length [%d]", (char*) response_data, strlen(response_data));
        coap_add_data  (response, strlen(response_data)+1, (unsigned char *)response_data);

        if (coap_send(ctx, local_interface, peer, response) == COAP_INVALID_TID) {

        }
    }
    else
        async = coap_register_async(ctx, peer, request, COAP_ASYNC_SEPARATE | COAP_ASYNC_CONFIRM, (void*)"no data");
}

My overall scenario is that I have a number of devices which are sending updates to this device using a different mechanism. What I would like to happen is that this node then sends these updates automatically to the observing clients. My logic was to send the initial GET and return the async ack from the server. Then as the new messages come in, I flag the resource as dirty and then it sends the updates to the observing clients.

Having said this, having now got the hard coded message sent back OK, I am now unclear how I would get the data into the handler to then add to the response message. I could use a queue or something but unclear when to take the message off the queue. So have a few questions

  1. Is there a better way of achieving this?
  2. If I have several observers, will the handler be called once and then send the same message to each observer or will it be called once per observer? If the latter, how will I know when to remove the message from the queue

I have seen some detailed documentation on the web describing the protocol and the API guide for libcoap but was wondering whether there was any other broader documentation containing the sort of information you both have provided above.

Thanks again for your help

Lee.

mrdeep1 commented 5 years ago

I guess that I am confused by the parameters you have defined for async_handler() - it looks like you are not using the latest master / develop / 4.2.0 code.

Any callback handler for a resource request will always have the response PDU already initialized (whether triggered by an actual GET (by handle_request()) or Observe trigger (by coap_notify_observers())) and when the callback returns, the response PDU is either forwarded on or dropped (primarily on whether response->code is set or not for handle_request()).

So, in your case, you should NOT be creating a new PDU and sending it - update response PDU as appropriate.

I do not think using async is appropriate here - primarily set up for testing async delayed responses. I think that you should be modelling things based on hnd_get() in examples/coap-server.c which then returns the updated value to all of the observing clients. man coap_observe(3) may hep you here as well.

leenowell commented 5 years ago

Hi,

Thanks for your reply - very helpful. I am running this on an ESP32 and the version of libcoap which is distributed with it is

git describe --tags
v4.1.1-401-g6468887

When you say the response PDU is already initialised, does that mean that all the options etc. are automatically set and all I need to do is add the data? Also I don't need to explicitly call coap_send()?

I guess for my initial GET I could just return a "no data" response synchronously and then go from there. Digging around a bit I found that I could add attributes to the resource using coap_add_attr() before setting it as dirty. Was thinking that I could use this to send the necessary data to the get handler to enable it to create the response (I will need to know the device ID and the data string sent from the sending device). Is this right technique?

Finally, neither of the following works on my system

man coap_observe -  returns "No manual entry for coap_observe"
man coap_observe(3) - returns "bash: syntax error near unexpected token `(

I believe I have followed the installation instructions using autogen.sh but this was a few weeks ago so may have missed something,

Thanks once again for all your help it has made a big difference to my understanding,

Lee.

mrdeep1 commented 5 years ago

Your code version is old - I suggest that you go with the current develop / master / 4.2.0 code where a lot of things have been addressed that you may stumble into if possible. Your version does not have the man pages - I suggest you go to https://libcoap.net/doc/reference/4.2.0/manpage.html and look there.

Also I don't need to explicitly call coap_send()?

Correct - all this is handled for you by the caller of the get handler callback. You just need, at a minimum, to set response->code.

coap_add_attr() is the wrong thing to use - it is used for describing attributes for the resource which can be looked up by using (for example) coap-client coap://127.0.0.1/.well-known/core.

How you store the date you send off to the observe clients is up to you - it could be a static variable that observer responses pick up and send, or it could be in a static list keyed by the resource name if there are to be multiple variables monitored etc. hnd_get() in examples/coap-server in 4.2.0 or later does a dynamic resource lookup to get the right data to send back to the client. The dynamic resource lists are set up elsewhere in the code.

leenowell commented 5 years ago

Thanks for this, the man pages will be a big help. I will upgrade tonight - hopefully it is compatible with the ESP-IDF I have :)

The overall paradigm I am trying to replicate is sort of like observing a list of items such that if one updates it sends only that update to the observers. In this case, I would either need to send the update to the handler or the unique ID such that the handler can retrieve the data. I thought this scenario would be achieved using the resource name to get the list and then using query syntax to access individual items if needed. Is this incorrect?

Clearly in my specific example there is additional complexity around there is no actual data store to retrieve the data from and the requests come the other way but the paradigm is still the same.

Thanks again

Lee.

mrdeep1 commented 5 years ago

coap_resource_t (4.2.0) has

  /**
   * This pointer is under user control. It can be used to store context for
   * the coap handler.
   */
  void *user_data;

which can be used to hold whatever your changing data is. Then whenever this data is changed, you just need to call coap_resource_notify_observers(your_resource, NULL); and then all the client observers (via your hnd_get() callback function will get updated. Any client that subscribes to observe 'your resource' will get updated.

leenowell commented 5 years ago

Ah perfect thanks... Unfortunately I am struggling to update the libcoap version that I am using as part of the ESP_IDF. Although I have dropped the new version into the same place as the old, for some reason the compiler can no longer see the header files. I have raised a question on their forum to see how to do this.

Thanks once again for your help

Lee.

obgm commented 5 years ago

You may need to adjust the include paths, i.e. -I must point to the directory containing coap2. Most likely, you need to adjust COMPONENT_ADD_INCLUDEDIRS in components/coap/component.mk.

leenowell commented 5 years ago

Thanks for this. Looks like component.mk didn't come with the new version of libcoap so have copied it across from the old one and adjusted the settings to get past the error. I now have got past most of them so hopefully a couple of more tweaks to go.

Thanks

Lee.

obgm commented 5 years ago

Correct: component.mk is part of the ESP-IDF and thus included there, not in libcoap. If you succeed with your adjustments, please contribute as PR to the ESP-IDF.

leenowell commented 5 years ago

Will do. Will take a look tonight.

leenowell commented 5 years ago

So have tried to get this working and am struggling. I took a clean version of the library, extracted it into the component/coap directory and renamed it to libcoap. I then did the following

./autogen.sh
./configure  --disable-documentation --enable-examples --disable-dtls –enable-shared
make

I had to remove the –enable-tests option as it gave errors about CUnit not found and I couldn't resolve it.

I think I have done the config wrong as I get a variety of issues so wonder whether I should be specifying the build parameter or cross compile maybe/.

  1. it is looking for syslog .h so I commented out the define in config.h
  2. Including the header files from coap2 async.c fails as debug.h is not found
  3. Including the header files from coap I get the following error message /opt/esp/esp-idf/components/coap/libcoap/include/coap/address.h:102:19: error: unknown type name 'coap_address_t' coap_address_init(coap_address_t *addr) { Which to be honest I don't understand as coap_address_t is typedef'd at the top of the file.

I assume there is some conflict between the lib being built on Ubuntu and the project that uses is build for ESP32?

obgm commented 5 years ago

Which version of libcoap do you use? There should not be any config.h or debug.h.

Which definition of coap_address_t do you think is being used? Your configure invocation suggests that you are building for POSIX but for ESP32 you will most likely want to have the LWiP port. You might want to take a look into examples/lwip/Makefile for a working LWiP configuration.

leenowell commented 5 years ago

So I went to code, selected develop, picked release-4.2.0 and downloaded the zip file. I then extracted that into the components directory that is part of the IDF structure and then did the above. So... I believe 4.2.0 but maybe I did something wrong? I wonder if there are legacy files from the old version somewhere although I did try an auto-generated --clean.

For my situation, I was wondering whether I should be setting the build / cross compile options on the configure to tell it that I don't want the code generated for the host OS but for ESP - the fact that the #define to use syslog was pointing me in this line of thinking

obgm commented 5 years ago

Sorry, I just noted that the esp-idf sets WITH_POSIX and brings its own socket adaptation. So, it might be possible to go for the POSIX variant. Anyway, you do not have to (and should not run) configure because everything you need to configure is already provided in components/coap/port (you may need to add some things, though).

leenowell commented 5 years ago

Ah ok will give it a try without config. From memory though before I run it I believe some of the header files didnt exist and we .h.in or something. The port directory was in the old version but not the new one. Does that mean I should copy that across too? Sorry not at my pc to check properly.

obgm commented 5 years ago

Yes, you will need to copy/keep/adjust from components/coap the following files and directories:

CMakeLists.txt
component.mk
Makefile.projbuild
port

And then clone libcoap as subdirectory libcoap there as well.

In addition to COMPONENT_ADD_INCLUDEDIRS you will have to adjust COMPONENTS_OBJS in component.mk. (And you should hopefully be able to remove the flags that suppress the compiler warnings.)

leenowell commented 5 years ago

Hi,

I have updated as you suggested but it seems like initially when I extract everything and do autogen the include files are in include/coap2 and notably no debug.h. I then try to make my coap server project and somehow it seems to generate a bunch of files in include/coap and notably a debug.h file. It seems that some of the libcoap header and source files #include debug.h.

File list in coap is


address.h  coap.h.in    encode.h     mem.h     prng.h       uri.h
async.h    coap_io.h    hashkey.h    net.h     resource.h   uthash.h
bits.h     coap_time.h  libcoap.h    option.h  str.h        utlist.h
block.h    debug.h      lwippools.h  pdu.h     subscribe.h

File list in coap2 is

address.h     coap_dtls.h     coap_io.h       lwippools.h  prng.h       uthash.h
async.h       coap_event.h    coap_session.h  mem.h        resource.h   utlist.h
bits.h        coap_hashkey.h  coap_time.h     net.h        str.h
block.h       coap.h.in       encode.h        option.h     subscribe.h
coap_debug.h  coap.h.windows  libcoap.h       pdu.h        uri.h

This is the output of "diff --brief" on the coap vs coap2 directory. Seems like a bunch of files are very different between the 2.

Files coap/address.h and coap2/address.h differ
Files coap/async.h and coap2/async.h differ
Files coap/bits.h and coap2/bits.h differ
Files coap/block.h and coap2/block.h differ
Only in coap2: coap_debug.h
Only in coap2: coap_dtls.h
Only in coap2: coap_event.h
Only in coap2: coap_hashkey.h
Files coap/coap.h.in and coap2/coap.h.in differ
Only in coap2: coap.h.windows
Files coap/coap_io.h and coap2/coap_io.h differ
Only in coap2: coap_session.h
Files coap/coap_time.h and coap2/coap_time.h differ
Only in coap: debug.h
Files coap/encode.h and coap2/encode.h differ
Only in coap: hashkey.h
Files coap/libcoap.h and coap2/libcoap.h differ
Files coap/lwippools.h and coap2/lwippools.h differ
Files coap/mem.h and coap2/mem.h differ
Files coap/net.h and coap2/net.h differ
Files coap/option.h and coap2/option.h differ
Files coap/pdu.h and coap2/pdu.h differ
Files coap/prng.h and coap2/prng.h differ
Files coap/resource.h and coap2/resource.h differ
Files coap/str.h and coap2/str.h differ
Files coap/subscribe.h and coap2/subscribe.h differ
Files coap/uri.h and coap2/uri.h differ
Files coap/uthash.h and coap2/uthash.h differ
Files coap/utlist.h and coap2/utlist.h differ

Adding just coap2 gives compile errors looking for debug.h. adding both coap2 and coap gives a load of conflicts. I assume we are not expecting the coap directory to be generated? Does debug.h get generated during the compile normally?

Definitely one step closer

Thanks

Lee.

obgm commented 5 years ago

No. Just remove anything from your old libcoap version that comes with the esp-idf and clone libcoap-4.2.0 as described before. Do not autogenerate anything. The files you need are already there. If some file wants to include debug.h from libcoap, just fix that include directive (try including coap_debug.h instead). There is no debug.h any more in libcoap.

mrdeep1 commented 5 years ago

I had a quick look last night. The issue (in part) is down to the git submodule logic in esp-idf pulling in the old version of libcoap and hence the ongoing confusion.

I am looking at how easy it is to bring esp-idf up to the 4.2.0 standard - I could have something ready by Monday. The biggest change will be to port/coap_io_socket.c, as well as /examples/protocols/coap_clent and /examples/protocols/coap_server.

obgm commented 5 years ago

In the long run this would require mbedTLS integration for libcoap :-)

mrdeep1 commented 5 years ago

.. a step at a time !

leenowell commented 5 years ago

Ah thanks.... I have some time tonight to look at this so if there is anything you want me to check / do / validate, please let me know.

mrdeep1 commented 5 years ago

@leenowell If you go to ~/esp/esp-idf/components/coap/libcoap and then do git checkout release-4.2.0 , that should get you the correct version of libcoap in your build tree. Then in ~/esp/esp-idf, install this patch esp-idf.txt on a clean environment as patch -p1 < esp-idf.txt which then enables libcoap to be built.

esp-idf.txt

I have build coap_client and coap_server esp-idf examples, but have not been able to run the code on my build system.

If this works for you, I can then create a PR for esp-idf.

leenowell commented 5 years ago

Unfortunately, I fell at the first hurdle. git checkout release-4.2.0 fails with

error: pathspec 'release-4.2.0' did not match any file(s) known to git.

I tried git checkout -b release-4.2.0 obgm/libcoap

and got

4.2.0 obgm/libcoap
fatal: Cannot update paths and switch to branch 'release-4.2.0' at the same time.
Did you intend to checkout 'components/coap/libcoap/obgm/libcoap' which can not be resolved as commit?

After a bit of googling I then tried this


lee@leelaptop:/opt/esp/esp-idf/components/coap/libcoap$ git checkout -b release-4.2.0
D   components/coap/CMakeLists.txt
D   components/coap/Makefile.projbuild
D   components/coap/component.mk
D   components/coap/port/coap_io_socket.c
D   components/coap/port/include/coap/coap.h
D   components/coap/port/include/coap_config.h
D   components/coap/port/include/coap_config_posix.h
Switched to a new branch 'release-4.2.0'
lee@leelaptop:/opt/esp/esp-idf/components/coap/libcoap$ ls
lee@leelaptop:/opt/esp/esp-idf/components/coap/libcoap$ git checkout release-4.2.0
D   components/coap/CMakeLists.txt
D   components/coap/Makefile.projbuild
D   components/coap/component.mk
D   components/coap/port/coap_io_socket.c
D   components/coap/port/include/coap/coap.h
D   components/coap/port/include/coap_config.h
D   components/coap/port/include/coap_config_posix.h
Already on 'release-4.2.0'
lee@leelaptop:/opt/esp/esp-idf/components/coap/libcoap$ ls
lee@leelaptop:/opt/esp/esp-idf/components/coap/libcoap$ git fetch
remote: Enumerating objects: 2534, done.
remote: Counting objects: 100% (2534/2534), done.
remote: Compressing objects: 100% (111/111), done.
remote: Total 4104 (delta 2437), reused 2500 (delta 2413), pack-reused 1570
Receiving objects: 100% (4104/4104), 1.97 MiB | 885.00 KiB/s, done.
Resolving deltas: 100% (3111/3111), completed with 967 local objects.
From https://github.com/espressif/esp-idf
 * [new branch]      bd8733f7   -> origin/bd8733f7
 * [new branch]      d96f6d6b   -> origin/d96f6d6b
   a62cbfe..bba89e1  master     -> origin/master
   3fc3282..1b1053c  release/v3.0 -> origin/release/v3.0
   7fe18ef..cea310d  release/v3.1 -> origin/release/v3.1
   bed50a9..a7dc804  release/v3.2 -> origin/release/v3.2
 * [new branch]      release/v3.3 -> origin/release/v3.3
 * [new tag]         v3.3-beta2 -> v3.3-beta2
 * [new tag]         v3.1.3     -> v3.1.3
 * [new tag]         v3.2-beta3 -> v3.2-beta3
Fetching submodule components/bt/lib
remote: Enumerating objects: 33, done.
remote: Counting objects: 100% (33/33), done.
remote: Compressing objects: 100% (21/21), done.
remote: Total 33 (delta 22), reused 23 (delta 12), pack-reused 0
Unpacking objects: 100% (33/33), done.
From https://github.com/espressif/esp32-bt-lib
   06c3f28..48ecf82  master     -> origin/master
   bc66c9d..20feea1  release/v3.1 -> origin/release/v3.1
   f718106..1f6837b  release/v3.2 -> origin/release/v3.2
Fetching submodule components/esp32/lib
remote: Enumerating objects: 273, done.
remote: Counting objects: 100% (273/273), done.
remote: Compressing objects: 100% (11/11), done.
remote: Total 353 (delta 266), reused 267 (delta 262), pack-reused 80
Receiving objects: 100% (353/353), 2.03 MiB | 1.07 MiB/s, done.
Resolving deltas: 100% (302/302), completed with 31 local objects.
From https://github.com/espressif/esp32-wifi-lib
   4123071..61530b0  master     -> origin/master
   3a6449c..bcb6ae7  release/v3.0 -> origin/release/v3.0
   f5ce277..21ffb68  release/v3.1 -> origin/release/v3.1
   ec07b86..4a4b808  release/v3.2 -> origin/release/v3.2

Still nothing in the directory.... wonder if I have now goosed my environment :)

Any ideas?

thanks Lee.

mrdeep1 commented 5 years ago

Hmm. My mistake in that the libcoap version that you previously pulled had not been updated with branch release-4.2.0 - hence failure. git pull origin develop should have been done first. Assuming that all your testing code has been done outside of /opt/esp/esp-idf (you did something like cd /opt/esp ; cp -r $IDF_PATH/examples/protocols/coap_server then it may be best to just re-install the esp-idf git repository and then start again leaving xtensa-esp32-elf/ in place).

cd /opt/esp
rm -rf esp-idf
git clone --recursive https://github.com/espressif/esp-idf.git
cd /opt/esp/esp-idf/components/coap/libcoap/
git checkout libcoap-4.2.0
cd /opt/esp/esp-idf/
patch -p1 < <location>/esp-idf.txt             # replace <location> as appropriate
cd /opt/esp/coap_server                      # assuming this is the directory where you copied stuff
make

Alternatively you can try to recover things. The D entries certainly need to be restored as they need to be there. Then you need to revert back the branch release-4.2.0 that you created at the /opt/esp/esp-idf/components/coap/libcoap level. Then you need to restore the now missing libcoap submodule

leenowell commented 5 years ago

Aha - no worries :)

TBH - I suspect I originally just downloaded the esp-idf zip file from git originally so at least going down the re-install route will have some advantages and I will learn how to use git too. Having some trouble with the clone at the moment where it is timing out part way. Will try and sort it and update once done

leenowell commented 5 years ago

OK - finally got it to clone.

Unfortunately git checkout libcoap-4.2.0 failed with

error: pathspec 'libcoap-4.2.0' did not match any file(s) known to git.

I then tried the original command git checkout release-4.2.0 which seemed to do something


Previous HEAD position was 6468887... .gitignore: do not ignore custom files in m4
Branch release-4.2.0 set up to track remote branch release-4.2.0 from origin.
Switched to a new branch 'release-4.2.0'

It didn't seem to download anything so not sure this has done what I expected

leenowell commented 5 years ago

Assumed the patching was independent of the libcoap update, I tried it and got the following

lee@leelaptop:/opt/esp/esp-idf$ patch -p1 < ~/tmp/esp-idf.txt 
patching file components/coap/CMakeLists.txt
patching file components/coap/component.mk
File components/coap/libcoap is not a regular file -- refusing to patch
1 out of 1 hunk ignored -- saving rejects to file components/coap/libcoap.rej
patching file components/coap/port/include/coap/coap.h
patching file components/coap/port/include/coap_config_posix.h
patching file examples/protocols/coap_client/main/coap_client_example_main.c
patching file examples/protocols/coap_server/main/coap_server_example_main.c
mrdeep1 commented 5 years ago

Oops - it is getting late. Yes, release-4.2.0 is correct. The refusing to patch is OK.

All you need to do now is a test build of, say, coap_server.

leenowell commented 5 years ago

Yipppeee... Looks like it worked. I have recompiled my coap server code and now have a variety of errors which look like differences between 4.1.1 and 4.2.0. I also tried building the updated examples/protocols/coap-server and it compiled with no issues.

Given the checkout of 4.2.0 seemed to return instantly, is there a way that I can be sure it did download 4.2.0 and I have the right version?

thanks again for all your help. If you need me to run any further tests/ contribute etc. pls shout :)

mrdeep1 commented 5 years ago

In /opt/esp/esp-idf/components/coap/libcoap/, do git branch and look for the entry with a * in front of it. git diff 175a73765d4238298acdb0da427ca2742c8664c3 should have no differences.

Yes, you will need to update your coap_server because of the API changes

mrdeep1 commented 5 years ago

If you could test out if the generic coap_server and coap_client work, that would be great as I have no way of testing the executables.

leenowell commented 5 years ago

So... git branch has release-4.2.0 which looks good. The diff has a load of differences - I have pasted the first screen full below. Will give the client and server a test and let you know.

iff --git a/Makefile.am b/Makefile.am
index ed52316..165e5bd 100644
--- a/Makefile.am
+++ b/Makefile.am
@@ -233,7 +233,7 @@ distclean-local:
        @rm -f src/*.o src/*.lo
        rm -f libcoap-$(LIBCOAP_API_VERSION).map
        rm -f libcoap-$(LIBCOAP_API_VERSION).sym
-       rm -f libcoap-$(LIBCOAP_NAME_SUFFIX).pc
+       rm -f libcoap-$(LIBCOAP_API_VERSION).pc
        @echo
        @echo "     ---> Please note the following important advice! <---"
        @echo "     The files libcoap-$(LIBCOAP_API_VERSION).{map,sym} are remo$
diff --git a/examples/client.c b/examples/client.c
index 4d3e4c7..8c560e4 100644
--- a/examples/client.c
+++ b/examples/client.c
@@ -1236,9 +1236,7 @@ main(int argc, char **argv) {
   unsigned char user[MAX_USER + 1], key[MAX_KEY];
   ssize_t user_length = 0, key_length = 0;
   int create_uri_opts = 1;
-#ifndef _WIN32
   struct sigaction sa;
-#endif

   while ((opt = getopt(argc, argv, "NrUa:b:c:e:f:k:m:p:s:t:o:v:A:B:C:O:P:R:T:u$
     switch (opt) {
@@ -1460,16 +1458,12 @@ main(int argc, char **argv) {
   wait_ms = wait_seconds * 1000;
   coap_log(LOG_DEBUG, "timeout is set to %u seconds\n", wait_seconds);

-#ifdef _WIN32
-  signal(SIGINT, handle_sigint);
-#else
   memset (&sa, 0, sizeof(sa));
   sigemptyset(&sa.sa_mask);
   sa.sa_handler = handle_sigint;
   sa.sa_flags = 0;
   sigaction (SIGINT, &sa, NULL);
   sigaction (SIGTERM, &sa, NULL);
-#endif

   while (!quit && !(ready && coap_can_exit(ctx)) ) {
mrdeep1 commented 5 years ago

Grr - working on too many windows at once. git diff release-4.2.0 should be clean.

leenowell commented 5 years ago

OK after a few errors my side I have tested the example client and server. Server seems to be fine and connects to the AP ok. The client fails with the following in the log

I (1505) wifi: state: auth -> assoc (0)

I (1505) wifi: state: assoc -> run (10)

I (1555) wifi: connected with Lounge, channel 1, bssid = 00:f2:01:18:c9:c0

I (1555) wifi: pm start, type: 1

I (2215) event: sta ip: 192.168.1.232, mask: 255.255.255.0, gw: 192.168.1.1

I (2215) CoAP_client: Connected to AP

I (2215) CoAP_client: DNS lookup succeeded. IP=192.168.1.208

assertion "proto != COAP_PROTO_NONE" failed: file "/opt/esp/esp-idf/components/coap/libcoap/src/coap_session.c", line 559, function: coap_session_create_client

Checked git diff release-4.2.0 and it is indeed clean :)

leenowell commented 5 years ago

I have had a quick look at this and it seems that the split uri function in the example takes the Uri string ("coap://192 .168.1.208/Espressif") and sets proto to COAP_PROTO_NONE hence why the assertion fails. Maybe I have set the Uri to call incorrectly? I assume it should be set to TCP?

mrdeep1 commented 5 years ago

This is a better coap_client file and should correctly work with your URI change.

coap_client_example_main.txt

(rename coap_client_example_main.txt to coap_client_example_main.c !)

mrdeep1 commented 5 years ago

I have posted a PR to https://github.com/espressif/esp-idf/pull/3148 .

To get a clone of this, you will need to do a

git clone --recursive https://github.com/mrdeep1/esp-idf.git
git checkout libcoap-4.2.0

I have updated coap_server_example_main.c as well (supporting Observe)

leenowell commented 5 years ago

I have tried your updated client on the original clone (i.e. not your PR one above) and the client seems to be fine but the server immediately core dumps - not output after "connected to AP". I need to get some tooling running before I can tell where it is.. The reason is an "illegal instruction".

Do update to your PR. Do I run the above in the libcoap directory?

mrdeep1 commented 5 years ago

Do update to your PR. Do I run the above in the libcoap directory?

No, you need to do this as per

cd /opt/esp
rm -rf esp-idf
git clone --recursive https://github.com/mrdeep1/esp-idf.git
cd /opt/esp/esp-idf
git checkout libcoap-4.2.0
cd /opt/esp/esp-idf/components/coap/libcoap/
git checkout v4.2.0

Then if I make any changes to that branch, you just need to a

cd /opt/esp/esp-idf
git pull

to get them.

I would try using the new server code in the first instance at https://github.com/espressif/esp-idf/blob/bdd317fc6b9bb700692f9df7f279547293fa5184/examples/protocols/coap_server/main/coap_server_example_main.c . That said, I did not find any issues when running the code stripped of all the ESP stuff in a regular libcoap environment.

leenowell commented 5 years ago

Hi,

UPDATE - posted a couple of replies above but having spent most of the day trying to fix it I realised it was my stupidity - I had missed off the last step of upgrading libcoap!!! So have now deleted the comments :). Examples/Protocols/coap_server and coap_client build fine - sorry!

So have moved my attention to updating my code to suit the changes in 4.2.0 (as best I can given the current challenges). Looking at your updated server example, I think the improvements are great. The code is much simpler and easy to follow :). A couple of suggestions

  1. Would be good to add how to mark the resource as dirty and trigger the update to send e.g. in the PUT/ DELETE
  2. Any chance you could add the async example back in?

Comparing the old v's new, I had a few clarification questions - feel free to point me at some documents if easier - I tried the man page link you sent but couldn't find the answers

  1. Is the way to do handle a GET now to call coap_add_data_blocked_response?
  2. The original example added NON_BLOCK flag to the file descriptor is this now not required?
  3. You make a call to coap_new_endpoint before coap_resource_init. Is this now needed in the latest version?
  4. I see a call to coap_run_once in a continuous while loop. The name would imply it should be called only once?
  5. In the continuous while loop in the orginal example (actually for loop in original) it did a "select" on the FD list, worried about which file descriptor the data came in on and then did a coap_read to get the incoming request. Is none of this required any more?
mrdeep1 commented 5 years ago

Things in order of raising coap_io_socket.c is no longer required - hence delete

Keep updates on this PR for now.

I missed the following 2 lines (original comment updated) to get the correct version of libcoap

cd /opt/esp/esp-idf/components/coap/libcoap/
git checkout v4.2.0

Would be good to add how to mark the resource as dirty and trigger the update to send e.g. in the PUT/ DELETE

Done by coap_resource_notify_observers(resource, NULL); in the respective handler functions

Any chance you could add the async example back in?

Async really is not the way to resolve your requirement - its purpose is to test the logic of handling the delay of the actual response (ACK and some time later the actual response). So I am not convinced it should there anyway.

Is the way to do handle a GET now to call coap_add_data_blocked_response?

This is a good way to do it as it automatically adds in the CoAP BLOCK2 option if needed.

The original example added NON_BLOCK flag to the file descriptor is this now not required?

You no longer need to handle file descriptors - let the CoAP library do all that for you under the hood.

You make a call to coap_new_endpoint before coap_resource_init. Is this now needed in the latest version?

The call to coap_new_endpoint () sets up the necessary file descriptors for you (that libcoap tracks) and is required for a server.

I see a call to coap_run_once in a continuous while loop. The name would imply it should be called only once?

I did not choose the name - but it is a wrapper for coap_write(), select(), coap_read().

In the continuous while loop in the original example (actually for loop in original) it did a "select" on the FD list, worried about which file descriptor the data came in on and then did a coap_read to get the incoming request. Is none of this required any more?

Correct - the coap_run_once() does all of this for you - no need to worry about file descriptors etc.

leenowell commented 5 years ago

Lol re: updating the comment. I was convinced I had missed it the first time :)

Thanks for all your responses above - looks like this version is much easier to work with :)

I have now built the client and server and run some tests for you. The server unfortunately aborts. I have added some logging and it looks like it aborts in the coap_run_once call. It is not consistent how many times around the loop it takes to fail. Sometimes the first time around and others up to 3 times.

This is the log. The last section seems to repeat a load of times. The "calling coap_run_once" is my log and I have another one just afterwards which you don't see

[0;32mI (2734) CoAP_server: Calling run once

Guru Meditation Error: Core  0 panic'ed (InstrFetchProhibited). Exception was unhandled.

Core 0 register dump:

PC      : 0x3ffc5e70  PS      : 0x00060a30  A0      : 0x80137710  A1      : 0x3ffc5cf0  

A2      : 0x3ffbac38  A3      : 0x3ffc5e70  A4      : 0x00000000  A5      : 0x00000000  

A6      : 0x00000069  A7      : 0x00000066  A8      : 0x8012b2aa  A9      : 0x3ffc5cd0  

A10     : 0x3ffba56c  A11     : 0x00000002  A12     : 0x00000001  A13     : 0x3ffb80fc  

A14     : 0x00000037  A15     : 0x3ffb9e98  SAR     : 0x00000010  EXCCAUSE: 0x00000014  

EXCVADDR: 0x3ffc5e70  LBEG    : 0x4000c349  LEND    : 0x4000c36b  LCOUNT  : 0xffffffff  

ELF file SHA256: ets Jun  8 2016 00:22:57

rst:0x7 (TG0WDT_SYS_RESET),boot:0x13 (SPI_FAST_FLASH_BOOT)

configsip: 0, SPIWP:0xee

clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00

mode:DIO, clock div:2

load:0x3fff0018,len:4

load:0x3fff001c,len:6296

load:0x40078000,len:11308

load:0x40080400,len:6680

entry 0x40080760

W (58) boot: PRO CPU has been reset by WDT.

W (58) boot: WDT reset info: PRO CPU PC=0x40080ecc

W (58) boot: WDT reset info: APPÜð™wâ%ç xðT®Skœ…K­).ÇÚùÀ€!Fatal exception (3): LoadStoreError

epc1=0x40008044, epc2=0x00000000, epc3=0x00000000, excvaddr=0x40141ec6, depc=0x00000000
leenowell commented 5 years ago

Ok I finally figured out how to get some further debug information :)

The following is the call stack when abort is called. The line number in the example server file won't match yours as I have added a load of logging to narrow down the issue but it refers to the coap_run_once line. I have the gdb stub loaded so can do limited checking if you need me to

0x40086ee0: invoke_abort at /opt/esp/esp-idf/components/esp32/panic.c:715

0x40087135: abort at /opt/esp/esp-idf/components/esp32/panic.c:715

0x4008e646: xTaskPriorityDisinherit at /opt/esp/esp-idf/components/freertos/tasks.c:5093

0x4008c486: prvCopyDataToQueue at /opt/esp/esp-idf/components/freertos/queue.c:2038

0x4008c7cc: xQueueGenericSend at /opt/esp/esp-idf/components/freertos/queue.c:2038

0x40137c7a: sys_mutex_unlock at /opt/esp/esp-idf/components/lwip/port/esp32/freertos/sys_arch.c:444

0x40137f7a: sys_arch_unprotect at /opt/esp/esp-idf/components/lwip/port/esp32/freertos/sys_arch.c:483

0x4012bcd3: pbuf_free at /opt/esp/esp-idf/components/lwip/lwip/src/core/pbuf.c:1331

0x40138169: netbuf_delete at /opt/esp/esp-idf/components/lwip/lwip/src/api/netbuf.c:85

0x40129b06: lwip_recvfrom at /opt/esp/esp-idf/components/lwip/lwip/src/api/sockets.c:3444

0x40129be5: lwip_recvmsg at /opt/esp/esp-idf/components/lwip/lwip/src/api/sockets.c:3444

0x4012a9ad: lwip_recvmsg_r at /opt/esp/esp-idf/components/lwip/lwip/src/api/sockets.c:3458 (discriminator 4)

0x400e9671: recvmsg at /opt/esp/esp-idf/components/lwip/lwip/src/include/lwip/sockets.h:581
 (inlined by) coap_network_read at /opt/esp/esp-idf/components/coap/libcoap/src/coap_io.c:997

0x400eccf4: coap_read_endpoint at /opt/esp/esp-idf/components/coap/libcoap/src/net.c:185

0x400ed319: coap_read at /opt/esp/esp-idf/components/coap/libcoap/src/net.c:185

0x400e99e7: coap_run_once at /opt/esp/esp-idf/components/coap/libcoap/src/coap_io.c:1350

0x400d3c8b: coap_example_thread at /opt/esp/esp-idf/examples/protocols/coap_server/main/coap_server_example_main.c:187 (discriminator 9)

0x4008cd69: vPortTaskWrapper at /opt/esp/esp-idf/components/freertos/port.c:403
mrdeep1 commented 5 years ago

It is worth adding in coap_set_log_level() to get some extra debugging information.

I assume that you are not playing with any file descriptors - the coap_server code is unchanged apart from adding in extra logging information - correct?

diff --git a/examples/protocols/coap_server/main/coap_server_example_main.c b/examples/protoco
index 1dc75cc..849a956 100644
--- a/examples/protocols/coap_server/main/coap_server_example_main.c
+++ b/examples/protocols/coap_server/main/coap_server_example_main.c
@@ -137,6 +137,7 @@ static void coap_example_thread(void *p)
     espressif_data_len = strlen(espressif_data);
     coap_set_log_handler(logging_handler);
     coap_set_show_pdu_output(0);
+    coap_set_log_level(9);
     while (1) {
         coap_endpoint_t *ep_udp = NULL;
         unsigned wait_ms;
leenowell commented 5 years ago

Yes - just running the standard coap_server from your repository with only my logging added. Just to be sure, I need to add coap_set_log_level(9); just before the while(1) loop?

If so... not great news :( The problem has now moved. It calls the coap_new_endpoint and then core dumps with the following stack. Wonder if this is a different error and the logging itself is causing it log.c is in the call stack....


I (2209) CoAP_server: Connected to AP
I (2209) CoAP_server: About to call coap_new_endpoint
Guru Meditation Error: Core  0 panic'ed (LoadProhibited). Exception was unhandled.
Core 0 register dump:
PC      : 0x4008cbfe  PS      : 0x00060c30  A0      : 0x80082bec  A1      : 0x3ffc67d0  
0x4008cbfe: xQueueTakeMutexRecursive at /opt/esp/esp-idf/components/freertos/queue.c:2038

A2      : 0x0000000c  A3      : 0xffffffff  A4      : 0x00000001  A5      : 0x00000001  
A6      : 0x00060c23  A7      : 0x00000002  A8      : 0x8008ce88  A9      : 0x3ffc67c0  
A10     : 0x00000003  A11     : 0x00060c23  A12     : 0x00060c20  A13     : 0x00000010  
A14     : 0x00000001  A15     : 0x3ffc6144  SAR     : 0x00000004  EXCCAUSE: 0x0000001c  
EXCVADDR: 0x00000010  LBEG    : 0x400014fd  LEND    : 0x4000150d  LCOUNT  : 0xffffffff  

ELF file SHA256: 067667103b6ccad4bd96fefd18766a38aa7e707d7648854072e592bdb610578e

Backtrace: 0x4008cbfe:0x3ffc67d0 0x40082be9:0x3ffc67f0 0x40082cd9:0x3ffc6820 0x400da8e6:0x3ffc6840 0x400ddb41:0x3ffc6b50 0x400829d9:0x3ffc6b80 0x400d3bc2:0x3ffc6bd0 0x4008cd69:0x3ffc6c20
0x4008cbfe: xQueueTakeMutexRecursive at /opt/esp/esp-idf/components/freertos/queue.c:2038

0x40082be9: lock_acquire_generic at /opt/esp/esp-idf/components/newlib/locks.c:157

0x40082cd9: _lock_acquire_recursive at /opt/esp/esp-idf/components/newlib/locks.c:171

0x400da8e6: _vfprintf_r at /Users/ivan/e/newlib_xtensa-2.2.0-bin/newlib_xtensa-2.2.0/xtensa-esp32-elf/newlib/libc/stdio/../../../.././newlib/libc/stdio/vfprintf.c:860 (discriminator 2)

0x400ddb41: vprintf at /Users/ivan/e/newlib_xtensa-2.2.0-bin/newlib_xtensa-2.2.0/xtensa-esp32-elf/newlib/libc/stdio/../../../.././newlib/libc/stdio/vprintf.c:39

0x400829d9: esp_log_write at /opt/esp/esp-idf/components/log/log.c:121

0x400d3bc2: coap_example_thread at /opt/esp/esp-idf/examples/protocols/coap_server/main/coap_server_example_main.c:157

0x4008cd69: vPortTaskWrapper at /opt/esp/esp-idf/components/freertos/port.c:403