Closed anyc closed 5 months ago
Interesting issue. Glad that it has been raised. coap_free_context()
has been called which means that the coap_context_t
is now in an unstable state when any app call-backs are invoked.
One of the first things coap_free_context()
does is to free off all the resources - and when an 'observed' resource is freed off, the final observe notification is sent out. However, you will still get triggered by an event and so your close_session() will continue to get called.
In your case, the call to coap_resource_notify_observers()
should silently fail as the coap_context_t
is unstable.
See #1404 for a fix. Please confirm that this works as expected.
It shuts down without an error now, yes. However, valgrind tells me:
==4947== Invalid read of size 1
==4947== at 0x48DFAFC: coap_resource_notify_observers (coap_resource.c:1195)
==4947== by 0x10DB3B: close_session (main.c:996)
==4947== by 0x10E0B3: coap_event_handler (main.c:1171)
==4947== by 0x48E2453: coap_free_endpoint (coap_session.c:1861)
==4947== by 0x48CEE37: coap_free_context_locked (coap_net.c:682)
==4947== by 0x48CEE37: coap_free_context_locked (coap_net.c:639)
==4947== Address 0x51db078 is 0 bytes inside a block of size 100 free'd
==4947== at 0x4867EAC: free (vg_replace_malloc.c:872)
==4947== by 0x48E0623: coap_delete_all_resources (coap_resource.c:591)
==4947== by 0x48CEDCF: coap_free_context_locked (coap_net.c:646)
==4947== by 0x48CEDCF: coap_free_context_locked (coap_net.c:639)
==4947== Block was alloc'd at
==4947== at 0x4864B70: malloc (vg_replace_malloc.c:381)
==4947== by 0x48DDC37: coap_resource_init (coap_resource.c:269)
Ok, gets interesting. coap_delete_all_resources()
has been called, so any resource is no more. You are then calling coap_resource_notify_observers()
with a resource parameter pointing to a piece of de-allocated memory.
I am thinking of going in the direction of no call-back handlers are called after coap_free_context()
has been called. This does however mean that things like close_session()
will no longer be called at this point - and so some custom clean-up will not occur.
It is possible to close down server sessions before calling coap_free_context() which will then triggers the appropriate call-back handles. It is a bit messy but this does the trick. Normally server sessions are left in a pool so they can be re-used and can't just be freed off.
coap_session_set_type_client(session);
coap_session_release(session);
Hm, could we just call coap_delete_all_resources() after coap_free_endpoint()?
Possibly, not sure what the law of unintended consequences will also throw up!
Bottom line is any 'thing' that was associated with coap_context_t
after that 'thing' gets released is problematic.
I am going in the direction that after coap-free_context()
is called, no call-backs are invoked (which means just clearing out all the handler pointers early on in coap_free_context()
) as they inherently can be using unsafe information. I will have a play with this.
Whenever coap_event_handler()
is called with COAP_EVENT_SERVER_SESSION_DEL
which invokes close_session()
in your case, the session is not associated with any observing resources. So, coap_resource_notify_observers()
makes no sense at this point and is not needed.
As a note, whenever an observe request is added to a session, the session reference count is bumped. When the observe is removed from the session, the session reference is decremented. COAP_EVENT_SERVER_SESSION_DEL can only occur when the reference count is 0. coap_free_context()
frees off all the resources (if resource is being observed, a final notify is sent out) which decrements any associated sessions.
So, coap_free_context()
should not disable any call-backs, enabling the call-backs to continue to do any custom cleanups.
@anyc With the removal of the coap_resource_notify_observers()
call when handling the event COAP_EVENT_SERVER_SESSION_DEL
(as it will never do anything for the current session as the reference count is 0 - hence not observing any resource), are there any other application shutdown issues associated with COAP_EVENT_SERVER_SESSION_DEL
you are having with the latest develop branch, or can this Issue now be closed?
Sorry, I am on vacation this week. I am not sure if I understand your reasoning but in my case, notify_observers() does not only notify the peer of the to-be-closed session but possibly also other peers so I cannot remove it from my event handler. To explain a little bit, a peer can acquire a mutex that protects multiple resources. The resources only accept new data if the mutex is locked by the same session and the mutex state is accessible over its own resource. If the session is closed, the lock is automatically released which shall be signaled to other sessions.
Then I think the best thing to do is make all your app tracking resource pointers (the ones you are calling coap_resource_notify_observers()
with) to NULL just before you call coap_free_context()
.
Then, on (the coap_free_context()
) COAP_EVENT_SERVER_SESSION_DEL, there should not be any issues. However, coap_resource_notify_observers()
currently does not like a NULL resource pointer.
EDIT: coap_free_context()
frees off all the resources - if a resource is being observed, all subscribers will get notified.
As this happens during shutdown of my server, I just check now if there is a shutdown in progress and only if not, I send notifications. The process shuts down cleanly now. Thank you!
Hello,
with the current develop branch, my server gets aborted sometimes due to a failed assert during application shutdown:
If a session closes, a resource (that can be observed) changes its value and my SERVER_SESSION_DEL callback will call coap_resource_notify_observers(). If this happens during application shutdown, the assert fails. Is this something my application should avoid during shutdown or is this something libcoap itself should be able to handle?
Thank you!