jeremyczhen / fdbus

FDBus - Fast Distributed Bus
https://blog.csdn.net/jeremy_cz/article/details/89060291
161 stars 85 forks source link

core dump(CBaseEndpoint::deleteSocket) #21

Closed lijinxing-source closed 3 years ago

lijinxing-source commented 3 years ago

I made preliminary analysis for the source code of CBaseEndpoint class. found a problem, i am not sure. maybe it is my false understanding. I find there is only call "insertEntry" in "CBaseEndpoint::addSocket" , but no call "deleteEntry" in "CBaseEndpoint::deleteSocket". which may cause the "container" never empty. and the following source code could be called for many times: void CBaseEndpoint::deleteSocket(FdbSocketId_t skid) { ... while (!containers.empty()) { EntryContainer_t::iterator it = containers.begin(); delete it->second; // also deleted from socket container } ... }

Finally, when execute the " delete it->second;" for the second time, the core dump will happen. right?

jeremyczhen commented 3 years ago

Deleting an entry that is registered in several tables should be very careful to avoid 1)memory leakage or 2)double free. Regarding this case, a session container is registered into a table of endpoint. When deleting the session container, it is the responsibility of the session container to delete itself from the table: CFdbSessionContainer::~CFdbSessionContainer() { mOwner->deleteEntry(it); }

lijinxing-source commented 3 years ago

OK, many thanks for your answer. what i am afraid is that the function "CBaseEndpoint::deleteSocket with invalid prameter ‘sid’ " can be called in multi-thread at the same time. which may cause the following while loop sentence executing for many times at the same time: void CBaseEndpoint::deleteSocket(FdbSocketId_t skid) { ... while (!containers.empty()) { auto it = containers.begin(); delete it->second; // also deleted from socket container } ... } because we just only release the memory according to "delete it->second". but we do not delete the entry from container. it will cause the value of "!containers.empty()" is 'true' for ever. since the delete entry is only called in the case what you reply to me. For example: Thread 1: "CIntraNameProxy::processClientOnline" -> "client->doDisconnect()" -> "CBaseClient::doDisconnect(FDB_INVALID_ID)" -> "CBaseEndpoint::deleteSocket(FDB_INVALID_ID)" Thread 2: "CBaseEndpoint::~CBaseEndpoint()" -> "CFdbContext::getInstance()->sendSyncEndeavor(new CDestroyJob(this, prepare))" -> "CBaseEndpoint::callDestroyEndpoint" -> "CBaseEndpoint::deleteSocket(FDB_INVALID_ID)".

as long as the "CBaseEndpoint::deleteSocket(FDB_INVALID_ID)" can be called for many times. and during this time the following sentence is not executed : CFdbSessionContainer::~CFdbSessionContainer() { mOwner->deleteEntry(it); } the "delete it->second;" sentence will be executed for many times. right?

maybe i have not understood the overview design. maybe it has avoided the above case from design level. thanks for your help.

jeremyczhen commented 3 years ago

You might find that Mutex is rarely used in FDBus. This is because all processing that are not thread-safe are serialized to run at single thread - FDB_CONTEXT. In the case you concern about, deletion of elements are guaranteed to run at single thread. Thereby no protection is needed.

This is the secret of why FDBus is powerful and stable: it doesn't need to think about complex race conditions and just focus on functionality (if you got chance to look at source code of vsomeip, you will be terrified by the abuse of mutex~~~).

lijinxing-source commented 3 years ago

Many thanks for you. I learned so much.