jeremyczhen / fdbus

FDBus - Fast Distributed Bus
https://blog.csdn.net/jeremy_cz/article/details/89060291

OOM occurs when invoke is called at high frequency #11

Closed sunclgogo closed 4 years ago

sunclgogo commented 4 years ago

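Hi Jeremy @jeremyczhen, when invoke is called at high frequency, the FDB_CONTEXT thread runs out of memory (OOM). Is this related to the memory leak that was fixed recently? The invoke prototype we use is: invoke(FdbMsgCode_t code, const CFdbBasePayload &data, int32_t timeout)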

jeremyczhen commented 4 years ago

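Could you give me the commit ID you are using?

Also, invoke() hands the CFdbMessage to the FDB_CONTEXT thread through a job, and that thread sends the data in the job to the peer over a socket. Posting a job completes in very little time, whereas socket communication is relatively slow. If invoke() is called at high frequency, the context thread cannot keep up, so a large number of messages pile up in the job queue, and that accumulation can lead to OOM.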
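To illustrate the queue build-up described above, here is a minimal sketch of one possible caller-side mitigation: a bounded job queue that rejects new jobs once it is full. `BoundedJobQueue` and `simulate` are hypothetical names invented for this example; they are not part of FDBus, and the per-tick rates are made-up numbers that only model "producer faster than consumer".

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <mutex>

// Hypothetical bounded queue (NOT an FDBus class): the producer is told
// "full" instead of being allowed to grow the queue without limit.
class BoundedJobQueue {
public:
    explicit BoundedJobQueue(std::size_t capacity) : capacity_(capacity) {}

    // Returns false when the queue is full; the caller can drop or retry.
    bool try_push(int job) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (jobs_.size() >= capacity_) return false;
        jobs_.push_back(job);
        if (jobs_.size() > max_depth_) max_depth_ = jobs_.size();
        return true;
    }

    // Returns false when there is nothing to process.
    bool try_pop(int& job) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (jobs_.empty()) return false;
        job = jobs_.front();
        jobs_.pop_front();
        return true;
    }

    std::size_t max_depth() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return max_depth_;
    }

private:
    mutable std::mutex mutex_;
    std::deque<int> jobs_;
    std::size_t capacity_;
    std::size_t max_depth_ = 0;
};

struct SimResult {
    std::size_t accepted;
    std::size_t rejected;
    std::size_t max_depth;
};

// Single-threaded model: each tick the caller posts `produce` jobs and the
// (slower) context thread drains `consume` jobs.
SimResult simulate(std::size_t capacity, int ticks, int produce, int consume) {
    BoundedJobQueue queue(capacity);
    SimResult result{0, 0, 0};
    for (int t = 0; t < ticks; ++t) {
        for (int i = 0; i < produce; ++i) {
            if (queue.try_push(i)) ++result.accepted;
            else ++result.rejected;
        }
        int job = 0;
        for (int i = 0; i < consume; ++i) {
            if (!queue.try_pop(job)) break;
        }
    }
    result.max_depth = queue.max_depth();
    return result;
}
```

With a capacity of 8, five posts per tick and one drain per tick, the depth never exceeds 8 and the surplus calls are rejected rather than queued. Whether to drop, retry, or block on rejection depends on the application.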

sunclgogo commented 4 years ago

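@jeremyczhen The commit should be 804ba7bf9cacdf882884d27d276b83123c5ed341; the version we use is fairly old. While investigating this recently, we found that the problem seemed to improve after changing invoke to send. We suspect it is related to invoke's auto reply; send doesn't seem to have an auto-reply mechanism, does it?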

jeremyczhen commented 4 years ago

The issue is fixed. The root cause: on the server side, an auto-reply is sent when a message is not manually replied to in onInvoke(), so that the pending request on the client side is cancelled. In some cases, however, the auto-reply did not take effect, so pending requests accumulated and eventually caused OOM.

@sunclgogo Please find the fix on branch hotfix_autoreply_for_sunclgogo. It is based on 804ba7bf9cacdf882884d27d276b83123c5ed341. If you don't want to merge this patch, you should check and ensure that CBaseMessage::reply() is called in every onInvoke().

Please also keep in mind that if the client doesn't expect a response from the server, send() is preferred, since it doesn't keep a pending request for each call.
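As a rough model of the bookkeeping described above (not the actual FDBus implementation), the sketch below shows why an unanswered invoke-style call leaves an entry behind while a send-style call does not, and how a scope guard can guarantee a reply. `PendingTable`, `ReplyGuard` and `handle_request` are hypothetical names made up for this illustration, and client and server are collapsed into one process for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>

// Hypothetical client-side table (NOT FDBus API): one entry per
// outstanding invoke-style request, released when a reply arrives.
class PendingTable {
public:
    // invoke-style call: remember the request until it is answered.
    std::uint32_t invoke(int code) {
        std::uint32_t serial = next_serial_++;
        pending_.emplace(serial, code);
        return serial;
    }

    // send-style call: fire and forget, nothing is remembered.
    void send(int /*code*/) {}

    // A reply, manual or automatic, releases the matching entry.
    void on_reply(std::uint32_t serial) { pending_.erase(serial); }

    std::size_t pending() const { return pending_.size(); }

private:
    std::unordered_map<std::uint32_t, int> pending_;
    std::uint32_t next_serial_ = 0;
};

// Hypothetical server-side guard: replies on scope exit unless the
// handler has already replied, so no request is left unanswered.
class ReplyGuard {
public:
    ReplyGuard(PendingTable& client, std::uint32_t serial)
        : client_(client), serial_(serial) {}
    ReplyGuard(const ReplyGuard&) = delete;
    ReplyGuard& operator=(const ReplyGuard&) = delete;

    ~ReplyGuard() {
        if (!replied_) client_.on_reply(serial_);  // automatic reply
    }

    void reply() {
        if (replied_) return;
        client_.on_reply(serial_);  // manual reply
        replied_ = true;
    }

private:
    PendingTable& client_;
    std::uint32_t serial_;
    bool replied_ = false;
};

// Stand-in for a request handler that sometimes forgets to reply.
void handle_request(PendingTable& client, std::uint32_t serial,
                    bool reply_manually) {
    ReplyGuard guard(client, serial);
    if (reply_manually) guard.reply();
    // Otherwise the guard's destructor replies on the way out.
}
```

In this model, 1000 invoke-style calls that never receive a reply leave 1000 entries in the table, which is the accumulation pattern behind the OOM. Routing every request through the guard, or using the send-style call when no response is needed, keeps the table empty.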

sunclgogo commented 4 years ago

@jeremyczhen Well noted, thanks for your quick turnaround in fixing the issue.