ergo-services / ergo

An actor-based Framework with network transparency for creating event-driven architecture in Golang. Inspired by Erlang. Zero dependencies.
https://docs.ergo.services
MIT License

Message dropped #94

jiait closed this issue 2 years ago

jiait commented 2 years ago

We are currently finding that when multiple processes send a large number of messages to the same handler process, some of the messages it should receive are dropped. Could this be a problem with the mailbox capacity? Or could we borrow the approach of RabbitMQ's gen_server2 to handle it?

halturin commented 2 years ago

Could you please try increasing the size of the process mailbox via ProcessOptions? The default size of the mailbox queue is 100, which is definitely too small for a heavily loaded gen.Server. If increasing the mailbox size doesn't help, a pool of workers might be a better idea.

jiait commented 2 years ago

At present I also use the pooling approach, but a single heavily loaded process is unavoidable and still causes problems. When an Erlang mailbox is full, the process crashes and the failure is reported to the application layer. Ergo, however, silently drops messages, which causes unexpected problems in the application layer and leaves the mistaken impression that Ergo is unstable.

halturin commented 2 years ago

> When Erlang Mailbox is full, it will crash and feed back to the application layer.

It doesn't work this way in Erlang. An Erlang process's mailbox has no limit, so if the process can't handle incoming messages in time, the mailbox keeps growing. Over time it grows even faster, because message handling in an Erlang process gets significantly slower as the mailbox size rises. The mailbox keeps growing until Erlang is killed by the OOM killer.

I don't like anything unlimited, whether it is an unlimited mailbox size or an infinite timeout for a sync call request. Relying on either of them leads to solutions that are broken by design.

> Ergo silently drops messages

Ergo prints a warning message to stdout. Yes, it is not the best solution, but the drop is not silent. I have been thinking about something like a "system bus" for global/critical events, but I still haven't decided whether to add this feature.
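To illustrate the behaviour described here, below is a minimal sketch of a bounded mailbox built on a buffered Go channel. It does not use Ergo's API or reflect its internals: the `Mailbox` type and the `NewMailbox`, `Send`, and `Dropped` names are hypothetical, and only the size of 100 and the "warn on stdout, then drop" behaviour come from this thread.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// Mailbox is a hypothetical bounded message queue (not Ergo's actual type).
type Mailbox struct {
	dropped int64 // number of messages dropped because the queue was full
	queue   chan string
}

// NewMailbox creates a mailbox that holds at most size pending messages.
func NewMailbox(size int) *Mailbox {
	return &Mailbox{queue: make(chan string, size)}
}

// Send enqueues msg without blocking the sender. If the queue is full, the
// message is dropped, a warning is printed to stdout, and false is returned.
func (m *Mailbox) Send(msg string) bool {
	select {
	case m.queue <- msg:
		return true
	default:
		atomic.AddInt64(&m.dropped, 1)
		fmt.Printf("WARNING: mailbox is full (%d), dropping %q\n", cap(m.queue), msg)
		return false
	}
}

// Dropped reports how many messages have been dropped so far.
func (m *Mailbox) Dropped() int64 {
	return atomic.LoadInt64(&m.dropped)
}

func main() {
	mb := NewMailbox(100) // 100 mirrors the default size mentioned in this thread

	// Nobody is reading from the mailbox, so the last 3 sends overflow it.
	for i := 0; i < 103; i++ {
		mb.Send(fmt.Sprintf("msg-%d", i))
	}
	fmt.Println("queued:", len(mb.queue), "dropped:", mb.Dropped())
}
```

With no consumer running, the program prints three warnings followed by `queued: 100 dropped: 3`. The sender is never blocked and the memory used by the queue stays bounded, at the cost of losing the overflowing messages.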

By its nature, async messaging provides no guarantee of delivery. If you want a guarantee, it is better to use the sync Call method (which is based on async messaging under the hood). In other words, it doesn't matter whether you use Erlang or Ergo: neither of them lets you know about a mailbox overflow in the case of async messaging.

> I also take the pooling approach, but it is unavoidable that the single point process is high and causes problems.

For the pool of workers I use the following approach. It costs the dispatcher almost nothing, but it provides a guarantee to the sending servers.

  1. source of messages

     (Server1.. ServerN) --- Call(...) ---> Dispatcher

  2. handle call

     Dispatcher.HandleCall(process, from, message)
         // send to the worker as an async message
         process.Cast(WorkerN, {from,message})
         return gen.ServerStatusIgnore // it's like noreply in Erlang

  3. handle cast

     WorkerN.HandleCast(process, message)
         {from, msg} = message // not literally this way, but it conveys the meaning
         // do something with msg to get replyValue
         process.SendReply(from, replyValue)

See the comment here: https://github.com/ergo-services/ergo/blob/master/gen/server.go#L164

BTW: the v210 branch brings a significant improvement to local messaging. The benchmark shows almost 7 times better results for the case of parallel messaging.