noxdafox / rabbitmq-message-deduplication

RabbitMQ Plugin for filtering message duplicates
Mozilla Public License 2.0

WIP: Cache Table Distribution #82

Closed megwill4268 closed 1 year ago

megwill4268 commented 2 years ago

Have been running into issues with cache tables not existing. It's the result of a combination of:

  1. Our exchanges/queues were declared before version 5.0 of this plugin, so their cache tables were never distributed across nodes.
  2. The node the table was originally created on was killed (an ECS container that we couldn't get back).

This created an interesting situation where the RabbitMQ broker was up and working, but messages were failing to go through the exchange (stack trace below).

(screenshot: stack trace of the routing failure)

This PR distributes cache tables across nodes at routing time (the same set of nodes that would be returned by cache_create, minus those that already hold a copy of the cache table). This doesn't fix the above issue once it has happened, but it should reduce the probability of it occurring. I'm not a big fan of the path I took, though, because it performs the distribution check on every route call to the exchange. Being new to Elixir and RabbitMQ plugin development, I wasn't sure how to do this in a more elegant/efficient way. I hoped that by raising this you might have ideas for a better implementation.

P.S. I did go down the route of trying to add a copy of the cache table on insert into the cache after the node had died, and found out the hard way that you can't create the table because it is still in the schema, and you can't delete it because it doesn't exist. Mimicking this by stopping Mnesia and running del_table_copy doesn't delete the vestigial table either.
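For illustration, here is a minimal Elixir sketch of the route-time distribution idea described above. The module/function names are hypothetical, and the assumption that the caches use ram copies may not match the plugin; treat it as a sketch, not the PR code:

```elixir
# Hypothetical sketch (not the PR's actual code): copy a cache table to every
# cluster node that does not yet hold a replica of it.
defmodule DistributionSketch do
  def ensure_distributed(cache_table) do
    cluster_nodes = [node() | Node.list()]
    nodes_with_copy = :mnesia.table_info(cache_table, :ram_copies)

    for missing_node <- cluster_nodes -- nodes_with_copy do
      case :mnesia.add_table_copy(cache_table, missing_node, :ram_copies) do
        {:atomic, :ok} -> :ok
        # Tolerate races where another process added the copy first.
        {:aborted, {:already_exists, _table, _node}} -> :ok
        {:aborted, reason} -> {:error, reason}
      end
    end
  end
end
```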

noxdafox commented 2 years ago

Hello,

thanks for your contribution and sorry for the late reply!

You mentioned that you suffered problems with an old version of the plugin. Did you observe similar issues with a newer one? The latest versions of the plugin already replicate the cache across multiple nodes, which mitigates the above problem.

megwill4268 commented 2 years ago

Yes, I was able to reproduce the same issue with a manual test on the new version: bring up a single node, declare the exchange, add two more nodes, then take down the original node, which is the only one the cache table was declared on.

My understanding of what I am seeing is that the dedup cache table in Mnesia is only added when the exchange is first declared, so any nodes that join later don't get that table replicated to them.

noxdafox commented 2 years ago

Right, thanks for providing info on how to reproduce the issue.

The plugin clearly does not handle situations in which the cluster size changes or the nodes are entirely replaced. This is becoming a common issue with VM providers such as AWS EC2.

Your code change is legitimate, but the implementation is currently quite problematic. We cannot afford to tax every route request for the sake of mitigating a rare issue.

Nevertheless, there is an easy fix for that. We can subscribe to events from the BEAM and react to the appearance of new nodes via net_kernel:monitor_nodes.

Whenever we receive a nodeup notification, we check each cache's distribution across the cluster. If a cache is not replicated onto at least 2/3 of the nodes, we replicate it accordingly. Your logic does most of the job; we just need to remove the check from the route call and add the node monitoring logic.

You can add such logic in the cache_manager gen_server, which is where it belongs. Would you mind updating the PR, or would you prefer me to take over and complete it?
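A minimal, hypothetical Elixir sketch of that approach (the names, supervision details, and replication step are assumptions; the real logic would live in the existing cache_manager):

```elixir
# Hypothetical sketch (not the plugin's actual cache_manager): subscribe to
# BEAM node events and re-check cache distribution whenever a node joins.
defmodule NodeMonitorSketch do
  use GenServer

  def start_link(opts \\ []), do: GenServer.start_link(__MODULE__, :ok, opts)

  @impl true
  def init(:ok) do
    # Deliver {:nodeup, node} and {:nodedown, node} messages to this process.
    :net_kernel.monitor_nodes(true)
    {:ok, %{}}
  end

  @impl true
  def handle_info({:nodeup, _node}, state) do
    # Here the real implementation would iterate over the known caches and
    # replicate any cache present on fewer than 2/3 of the cluster nodes.
    {:noreply, state}
  end

  def handle_info({:nodedown, _node}, state), do: {:noreply, state}
end
```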

megwill4268 commented 2 years ago

Ah, that is a much better solution. I don't mind trying to do it, but I don't know how to get the names of the cache tables to copy over. The only thing I have thought of so far (found by googling) was to maybe use Mnesia.schema to get back the table info, parse it into table names, and grab all the ones that start with "cache_". Does that sound like the right path, or is there an easier way to grab that info?

noxdafox commented 2 years ago

No need for that. The cache_manager tracks the list of caches in its own internal table: message_deduplication_caches.

https://github.com/noxdafox/rabbitmq-message-deduplication/blob/master/lib/rabbitmq_message_deduplication/cache_manager.ex#L93

You can simply iterate over the content of that table.
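For example, something along these lines could enumerate the tracked caches (a sketch only; the exact table name atom and record layout are assumptions based on the comment above, so only the keys are read):

```elixir
# Hypothetical snippet: list the caches tracked by the plugin's internal
# Mnesia table by reading its keys.
caches = :mnesia.dirty_all_keys(:message_deduplication_caches)

Enum.each(caches, fn cache ->
  IO.inspect(cache, label: "known cache")
end)
```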

megwill4268 commented 2 years ago

I didn't test it yet. Thank you for explaining so much.

On Sat, May 28, 2022, 10:06 AM Matteo Cafasso wrote:

Requested changes on this pull request. Added a few comments with explanations. I hope they help.

I guess you did not manage to test the logic, right?

megwill4268 commented 2 years ago

So I made the changes you suggested and started testing, but it isn't working. I am kind of stumped on what to do.

noxdafox commented 2 years ago

What kind of errors are you experiencing?

noxdafox commented 1 year ago

@megwill4268, do you plan to continue on this or shall I take over?

megwill4268 commented 1 year ago

Hi Matteo,

I don't have time to work on this right now so you should probably take over.

noxdafox commented 1 year ago

I will merge this and continue from there. Thanks for your contribution!

noxdafox commented 1 year ago

A new release supporting cluster growth is now available: 0.6.0.

Please read the changelog carefully for the supported versions.

Currently, it only supports the addition of nodes. The reason is that we cannot infer whether a node is down due to maintenance or a crash, or because it has been permanently removed from the cluster. In other words, we have no guarantee that a node which is down will never come back up again.

Later on, we could implement a periodic check to verify that the cluster size is still the expected one. For now, I would like to see how this version behaves for a while at least.
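A hypothetical sketch of what such a periodic check might look like (the interval, names, and the reaction to a mismatch are all assumptions, not the plugin's code):

```elixir
# Hypothetical sketch: wake up at a fixed interval and compare the current
# cluster size against the expected one.
defmodule ClusterSizeCheckSketch do
  use GenServer

  @interval :timer.minutes(5)

  def start_link(expected_size),
    do: GenServer.start_link(__MODULE__, expected_size)

  @impl true
  def init(expected_size) do
    schedule_check()
    {:ok, expected_size}
  end

  @impl true
  def handle_info(:check, expected_size) do
    current_size = length([node() | Node.list()])

    if current_size != expected_size do
      # Here the plugin could rebalance caches or drop copies held by nodes
      # that were permanently removed from the cluster.
      :ok
    end

    schedule_check()
    {:noreply, expected_size}
  end

  defp schedule_check, do: Process.send_after(self(), :check, @interval)
end
```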

Thanks for your contribution!