esl / MongooseIM

MongooseIM is Erlang Solutions' robust, scalable and efficient XMPP server, aimed at large installations. Specifically designed for enterprise purposes, it is fault-tolerant and can utilise the resources of multiple clustered machines.

Update CETS with pause-on-all nodes fix #4204

Closed arcusfelis closed 4 months ago

arcusfelis commented 4 months ago

This PR addresses MIM-2137. Uses https://github.com/esl/cets/pull/46.

To avoid race conditions we have to pause on all nodes.

The issue:

If the join coordinator node loses its connection to one of the nodes during a join, that node gets unpaused.
Once unpaused, the node starts sending remote ops and check_server requests.
Other nodes in the cluster may then receive these messages before send_dump.

I've tried to show the errors in tests.

Solution:

We can delay these messages by unpausing a node only once all nodes have received DOWN from the cets_join process.
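
To illustrate the race and the proposed rule, here is a toy model in Python. It is not CETS code: the node names, the fixed event order, and the `simulate` function are all hypothetical, and the model only assumes what is described above (a node that sees DOWN early may start sending remote ops before its peers have handled send_dump).

```python
def simulate(wait_for_all_nodes):
    """Replay one fixed event order.

    Returns the remote ops, as (sender, receiver) pairs, that reached a
    receiver before that receiver had handled its dump.
    """
    nodes = ["a", "b", "c"]
    paused = set(nodes)        # the join pauses every node first
    saw_down = set()           # nodes that observed DOWN from the join process
    dump_pending = {"a", "b"}  # nodes still waiting for send_dump
    early_ops = []

    def try_unpause(node):
        if wait_for_all_nodes:
            ready = saw_down == set(nodes)  # proposed rule: everyone saw DOWN
        else:
            ready = node in saw_down        # old rule: local DOWN is enough
        if ready and node in paused:
            paused.discard(node)
            # An unpaused node immediately sends remote ops to its peers.
            for peer in nodes:
                if peer != node and peer in dump_pending:
                    early_ops.append((node, peer))

    # 1. Node c loses its link to the coordinator and sees DOWN early.
    saw_down.add("c")
    try_unpause("c")
    # 2. The coordinator finishes send_dump on the nodes it can still reach.
    dump_pending.clear()
    # 3. The join process exits and the remaining nodes see DOWN.
    saw_down.update({"a", "b"})
    for node in nodes:
        try_unpause(node)

    assert not paused  # every node ends up unpaused under both rules
    return early_ops


print(simulate(wait_for_all_nodes=False))  # [('c', 'a'), ('c', 'b')]
print(simulate(wait_for_all_nodes=True))   # []
```

With the local rule, node c's remote ops reach a and b while their dumps are still pending. With the all-nodes rule, c stays paused until a and b have also seen DOWN, so no op overtakes the dump in this event order.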

mongoose-im commented 4 months ago

elasticsearch_and_cassandra_26 / elasticsearch_and_cassandra_mnesia / 0674513d93336a5ee1eadfe9b1c55cb2d8322892
Reports root / big
OK: 382 / Failed: 0 / User-skipped: 40 / Auto-skipped: 0

small_tests_25 / small_tests / 0674513d93336a5ee1eadfe9b1c55cb2d8322892
Reports root / small

small_tests_26 / small_tests / 0674513d93336a5ee1eadfe9b1c55cb2d8322892
Reports root / small

small_tests_26_arm64 / small_tests / 0674513d93336a5ee1eadfe9b1c55cb2d8322892
Reports root / small

ldap_mnesia_25 / ldap_mnesia / 0674513d93336a5ee1eadfe9b1c55cb2d8322892
Reports root / big
OK: 2273 / Failed: 0 / User-skipped: 856 / Auto-skipped: 0

ldap_mnesia_26 / ldap_mnesia / 0674513d93336a5ee1eadfe9b1c55cb2d8322892
Reports root / big
OK: 2273 / Failed: 0 / User-skipped: 856 / Auto-skipped: 0

dynamic_domains_mysql_redis_26 / mysql_redis / 0674513d93336a5ee1eadfe9b1c55cb2d8322892
Reports root / big
OK: 4253 / Failed: 0 / User-skipped: 144 / Auto-skipped: 0

dynamic_domains_pgsql_mnesia_25 / pgsql_mnesia / 0674513d93336a5ee1eadfe9b1c55cb2d8322892
Reports root / big
OK: 4286 / Failed: 0 / User-skipped: 111 / Auto-skipped: 0

dynamic_domains_pgsql_mnesia_26 / pgsql_mnesia / 0674513d93336a5ee1eadfe9b1c55cb2d8322892
Reports root / big
OK: 4286 / Failed: 0 / User-skipped: 111 / Auto-skipped: 0

pgsql_cets_26 / pgsql_cets / 0674513d93336a5ee1eadfe9b1c55cb2d8322892
Reports root / big
OK: 4260 / Failed: 0 / User-skipped: 177 / Auto-skipped: 0

dynamic_domains_mssql_mnesia_26 / odbc_mssql_mnesia / 0674513d93336a5ee1eadfe9b1c55cb2d8322892
Reports root / big
OK: 4283 / Failed: 0 / User-skipped: 114 / Auto-skipped: 0

internal_mnesia_26 / internal_mnesia / 0674513d93336a5ee1eadfe9b1c55cb2d8322892
Reports root / big
OK: 2413 / Failed: 0 / User-skipped: 716 / Auto-skipped: 0

pgsql_mnesia_25 / pgsql_mnesia / 0674513d93336a5ee1eadfe9b1c55cb2d8322892
Reports root / big
OK: 4675 / Failed: 0 / User-skipped: 118 / Auto-skipped: 0

pgsql_mnesia_26 / pgsql_mnesia / 0674513d93336a5ee1eadfe9b1c55cb2d8322892
Reports root / big
OK: 4675 / Failed: 0 / User-skipped: 118 / Auto-skipped: 0

mysql_redis_26 / mysql_redis / 0674513d93336a5ee1eadfe9b1c55cb2d8322892
Reports root / big
OK: 4654 / Failed: 0 / User-skipped: 139 / Auto-skipped: 0

mssql_mnesia_26 / odbc_mssql_mnesia / 0674513d93336a5ee1eadfe9b1c55cb2d8322892
Reports root / big
OK: 4672 / Failed: 0 / User-skipped: 121 / Auto-skipped: 0

codecov[bot] commented 4 months ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Comparison is base (c1d5362) 84.13% compared to head (0674513) 84.31%.

Additional details and impacted files

```diff
@@            Coverage Diff             @@
##           master    #4204      +/-   ##
==========================================
+ Coverage   84.13%   84.31%   +0.18%
==========================================
  Files         549      549
  Lines       33430    33430
==========================================
+ Hits        28126    28188      +62
+ Misses       5304     5242      -62
```
