Open karapirinc opened 1 year ago
Thank you for analysing this and the comments. Indeed, in this case the distribution may be sub-optimal. What's more, in mobile environments stream opening may be more frequent than actual stanza exchanges due to frequent reconnections, which could further emphasise the issue. Idea behind hashCodeForPacket()
is to preserve order of packets for particular user/session, and if session is not yet established, then using packet from seems like a good idea. Feel free to submit Pull Request with proposed changes, and we will gladly review and merge it.
Thanks @woj-tek I am preparing a pull request. Do you have a suggestion for verifying order of the packets? It would be great to know must have test scenarios and maybe a recommendation to automate the process.
We do have https://github.com/tigase/tigase-tts-ng (and also older, legacy and not maintained https://github.com/tigase/tigase-testsuite) which could catch those. Though, in this case, using packetFrom should still maintain correct ordering of packets as incoming ones from the same connection would be processed by the same thread (and queue)
In my tests, packetFrom
was null initially. Please take a look at the logs that I shared. I am considering using packet.getStanzaFrom()
or packet.getStanzaFrom().getResource()
like 127.0.0.1_5222_127.0.0.1_53424
@woj-tek PR is ready for review PR 179
Simple test statistics:
c2s/Processed packets thread IN=[783, 804, 749, 737, 758, 779, 814, 791, 852, 864, 911, 840, 875, 909, 959, 965, 933, 931, 932, 923, 901, 887, 902, 963, 922, 868, 667, 808, 835, 779, 810, 821, 854, 795, 768, 794, 797, 794, 757, 783]
c2s/Processed packets thread OUT=[521, 531, 495, 496, 503, 520, 539, 522, 567, 580, 602, 558, 586, 603, 637, 639, 620, 621, 621, 612, 603, 585, 603, 639, 10579, 577, 425, 540, 557, 522, 538, 546, 567, 530, 513, 530, 532, 531, 504, 520]
c2s/Processed packets thread (outliers) IN=mean: 840,00, deviation: 71,20, outliers: [in_26-c2s:667:x0,79]
c2s/Processed packets thread (outliers) OUT=mean: 807,00, deviation: 1565,35, outliers: [out_24-c2s:10579:x13,11]
sess-man/Processed packets thread IN=[278, 315, 297, 298, 314, 340, 359, 324, 315, 290, 287, 252, 270, 270, 270, 252, 252, 288, 315, 324, 306, 279, 288, 306, 315, 298, 270, 234, 234, 243, 278, 313, 342, 305, 298, 314, 333, 342, 306, 287, 243, 216, 198, 198, 189, 180, 180, 198, 252, 10257, 315, 306, 2154, 333, 367, 387, 368, 333, 306, 288, 297, 306, 315, 333, 297, 279, 2000, 306, 323, 279, 260, 233, 225, 225, 215, 216, 199, 189, 198, 233]
sess-man/Processed packets thread OUT=[326, 372, 349, 340, 367, 395, 421, 382, 367, 335, 340, 297, 318, 2787, 314, 297, 296, 336, 371, 381, 353, 330, 336, 359, 370, 351, 321, 274, 274, 282, 326, 364, 401, 355, 346, 362, 384, 395, 358, 335, 283, 255, 235, 232, 223, 210, 213, 235, 296, 337, 370, 357, 2841, 391, 432, 455, 430, 388, 354, 338, 347, 362, 365, 391, 348, 325, 210, 354, 375, 323, 304, 274, 264, 263, 251, 255, 7683, 222, 231, 274]
sess-man/Processed packets thread (outliers) IN=mean: 449,00, deviation: 1139,54, outliers: [in_49-sess-man:10257:x22,84]
sess-man/Processed packets thread (outliers) OUT=mean: 482,00, deviation: 900,08, outliers: [out_13-sess-man:2787:x5,78, out_52-sess-man:2841:x5,89, out_76-sess-man:7683:x15,94]
message-router/Processed packets thread IN=[1130, 1158, 1079, 1068, 1093, 1125, 1173, 1139, 1230, 11219, 1312, 1212, 1266, 3781, 1383, 1391, 1346, 1345, 1346, 1331, 1303, 1277, 1304, 1389, 1330, 1253, 956, 1168, 1206, 1127, 1168, 1184, 1232, 1148, 1110, 1147, 16027, 1148, 1093, 1129]
message-router/Processed packets thread OUT=[2470]
message-router/Processed packets thread (outliers) IN=mean: 1895,00, deviation: 2774,61, outliers: [in_9-message-router:11219:x5,92, in_36-message-router:16027:x8,46]
message-router/Processed packets thread (outliers) OUT=mean: 2470,00, deviation: 0,00
c2s/Processed packets thread IN=[862, 905, 943, 916, 912, 922, 941, 920, 930, 929, 968, 920, 879, 873, 834, 878, 789, 795, 776, 822, 779, 768, 755, 769, 824, 785, 774, 822, 846, 838, 820, 791, 816, 873, 848, 864, 833, 888, 838, 847]
c2s/Processed packets thread OUT=[820, 858, 896, 870, 870, 871, 897, 871, 885, 884, 923, 871, 846, 833, 794, 832, 754, 754, 741, 780, 741, 728, 715, 728, 780, 741, 740, 780, 807, 794, 780, 755, 781, 832, 806, 819, 793, 845, 794, 806]
c2s/Processed packets thread (outliers) IN=mean: 852,00, deviation: 57,48, outliers: [in_10-c2s:968:x1,14]
c2s/Processed packets thread (outliers) OUT=mean: 810,00, deviation: 54,82, outliers: [out_10-c2s:923:x1,14]
sess-man/Processed packets thread IN=[403, 442, 441, 428, 441, 455, 468, 429, 391, 351, 351, 338, 338, 338, 338, 364, 364, 429, 481, 533, 520, 507, 481, 494, 494, 429, 391, 338, 299, 286, 286, 287, 299, 312, 338, 377, 403, 455, 442, 416, 417, 416, 455, 442, 429, 416, 429, 442, 494, 533, 572, 533, 2107, 495, 456, 468, 390, 325, 260, 247, 221, 221, 234, 234, 286, 312, 2463, 442, 508, 508, 494, 468, 482, 520, 468, 442, 390, 390, 352, 390]
sess-man/Processed packets thread OUT=[520, 549, 554, 547, 561, 562, 567, 527, 480, 441, 447, 443, 436, 432, 423, 445, 449, 504, 544, 577, 564, 546, 523, 543, 551, 499, 474, 428, 410, 421, 424, 427, 435, 438, 459, 485, 504, 533, 506, 470, 465, 455, 490, 478, 472, 460, 477, 509, 559, 601, 632, 608, 581, 578, 547, 562, 495, 438, 380, 381, 371, 375, 380, 376, 412, 427, 460, 517, 547, 533, 515, 480, 493, 526, 493, 484, 442, 462, 457, 500]
sess-man/Processed packets thread (outliers) IN=mean: 451,00, deviation: 306,34, outliers: [in_52-sess-man:2107:x4,67, in_66-sess-man:2463:x5,46]
sess-man/Processed packets thread (outliers) OUT=mean: 488,00, deviation: 60,39, outliers: [out_50-sess-man:632:x1,30]
message-router/Processed packets thread IN=[1927, 1981, 2064, 2021, 2029, 2017, 2064, 2034, 2049, 2051, 2130, 2056, 1994, 1973, 1892, 1967, 1828, 1826, 1793, 1866, 1808, 1779, 1746, 1777, 1871, 1794, 1800, 1847, 1883, 1870, 1837, 1776, 1821, 1908, 1874, 1906, 1856, 1960, 1881, 1899]
message-router/Processed packets thread OUT=[2481]
message-router/Processed packets thread (outliers) IN=mean: 1911,00, deviation: 100,42, outliers: [in_10-message-router:2130:x1,11]
message-router/Processed packets thread (outliers) OUT=mean: 2481,00, deviation: 0,00
c2s/Processed packets thread IN=[1207, 1209, 1234, 1151, 1156, 1153, 1126, 1228, 1107, 1142, 1178, 1283, 1259, 1188, 1288, 1270, 1326, 1241, 1319, 1295, 1325, 1304, 1283, 1332, 1405, 1437, 983, 1310, 1247, 1375, 1281, 1267, 1259, 1404, 1337, 1304, 1280, 1267, 1270, 1166]
c2s/Processed packets thread OUT=[1601, 1607, 1628, 1529, 1530, 1506, 1479, 1603, 1462, 1492, 1546, 1661, 1597, 1533, 1649, 1634, 1699, 1603, 1695, 1663, 1694, 1665, 1633, 1682, 11733, 1815, 1197, 1675, 1594, 1752, 1633, 1636, 1616, 1812, 1743, 1704, 1667, 1667, 1662, 1548]
c2s/Processed packets thread (outliers) IN=mean: 1254,00, deviation: 88,55, outliers: [in_25-c2s:1437:x1,15, in_26-c2s:983:x0,78]
c2s/Processed packets thread (outliers) OUT=mean: 1871,00, deviation: 1582,66, outliers: [out_24-c2s:11733:x6,27]
sess-man/Processed packets thread IN=[1039, 1115, 1103, 952, 958, 973, 998, 1006, 885, 812, 690, 721, 699, 808, 837, 788, 835, 914, 1101, 1182, 1273, 1323, 1245, 1255, 1282, 1406, 1208, 1104, 926, 930, 904, 893, 861, 972, 874, 879, 876, 965, 1021, 919, 891, 825, 869, 924, 922, 888, 846, 960, 941, 11022, 1204, 1304, 2507, 1076, 1158, 1192, 1217, 1033, 990, 818, 775, 697, 732, 760, 811, 757, 1537, 928, 1009, 1179, 1087, 1114, 1115, 1185, 1214, 1191, 1109, 1049, 980, 990]
sess-man/Processed packets thread OUT=[708, 749, 757, 641, 662, 689, 701, 706, 626, 597, 502, 534, 520, 3073, 631, 591, 630, 691, 814, 861, 933, 967, 907, 917, 924, 1015, 867, 790, 674, 678, 654, 643, 620, 693, 613, 628, 617, 669, 725, 640, 630, 592, 620, 665, 655, 18363, 613, 693, 665, 752, 838, 921, 3394, 763, 820, 842, 860, 711, 712, 585, 560, 509, 543, 571, 613, 578, 330, 694, 740, 865, 808, 809, 819, 852, 865, 841, 8231, 744, 686, 695]
sess-man/Processed packets thread (outliers) IN=mean: 1141,00, deviation: 1137,95, outliers: [in_49-sess-man:11022:x9,66]
sess-man/Processed packets thread (outliers) OUT=mean: 1086,00, deviation: 2153,06, outliers: [out_45-sess-man:18363:x16,91, out_76-sess-man:8231:x7,58]
message-archive/Processed packets thread IN=[0, 0, 0, 0, 0, 17634, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
message-archive/Processed packets thread OUT=[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
message-archive/Processed packets thread (outliers) IN=mean: 440,00, deviation: 2753,11, outliers: [in_5-message-archive:17634:x40,08]
message-archive/Processed packets thread (outliers) OUT=mean: 0,00, deviation: 0,00
message-router/Processed packets thread IN=[3355, 3358, 3446, 3268, 3287, 3292, 3245, 3453, 3200, 13267, 3334, 3564, 3461, 5832, 3544, 3499, 3638, 3429, 3701, 3539, 3642, 3578, 3537, 3603, 3731, 3851, 2846, 3609, 3442, 3740, 3557, 3542, 3509, 3797, 3655, 3634, 18347, 3527, 3504, 3336]
message-router/Processed packets thread OUT=[2471]
message-router/Processed packets thread (outliers) IN=mean: 4167,00, deviation: 2760,29, outliers: [in_9-message-router:13267:x3,18, in_36-message-router:18347:x4,40]
message-router/Processed packets thread (outliers) OUT=mean: 2471,00, deviation: 0,00
c2s/Processed packets thread IN=[1178, 1171, 1245, 1191, 1173, 1218, 1222, 1320, 1336, 1296, 1358, 1524, 1475, 1545, 1598, 1620, 1771, 1728, 1591, 1655, 1634, 1555, 1544, 1518, 1447, 1369, 919, 1177, 1264, 1300, 1238, 1278, 1192, 1180, 1149, 1220, 1278, 1223, 1281, 1204]
c2s/Processed packets thread OUT=[1746, 1755, 1778, 1771, 1743, 1819, 1809, 1886, 1935, 1821, 1991, 2131, 2101, 2158, 2224, 2305, 2382, 2305, 2221, 2216, 2223, 2200, 2181, 2171, 2127, 2010, 1549, 1813, 1800, 1908, 1880, 1895, 1864, 1789, 1856, 1835, 1905, 1809, 1894, 1820]
c2s/Processed packets thread (outliers) IN=mean: 1354,00, deviation: 190,79, outliers: [in_16-c2s:1771:x1,31, in_26-c2s:919:x0,68]
c2s/Processed packets thread (outliers) OUT=mean: 1965,00, deviation: 198,37, outliers: [out_16-c2s:2382:x1,21, out_26-c2s:1549:x0,79]
sess-man/Processed packets thread IN=[1067, 1097, 1174, 1203, 1162, 1256, 1236, 1263, 1253, 1153, 1231, 1263, 1231, 1192, 1197, 1217, 1196, 1180, 1138, 1138, 1107, 1092, 1066, 1115, 1091, 1094, 1122, 1068, 1090, 1185, 1211, 1232, 1224, 1171, 1245, 1160, 1221, 1112, 1145, 1056, 984, 967, 927, 897, 912, 884, 885, 945, 990, 1003, 1094, 1195, 2510, 1265, 1318, 1367, 1473, 1438, 1390, 1391, 1418, 1401, 1392, 1358, 1344, 1230, 2002, 1067, 1040, 1044, 964, 940, 912, 897, 874, 931, 934, 958, 1040, 1061]
sess-man/Processed packets thread OUT=[1191, 1206, 1262, 1287, 1275, 1292, 1214, 1205, 1144, 1086, 1129, 1176, 1143, 1081, 1060, 1072, 1098, 1086, 1036, 1015, 990, 961, 965, 1025, 1028, 1038, 1074, 1063, 1125, 1234, 1257, 1273, 1224, 1206, 1231, 1178, 1171, 1030, 1017, 921, 850, 839, 841, 832, 844, 832, 868, 967, 1026, 1059, 1104, 1159, 1164, 1193, 1252, 1285, 1326, 1304, 1279, 1309, 1352, 1350, 1310, 1268, 1238, 1205, 989, 1042, 957, 906, 813, 763, 753, 768, 781, 841, 878, 962, 1095, 1173]
sess-man/Processed packets thread (outliers) IN=mean: 1166,00, deviation: 232,30, outliers: [in_52-sess-man:2510:x2,15, in_66-sess-man:2002:x1,72]
sess-man/Processed packets thread (outliers) OUT=mean: 1085,00, deviation: 162,69, outliers: [out_72-sess-man:753:x0,69]
message-archive/Processed packets thread IN=[0, 0, 0, 0, 0, 15360, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
message-archive/Processed packets thread OUT=[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
message-archive/Processed packets thread (outliers) IN=mean: 384,00, deviation: 2398,08, outliers: [in_5-message-archive:15360:x40,00]
message-archive/Processed packets thread (outliers) OUT=mean: 0,00, deviation: 0,00
message-router/Processed packets thread IN=[3907, 3920, 4005, 4016, 3988, 4067, 4012, 4184, 4229, 4090, 4350, 4598, 4539, 4557, 4662, 4788, 4936, 4825, 4664, 4668, 4696, 4640, 4582, 4592, 4518, 4381, 3737, 4040, 4002, 4170, 4068, 4045, 3952, 3875, 3984, 3972, 4072, 3918, 4128, 4036]
message-router/Processed packets thread OUT=[2465]
message-router/Processed packets thread (outliers) IN=mean: 4260,00, deviation: 321,68, outliers: [in_16-message-router:4936:x1,16]
message-router/Processed packets thread (outliers) OUT=mean: 2465,00, deviation: 0,00
@woj-tek I fixed the unit tests. If you wish,I can improve readability of the tests by changing values with more readable values too.
As your test results states, thread distribution is better. But I notice increased number of queue overflows. I am still analyzing but I think we should be more skeptical about this change.
@karapirinc Yes, please
Merged
After merging TTS tests started to fail with missing packets, which would indicated concurrency issues. For how I reverted the change.
@woj-tek Could you please share reference to the failed tests?
Components distribute the load to the different threads and queues according to the hash code calculation of a packet. Session manager respectively checks not null values of
packet.getPacketFrom
,packet.getPacketTo
andpacket.getStanzaTo().getBareJID()
.For the initial connection establishment process, starting from the
STREAM_OPENED
to theTLS_HANDSHAKE_COMPLETE
commands, the same thread processes all connection attempts. The reason for that ishashCodeForPacket
calculated according to the same valuesess-man@HOSTNAME
.Example
tigase.stats.sess_man_Processed_packets_thread_IN
results in our test env after some period of time.To resolve this bottleneck, shouldn't we use connection id for better thread distribution?
An example from my local machine.
AbstractMessageReceiver.addPacket
andaddPacketNB
logs while opening a stream. QueueIdx is calculated bysess-man@localhost
. (I have 10 CPU core in my machine). Every connection request will end up in queues 6 and 46. Instead, queueIdx could be calculated according toc2s@localhost/127.0.0.1_5222_127.0.0.1_53424
. This would speed up connection establishment process. Better thread utilization Less risk for queue overflow.QueueIdx: 6 NAME: message-router TYPE: set ELEM_NAME: iq COMMAND: STREAM_OPENED PACKET_FROM: null PACKET_TO: null STANZA_FROM: c2s@localhost/127.0.0.1_5222_127.0.0.1_53424 STANZA_TO: sess-man@localhost
QueueIdx: 46 NAME: sess-man TYPE: set ELEM_NAME: iq COMMAND: STREAM_OPENED PACKET_FROM: null PACKET_TO: null STANZA_FROM: c2s@localhost/127.0.0.1_5222_127.0.0.1_53424 STANZA_TO: sess-man@localhost
QueueIdx: 16 NAME: message-router TYPE: result ELEM_NAME: iq COMMAND: null PACKET_FROM: sess-man@localhost PACKET_TO: c2s@localhost/127.0.0.1_5222_127.0.0.1_53424 STANZA_FROM: sess-man@localhost STANZA_TO: c2s@localhost/127.0.0.1_5222_127.0.0.1_53424
QueueIdx: 16 NAME: c2s TYPE: result ELEM_NAME: iq COMMAND: null PACKET_FROM: sess-man@localhost PACKET_TO: c2s@localhost/127.0.0.1_5222_127.0.0.1_53424 STANZA_FROM: sess-man@localhost STANZA_TO: c2s@localhost/127.0.0.1_5222_127.0.0.1_53424
QueueIdx: 6 NAME: message-router TYPE: get ELEM_NAME: iq COMMAND: GETFEATURES PACKET_FROM: null PACKET_TO: null STANZA_FROM: c2s@localhost/127.0.0.1_5222_127.0.0.1_53424 STANZA_TO: sess-man@localhost
QueueIdx: 46 NAME: sess-man TYPE: get ELEM_NAME: iq COMMAND: GETFEATURES PACKET_FROM: null PACKET_TO: null STANZA_FROM: c2s@localhost/127.0.0.1_5222_127.0.0.1_53424 STANZA_TO: sess-man@localhost
QueueIdx: 16 NAME: message-router TYPE: result ELEM_NAME: iq COMMAND: GETFEATURES PACKET_FROM: sess-man@localhost PACKET_TO: c2s@localhost/127.0.0.1_5222_127.0.0.1_53424 STANZA_FROM: sess-man@localhost STANZA_TO: c2s@localhost/127.0.0.1_5222_127.0.0.1_53424
QueueIdx: 16 NAME: c2s TYPE: result ELEM_NAME: iq COMMAND: GETFEATURES PACKET_FROM: sess-man@localhost PACKET_TO: c2s@localhost/127.0.0.1_5222_127.0.0.1_53424 STANZA_FROM: sess-man@localhost STANZA_TO: c2s@localhost/127.0.0.1_5222_127.0.0.1_53424
QueueIdx: 16 NAME: message-router TYPE: null ELEM_NAME: starttls PACKET_FROM: c2s@localhost/127.0.0.1_5222_127.0.0.1_53424 PACKET_TO: sess-man@localhost STANZA_FROM: null STANZA_TO: null
QueueIdx: 96 NAME: sess-man TYPE: null ELEM_NAME: starttls PACKET_FROM: c2s@localhost/127.0.0.1_5222_127.0.0.1_53424 PACKET_TO: sess-man@localhost STANZA_FROM: null STANZA_TO: null
QueueIdx: 39 NAME: message-router TYPE: set ELEM_NAME: iq COMMAND: STARTTLS PACKET_FROM: sess-man@localhost PACKET_TO: null STANZA_FROM: sess-man@localhost STANZA_TO: c2s@localhost/127.0.0.1_5222_127.0.0.1_53424
QueueIdx: 16 NAME: c2s TYPE: set ELEM_NAME: iq COMMAND: STARTTLS PACKET_FROM: sess-man@localhost PACKET_TO: null STANZA_FROM: sess-man@localhost STANZA_TO: c2s@localhost/127.0.0.1_5222_127.0.0.1_53424
QueueIdx: 6 NAME: message-router TYPE: set ELEM_NAME: iq COMMAND: TLS_HANDSHAKE_COMPLETE PACKET_FROM: null PACKET_TO: null STANZA_FROM: c2s@localhost/127.0.0.1_5222_127.0.0.1_53424 STANZA_TO: sess-man@localhost
QueueIdx: 46 NAME: sess-man TYPE: set ELEM_NAME: iq COMMAND: TLS_HANDSHAKE_COMPLETE PACKET_FROM: null PACKET_TO: null STANZA_FROM: c2s@localhost/127.0.0.1_5222_127.0.0.1_53424 STANZA_TO: sess-man@localhost
QueueIdx: 6 NAME: message-router TYPE: get ELEM_NAME: iq COMMAND: GETFEATURES PACKET_FROM: null PACKET_TO: null STANZA_FROM: c2s@localhost/127.0.0.1_5222_127.0.0.1_53424 STANZA_TO: sess-man@localhost
QueueIdx: 46 NAME: sess-man TYPE: get ELEM_NAME: iq COMMAND: GETFEATURES PACKET_FROM: null PACKET_TO: null STANZA_FROM: c2s@localhost/127.0.0.1_5222_127.0.0.1_53424 STANZA_TO: sess-man@localhost
QueueIdx: 16 NAME: c2s TYPE: result ELEM_NAME: iq COMMAND: GETFEATURES PACKET_FROM: sess-man@localhost PACKET_TO: c2s@localhost/127.0.0.1_5222_127.0.0.1_53424 STANZA_FROM: sess-man@localhost STANZA_TO: c2s@localhost/127.0.0.1_5222_127.0.0.1_53424
18.QueueIdx: 16 NAME: message-router TYPE: result ELEM_NAME: iq COMMAND: GETFEATURES PACKET_FROM: sess-man@localhost PACKET_TO: c2s@localhost/127.0.0.1_5222_127.0.0.1_53424 STANZA_FROM: sess-man@localhost STANZA_TO: c2s@localhost/127.0.0.1_5222_127.0.0.1_53424
Tigase version: 8.1.0 JVM flavour and version : openjdk 11.0.7 2020-04-14 Operating system/distribution/version : Ubuntu 18.04.1 LTS INTEL(R) XEON(R) PLATINUM 8259CL CPU @ 2.50GHZ GNU/Linux - 1 CPU - 2 virtual CPU - 4.06G RAM