ether / etherpad-load-test

CLI tool to simulate pad user load on an Etherpad Instance
Apache License 2.0

Musings and Research #1

Open JohnMcLear opened 4 years ago

JohnMcLear commented 4 years ago

I ran a series of tests. Performance is roughly documented here for easy comparison.

  1. Local client to Local Server (both in same VM) == 600k
  2. Local client (in VM) to Remote Server (4 GB RAM w/ 2 CPU cores) == 300k <-- 220k w/ MySQL?! :D
  3. Local client to Remote Server & a second client running on the Remote Server itself == 200k
  4. Local client to Remote server with one author and 200 lurkers. == Never crashed.
  5. Local client to Remote Server w/ one author and 500 lurkers == Uses 100% CPU.

So TL;DR: a server can easily be overloaded with 500 lurkers and 1 author.

A hyperactive author tries to replicate a really active author pushing 4 characters a second. This is rare in Etherpad (*needs citation).
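For illustration, a minimal sketch of what that simulation boils down to (the real etherpad-load-test client code may differ; `sendAppendMessage` is a hypothetical stand-in for however the client pushes an append):

```js
// Illustrative only -- not the actual load-test client code.
// A "hyperactive author" just appends text on a timer,
// roughly four characters per second.
function startHyperactiveAuthor(sendAppendMessage) {
  setInterval(() => {
    sendAppendMessage('x'); // one character per tick
  }, 250); // 4 ticks/second ~= 4 characters/second
}
```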

Summary: Etherpad is holding up to ~40 hyperactive authors & ~120 lurkers per pad. At this point things get too slow to really make sense. A safe balance might be ~30 hyperactive authors.

I am going to re-run test #3 because I think something went wrong. After running it again and getting similar results, I need to ponder why this is the case.

1. Local client to Local Server (Same VM)

Total = 600k commits

Load Test Metrics -- Target Pad http://192.168.1.48:9001/p/tQktzHcrN8chPwyOVab_

Total Clients Connected: 182
Local Clients Connected: 182
Authors Connected: 45
Lurkers Connected: 137
Sent Append messages: 5090
Commits accepted by server: 4989
Commits sent from Server to Client: 592417
Number of commits not yet replied as ACCEPT_COMMIT from server 101

2. Local Client to Remote Server

Total = 300k commits

Load Test Metrics -- Target Pad https://embed.etherpad.com:9002/p/Ya0dkFALmk

Total Clients Connected: 146
Local Clients Connected: 148
Authors Connected: 37
Lurkers Connected: 111
Sent Append messages: 3195
Commits accepted by server: 3094
Commits sent from Server to Client: 287995
Number of commits not yet replied as ACCEPT_COMMIT from server 101

3. 2x Client to Server (one client local to the server)

Total == ~200k commits

Load Test Metrics -- Target Pad https://embed.etherpad.com:9002/p/foo

Total Clients Connected: 135
Local Clients Connected: 71
Authors Connected: 18
Lurkers Connected: 53
Sent Append messages: 793
Commits accepted by server: 692
Commits sent from Server to Client: 57945
Number of commits not yet replied as ACCEPT_COMMIT from server 101

Server --> Load Test Metrics -- Target Pad https://embed.etherpad.com:9002/p/foo

Clients Connected: 99
Authors Connected: 25
Lurkers Connected: 74
Sent Append messages: 1699
Commits accepted by server: 1598
Commits sent from Server to Client: 137569
Number of commits not yet replied as ACCEPT_COMMIT from server 101

Test 3, run again

Total Revs = 260k

Load Test Metrics -- Target Pad https://embed.etherpad.com:9002/p/foonew
*note: server-side client

Clients Connected: 117
Authors Connected: 30
Lurkers Connected: 87
Sent Append messages: 2275
Commits accepted by server: 2174
Commits sent from Server to Client: 204316
Number of commits not yet replied as ACCEPT_COMMIT from server 101

Load Test Metrics -- Target Pad https://embed.etherpad.com:9002/p/foonew

*note: john's laptop VM

Total Clients Connected: 141
Local Clients Connected: 72
Authors Connected: 18
Lurkers Connected: 54
Sent Append messages: 807
Commits accepted by server: 706
Commits sent from Server to Client: 65414
Number of commits not yet replied as ACCEPT_COMMIT from server 101

4. Local Client to Remote Server (-a 1 -l 200)

Load Test Metrics -- Target Pad https://embed.etherpad.com:9002/p/bsxHnSvB0u

Total Clients Connected: 151
Local Clients Connected: 201
Authors Connected: 1
Lurkers Connected: 200
Sent Append messages: 197
Commits accepted by server: 196
Commits sent from Server to Client: 39200
Seconds test has been running for: 213

Similar findings at -l 300 and -l 400, hitting about 10% CPU.

Changing -l to 500 significantly changes things. Server CPU jumps to 114% and connectivity begins failing.

-l 450 gives the same experience. -l 420 hits 100% CPU but goes back to being "stable", though with notable lag. Note that this is essentially 420 commits per second of traffic, plus other overheads.

Load Test Metrics -- Target Pad https://embed.etherpad.com:9002/p/rogTaX22Ny

Total Clients Connected: 421
Local Clients Connected: 471
Authors Connected: 1
Lurkers Connected: 470
Sent Append messages: 53
Commits accepted by server: 45
Commits sent from Server to Client: 18453
Number of commits not yet replied as ACCEPT_COMMIT from server 8
Seconds test has been running for: 85

This is an important piece of information because it suggests a practical limit of roughly 1 author to 420 lurkers.

Profiling @ 1:100

I ran the test with the V8 profiler attached and got:


Total Clients Connected: 94
Local Clients Connected: 101
Authors Connected: 1
Lurkers Connected: 100
Sent Append messages: 63
Commits accepted by server: 62
Commits sent from Server to Client: 6200
Seconds test has been running for: 69

This created a 16 MB profile dump. I processed it with `node --prof-process isolate-0x2cba060-v8.log > processed.txt`, which threw up a bunch of errors but still produced a profile report.

Statistical profiling result from isolate-0x2cba060-v8.log, (98111 ticks, 599 unaccounted, 0 excluded).

 [Shared libraries]:
   ticks  total  nonlib   name
   2642    2.7%          /usr/bin/node
    722    0.7%          /usr/lib64/libpthread-2.17.so
     91    0.1%          /usr/lib64/libc-2.17.so
     48    0.0%          [vdso]
     19    0.0%          /usr/lib64/libstdc++.so.6.0.19

 [JavaScript]:
   ticks  total  nonlib   name
   9013    9.2%    9.5%  Builtin: ArrayIndexOfSmiOrObject
    559    0.6%    0.6%  Builtin: LoadIC
    407    0.4%    0.4%  Builtin: KeyedLoadIC_Megamorphic
    387    0.4%    0.4%  Builtin: StoreIC
    319    0.3%    0.3%  Builtin: KeyedStoreIC_Megamorphic
    314    0.3%    0.3%  Builtin: InterpreterEntryTrampoline
    221    0.2%    0.2%  Builtin: RegExpReplace
    213    0.2%    0.2%  Builtin: CallFunction_ReceiverIsAny
    204    0.2%    0.2%  LazyCompile: *parse url.js:152:37
    176    0.2%    0.2%  Builtin: ObjectAssign
    160    0.2%    0.2%  Builtin: KeyedLoadIC
    123    0.1%    0.1%  LazyCompile: *hasBinary /home/etherpad/embeddable/src/node_modules/has-binary2/index.js:30:20
    123    0.1%    0.1%  LazyCompile: *Socket.emit /home/etherpad/embeddable/src/node_modules/socket.io/lib/socket.js:140:33
    119    0.1%    0.1%  Builtin: RegExpPrototypeTest
    114    0.1%    0.1%  Builtin: StringAdd_CheckNone_NotTenured
    113    0.1%    0.1%  Builtin: Call_ReceiverIsAny
    109    0.1%    0.1%  Builtin: KeyedStoreIC
    101    0.1%    0.1%  Builtin: CEntry_Return1_DontSaveFPRegs_ArgvOnStack_NoBuiltinExit
     95    0.1%    0.1%  LazyCompile: *format url.js:571:39
     85    0.1%    0.1%  Builtin: StringIndexOf
     85    0.1%    0.1%  Builtin: CompileLazy
     80    0.1%    0.1%  LazyCompile: *getColorDepth internal/tty.js:75:23
     79    0.1%    0.1%  LazyCompile: *client.send /home/etherpad/embeddable/src/node/handler/SocketIORouter.js:72:27
     73    0.1%    0.1%  Builtin: HasProperty
     72    0.1%    0.1%  Builtin: RecordWrite
     71    0.1%    0.1%  Builtin: ArrayPrototypeSlice
 [C++]:
   ticks  total  nonlib   name
   6267    6.4%    6.6%  unibrow::Utf8::CalculateValue(unsigned char const*, unsigned long, unsigned long*)
   6175    6.3%    6.5%  void node::StreamBase::JSMethod<node::LibuvStreamWrap, &(int node::StreamBase::WriteString<(node::encoding)1>(v8::FunctionCallbackI$
   5958    6.1%    6.3%  v8::internal::Handle<v8::internal::String> v8::internal::JsonParser<false>::SlowScanJsonString<v8::internal::SeqTwoByteString, unsi$
   5299    5.4%    5.6%  epoll_pwait
   4587    4.7%    4.8%  unibrow::Utf8DecoderBase::WriteUtf16Slow(unsigned short*, unsigned long, v8::internal::Vector<char const> const&, unsigned long, bo$
   4398    4.5%    4.6%  unibrow::Utf8DecoderBase::Reset(unsigned short*, unsigned long, v8::internal::Vector<char const> const&)
   4080    4.2%    4.3%  node::(anonymous namespace)::DecodeData(v8::FunctionCallbackInfo<v8::Value> const&)
   3990    4.1%    4.2%  __bn_sqr8x_reduction
   2659    2.7%    2.8%  v8::internal::SlicedString::SlicedStringGet(int)
   2255    2.3%    2.4%  mul4x_internal
   1947    2.0%    2.1%  __memmove_ssse3_back
   1588    1.6%    1.7%  v8::internal::FindStringIndicesDispatch(v8::internal::Isolate*, v8::internal::String*, v8::internal::String*, std::vector<int, std:$
    700    0.7%    0.7%  node::fs::Read(v8::FunctionCallbackInfo<v8::Value> const&)
    571    0.6%    0.6%  _int_malloc
    570    0.6%    0.6%  node::contextify::ContextifyContext::CompileFunction(v8::FunctionCallbackInfo<v8::Value> const&)
    462    0.5%    0.5%  v8::internal::JsonParser<false>::ScanJsonString()
    427    0.4%    0.5%  node::EnvGetter(v8::Local<v8::Name>, v8::PropertyCallbackInfo<v8::Value> const&)
    402    0.4%    0.4%  v8::internal::String::SlowEquals(v8::internal::String*)
    385    0.4%    0.4%  __pthread_cond_signal
    375    0.4%    0.4%  v8::internal::String::LastIndexOf(v8::internal::Isolate*, v8::internal::Handle<v8::internal::Object>, v8::internal::Handle<v8::inte$
    353    0.4%    0.4%  __GI___libc_malloc
    349    0.4%    0.4%  _int_free
 [Summary]:
   ticks  total  nonlib   name
  18897   19.3%   20.0%  JavaScript
  75091   76.5%   79.4%  C++
   1911    1.9%    2.0%  GC
   3522    3.6%          Shared libraries
    599    0.6%          Unaccounted

 [C++ entry points]:
   ticks    cpp   total   name
  14636   25.3%   14.9%  v8::internal::Builtin_JsonParse(int, v8::internal::Object**, v8::internal::Isolate*)
  13474   23.3%   13.7%  v8::internal::Builtin_HandleApiCall(int, v8::internal::Object**, v8::internal::Isolate*)
   6267   10.8%    6.4%  unibrow::Utf8::CalculateValue(unsigned char const*, unsigned long, unsigned long*)
   4587    7.9%    4.7%  unibrow::Utf8DecoderBase::WriteUtf16Slow(unsigned short*, unsigned long, v8::internal::Vector<char const> const&, unsigned long, bo$
   4398    7.6%    4.5%  unibrow::Utf8DecoderBase::Reset(unsigned short*, unsigned long, v8::internal::Vector<char const> const&)
   2097    3.6%    2.1%  v8::internal::Runtime_StringSplit(int, v8::internal::Object**, v8::internal::Isolate*)
   1731    3.0%    1.8%  v8::internal::Builtin_JsonStringify(int, v8::internal::Object**, v8::internal::Isolate*)
   1544    2.7%    1.6%  v8::internal::Builtin_ErrorConstructor(int, v8::internal::Object**, v8::internal::Isolate*)
    589    1.0%    0.6%  v8::internal::Runtime_StringEqual(int, v8::internal::Object**, v8::internal::Isolate*)
    579    1.0%    0.6%  v8::internal::Runtime_LoadPropertyWithInterceptor(int, v8::internal::Object**, v8::internal::Isolate*)
    482    0.8%    0.5%  v8::internal::Runtime_CompileOptimized_Concurrent(int, v8::internal::Object**, v8::internal::Isolate*)
    463    0.8%    0.5%  v8::internal::Builtin_StringPrototypeLastIndexOf(int, v8::internal::Object**, v8::internal::Isolate*)
    318    0.5%    0.3%  v8::internal::Runtime_CompileLazy(int, v8::internal::Object**, v8::internal::Isolate*)
    316    0.5%    0.3%  __pthread_cond_signal
    249    0.4%    0.3%  v8::internal::Runtime_StringIndexOfUnchecked(int, v8::internal::Object**, v8::internal::Isolate*)
    211    0.4%    0.2%  v8::internal::Builtin_DateConstructor(int, v8::internal::Object**, v8::internal::Isolate*)
    194    0.3%    0.2%  v8::internal::Runtime_RegExpExec(int, v8::internal::Object**, v8::internal::Isolate*)
    186    0.3%    0.2%  int v8::internal::BinarySearch<(v8::internal::SearchMode)1, v8::internal::DescriptorArray>(v8::internal::DescriptorArray*, v8::inte$
    183    0.3%    0.2%  v8::internal::Runtime_StackGuard(int, v8::internal::Object**, v8::internal::Isolate*)
    160    0.3%    0.2%  v8::internal::SourcePositionTableIterator::Advance()
 [Bottom up (heavy) profile]:
  Note: percentage shows a share of a particular caller in the total
  amount of its parent calls.
  Callers occupying less than 1.0% are not shown.

   ticks parent  name
   9013    9.2%  Builtin: ArrayIndexOfSmiOrObject
   7298   81.0%    LazyCompile: *<anonymous> /home/etherpad/embeddable/node_modules/dirty/lib/dirty/dirty.js:156:25
   7298  100.0%      LazyCompile: *emit events.js:147:44
   7298  100.0%        LazyCompile: *lazyFs.read internal/fs/streams.js:160:61
   6851   93.9%          LazyCompile: *callback /home/etherpad/embeddable/src/node_modules/npm/node_modules/graceful-fs/polyfills.js:123:29
   6851  100.0%            LazyCompile: *wrapper fs.js:465:19
    447    6.1%          LazyCompile: ~callback /home/etherpad/embeddable/src/node_modules/npm/node_modules/graceful-fs/polyfills.js:123:29
    447  100.0%            LazyCompile: *wrapper fs.js:465:19
    784    8.7%    Builtin: ArrayIndexOf
    766   97.7%      LazyCompile: ~Dirty.set /home/etherpad/embeddable/node_modules/dirty/lib/dirty/dirty.js:41:31

I decided at this point to switch from dirty to MySQL... I re-ran the test, dumped the contents, and processed them to processed2.txt. During this I also re-ran the 1:420 test; CPU dropped from 100% to 10% with no latency. So the database was the restriction, meaning the file system was the restriction, but WHY? UeberDB is supposed to be caching these values in memory?

Load Test Metrics -- Target Pad https://embed.etherpad.com:9002/p/S1JaAoFBWH

Total Clients Connected: 420 Local Clients Connected: 421 Authors Connected: 1 Lurkers Connected: 420 Sent Append messages: 65 Commits accepted by server: 64 Commits sent from Server to Client: 24148 Seconds test has been running for: 85

New dumps looked like this:

Statistical profiling result from isolate-0x3c20060-v8.log, (14832 ticks, 191 unaccounted, 0 excluded).

 [Shared libraries]:
   ticks  total  nonlib   name
    289    1.9%          /usr/bin/node
     88    0.6%          /usr/lib64/libc-2.17.so
     58    0.4%          /usr/lib64/libpthread-2.17.so
      7    0.0%          [vdso]
      1    0.0%          /usr/lib64/libstdc++.so.6.0.19

 [JavaScript]:
   ticks  total  nonlib   name
    277    1.9%    1.9%  Builtin: LoadIC
    245    1.7%    1.7%  LazyCompile: *parse url.js:152:37
    205    1.4%    1.4%  Builtin: KeyedLoadIC_Megamorphic
    146    1.0%    1.0%  Builtin: InterpreterEntryTrampoline
    101    0.7%    0.7%  LazyCompile: *format url.js:571:39
     96    0.6%    0.7%  Builtin: StoreIC
     85    0.6%    0.6%  Builtin: RegExpReplace
     82    0.6%    0.6%  Builtin: RegExpPrototypeTest
     82    0.6%    0.6%  Builtin: KeyedStoreIC_Megamorphic

 [C++]:
   ticks  total  nonlib   name
   3439   23.2%   23.9%  epoll_pwait
    585    3.9%    4.1%  node::contextify::ContextifyContext::CompileFunction(v8::FunctionCallbackInfo<v8::Value> const&)
    562    3.8%    3.9%  void node::StreamBase::JSMethod<node::TLSWrap, &node::StreamBase::WriteBuffer>(v8::FunctionCallbackInfo<v8::Value> const&)
    443    3.0%    3.1%  __bn_sqr8x_reduction
    279    1.9%    1.9%  mul4x_internal
    171    1.2%    1.2%  __pthread_cond_signal
    112    0.8%    0.8%  void node::StreamBase::JSMethod<node::LibuvStreamWrap, &(int node::StreamBase::WriteString<(node::encoding)1>(v8::FunctionCallbackI$
    103    0.7%    0.7%  __GI_mprotect
    100    0.7%    0.7%  node::contextify::ContextifyScript::New(v8::FunctionCallbackInfo<v8::Value> const&)
     76    0.5%    0.5%  __memmove_ssse3_back
     68    0.5%    0.5%  node::fs::Open(v8::FunctionCallbackInfo<v8::Value> const&)
     66    0.4%    0.5%  _int_malloc
     63    0.4%    0.4%  v8::internal::Factory::NewFixedArrayWithFiller(v8::internal::Heap::RootListIndex, int, v8::internal::Object*, v8::internal::Pretenu$
     63    0.4%    0.4%  node::fs::LStat(v8::FunctionCallbackInfo<v8::Value> const&)
     54    0.4%    0.4%  _int_free

 [C++ entry points]:
   ticks    cpp   total   name
   2052   38.8%   13.8%  v8::internal::Builtin_HandleApiCall(int, v8::internal::Object**, v8::internal::Isolate*)
    313    5.9%    2.1%  v8::internal::Runtime_CompileLazy(int, v8::internal::Object**, v8::internal::Isolate*)
    301    5.7%    2.0%  v8::internal::Builtin_JsonStringify(int, v8::internal::Object**, v8::internal::Isolate*)
    240    4.5%    1.6%  v8::internal::Runtime_CompileOptimized_Concurrent(int, v8::internal::Object**, v8::internal::Isolate*)
    226    4.3%    1.5%  v8::internal::Builtin_JsonParse(int, v8::internal::Object**, v8::internal::Isolate*)
    152    2.9%    1.0%  __pthread_cond_signal
     99    1.9%    0.7%  v8::internal::Runtime_KeyedLoadIC_Miss(int, v8::internal::Object**, v8::internal::Isolate*)
     99    1.9%    0.7%  v8::internal::Runtime_ForInEnumerate(int, v8::internal::Object**, v8::internal::Isolate*)
     95    1.8%    0.6%  v8::internal::Runtime_StackGuard(int, v8::internal::Object**, v8::internal::Isolate*)
     94    1.8%    0.6%  v8::internal::Builtin_GlobalEncodeURIComponent(int, v8::internal::Object**, v8::internal::Isolate*)
     66    1.2%    0.4%  v8::internal::Runtime_StringIndexOfUnchecked(int, v8::internal::Object**, v8::internal::Isolate*)
     66    1.2%    0.4%  v8::internal::Runtime_StoreIC_Miss(int, v8::internal::Object**, v8::internal::Isolate*)
     63    1.2%    0.4%  v8::internal::StringTable::LookupStringIfExists_NoAllocate(v8::internal::String*)
     63    1.2%    0.4%  v8::internal::Runtime_StringSplit(int, v8::internal::Object**, v8::internal::Isolate*)
     57    1.1%    0.4%  v8::internal::Runtime_LoadIC_Miss(int, v8::internal::Object**, v8::internal::Isolate*)
     56    1.1%    0.4%  v8::internal::Builtin_FunctionConstructor(int, v8::internal::Object**, v8::internal::Isolate*)
     50    0.9%    0.3%  __GI___xstat
     40    0.8%    0.3%  v8::internal::Runtime_StringCharCodeAt(int, v8::interna

 [Bottom up (heavy) profile]:
  Note: percentage shows a share of a particular caller in the total
  amount of its parent calls.
  Callers occupying less than 1.0% are not shown.

   ticks parent  name
   3439   23.2%  epoll_pwait

    585    3.9%  node::contextify::ContextifyContext::CompileFunction(v8::FunctionCallbackInfo<v8::Value> const&)
    585  100.0%    v8::internal::Builtin_HandleApiCall(int, v8::internal::Object**, v8::internal::Isolate*)
    585  100.0%      LazyCompile: ~Module._compile internal/modules/cjs/loader.js:705:37
    585  100.0%        LazyCompile: ~Module._extensions..js internal/modules/cjs/loader.js:787:37
    585  100.0%          LazyCompile: ~Module.load internal/modules/cjs/loader.js:645:33
    585  100.0%            LazyCompile: ~tryModuleLoad internal/modules/cjs/loader.js:590:23

    562    3.8%  void node::StreamBase::JSMethod<node::TLSWrap, &node::StreamBase::WriteBuffer>(v8::FunctionCallbackInfo<v8::Value> const&)
    562  100.0%    v8::internal::Builtin_HandleApiCall(int, v8::internal::Object**, v8::internal::Isolate*)
    383   68.1%      RegExp: ^^@ {1}
    383  100.0%        LazyCompile: *Socket._write net.js:725:35
    332   86.7%          LazyCompile: *Writable.write _stream_writable.js:273:36
    201   60.5%            LazyCompile: *send /home/etherpad/embeddable/src/node_modules/engine.io/lib/transports/websocket.js:97:17
    131   39.5%            LazyCompile: *send /home/etherpad/embeddable/src/node_modules/ws/lib/WebSocket.js:352:8
     22    5.7%          LazyCompile: ~doWrite _stream_writable.js:405:17
     22  100.0%            LazyCompile: ~writeOrBuffer _stream_writable.js:365:23
     17    4.4%          LazyCompile: *writeOrBuffer _stream_writable.js:365:23
     12   70.6%            LazyCompile: *send /home/etherpad/embeddable/src/node_modules/ws/lib/Sender.js:266:8
      5   29.4%            LazyCompile: ~Writable.write _stream_writable.js:273:36
     12    3.1%          LazyCompile: *clearBuffer _stream_writable.js:500:21
     11   91.7%            LazyCompile: *onwrite _stream_writable.js:450:17
      1    8.3%            LazyCompile: ~onwrite _stream_writable.js:450:17

So I guess I should redo all my tests with MySQL enabled... :P

Load Test Metrics -- Target Pad https://embed.etherpad.com:9002/p/ooN4XrgwuG

Total Clients Connected: 142
Local Clients Connected: 144
Authors Connected: 36
Lurkers Connected: 108
Sent Append messages: 3209
Commits accepted by server: 3108
Commits sent from Server to Client: 290474
Number of commits not yet replied as ACCEPT_COMMIT from server 101

Single device to Server test.

Test 2 re-run w/ MySQL

Load Test Metrics -- Target Pad https://embed.etherpad.com:9002/p/R033xllA4M

Total = 218754

Total Clients Connected: 129
Local Clients Connected: 130
Authors Connected: 33
Lurkers Connected: 97
Sent Append messages: 2680
Commits accepted by server: 2579
Commits sent from Server to Client: 218754
Number of commits not yet replied as ACCEPT_COMMIT from server 101

1 Local Client to Local Server w/ MySQL - NODE_ENV == development

TOTAL = 262355

Load Test Metrics -- Target Pad https://embed.etherpad.com:9002/p/ucJKBQ2MqN

Clients Connected: 130
Authors Connected: 33
Lurkers Connected: 97
Sent Append messages: 3076
Commits accepted by server: 2975
Commits sent from Server to Client: 262355
Number of commits not yet replied as ACCEPT_COMMIT from server 101

1 Local Client to Local Server w/ MySQL - NODE_ENV == production

Load Test Metrics -- Target Pad https://embed.etherpad.com:9002/p/kdN2SMfDy3

Clients Connected: 139
Authors Connected: 35
Lurkers Connected: 104
Sent Append messages: 3319
Commits accepted by server: 3218
Commits sent from Server to Client: 299389
Number of commits not yet replied as ACCEPT_COMMIT from server 10

300k... wtf.

Switched back to Dirty, expecting ~600k commits.

Load Test Metrics -- Target Pad https://embed.etherpad.com:9002/p/dqoSh0sQ7V

Clients Connected: 135
Authors Connected: 33
Lurkers Connected: 102
Sent Append messages: 3336
Commits accepted by server: 3235
Commits sent from Server to Client: 300189
Number of commits not yet replied as ACCEPT_COMMIT from server 101

No, same, about 300k...

On my 1 GB VM I'm getting ~600k with Redis as the backend.

Running the same test with dirty...

550k and 512k with dirty, so an average of ~530k. Possibly due to the new DB?

Switching back to Redis, on a new DB.

Redis reports ~500k, so no change. Dirty doesn't appear to be the restriction.

I'm scratching my head a bit, might need to sleep on this.

Running the client on my laptop (not in a VM; Windows 10, Node 12.6), pointing it at the 1 GB VM, I get:

Load Test Metrics -- Target Pad http://192.168.1.48:9001/p/PL7L3fuhDO

Local Clients Connected: 132
Authors Connected: 33
Lurkers Connected: 99
Sent Append messages: 2787
Commits accepted by server: 2686
Commits sent from Server to Client: 233141
Number of commits not yet replied as ACCEPT_COMMIT from server 101

So is the client the limiting factor here? I'm going to target two users at the same pad.

I'm taking a break for today. It's been a bit of a confusing set of numbers, but overall things look fairly okay, except that a single client, if not rate limited, can DoS the server by simulating 400 lurkers... 1.9 will implement rate limiting, so I guess now we can figure out how many messages per second seems fair per IP address?
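To frame that question, here's a minimal per-IP throttle sketch (this is not the actual rate-limiting PR, just an illustration of the shape of the knob we'd be tuning; the cap value is a placeholder):

```js
// Count messages per IP in one-second windows; reject anything over the cap.
const MAX_MSGS_PER_SECOND_PER_IP = 10; // placeholder value, to be decided
const counters = new Map();

setInterval(() => counters.clear(), 1000); // reset the window every second

function allowMessage(ip) {
  const count = (counters.get(ip) || 0) + 1;
  counters.set(ip, count);
  return count <= MAX_MSGS_PER_SECOND_PER_IP;
}
```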

JohnMcLear commented 4 years ago

Day 2. I did some reading this morning: SocketIO essentially has a hard limit of ~10k messages per second.

http://drewww.github.io/socket.io-benchmarking/

the max messages-sent-per-second rate is around 9,000–10,000 depending on the concurrency level

I added a metric to the load test output showing when we are hitting the SocketIO limit.
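Roughly, the bookkeeping behind those new per-second values looks like this (a sketch of the idea only; the actual etherpad-load-test code may differ):

```js
// Track total, current-second, mean and max commits-per-second figures.
let total = 0;
let thisSecond = 0;
let maxPerSecond = 0;
let meanPerSecond = 0;
let secondsElapsed = 0;

function onCommitFromServer() {
  total++;
  thisSecond++;
}

setInterval(() => {
  secondsElapsed++;
  maxPerSecond = Math.max(maxPerSecond, thisSecond);
  meanPerSecond = Math.round(total / secondsElapsed);
  thisSecond = 0;
}, 1000);
```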

All tests were done w/ 1 GB RAM on a VM w/ a 2.7 GHz CPU.

Load Test Metrics -- Target Pad http://192.168.1.48:9001/p/kRBzLAeSfI

Total Clients Connected: 157
Local Clients Connected: 156
Authors Connected: 39
Lurkers Connected: 117
Sent Append messages: 3913
Commits accepted by server: 3812
Commits sent from Server to Client: 393671
Current rate per second of Commits sent from Server to Client: 0
Mean(per second) of # of Commits sent from Server to Client: 1832
Max(per second) of # of Messages (SocketIO has cap of 10k): 10415
Number of commits not yet replied as ACCEPT_COMMIT from server 101

Things to note:

JohnMcLear commented 4 years ago

So now that we know what we need to know, we can start thinking about how Etherpad can be changed to work within these restrictions.

Before I propose changes I think it's worth stating just how little 10k messages per second is. When you have 100 lurkers, that's only 10 edits per second. That is nothing. We're testing 39 edits per second with 117 lurkers, which is beyond the theoretical limit.
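For clarity, here is the fan-out relationship I'm working from (my reading of the numbers above, not code from the tool): every commit accepted by the server is broadcast to every connected client on the pad, so the per-second message budget bites harder as the audience grows.

```js
const SOCKETIO_CAP = 10000; // msgs/sec, per the benchmark linked above

// messages generated per second by one pad
const messagesPerSecond = (commitsPerSecond, clientsOnPad) =>
  commitsPerSecond * clientsOnPad;

// maximum sustainable commit rate for a given audience size
const maxCommitsPerSecond = (clientsOnPad) => SOCKETIO_CAP / clientsOnPad;
```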

One limit you could set is a maximum of 10k users, because theoretically, if each user were on their own pad writing one edit per second, Etherpad could support this. You can probably halve this number to 5k users just for sanity.

We could also say that each pad should limit total users to 100 (which is super high), but that's actually kind of dumb because we know we can support 1:400...

The best thing might be to say "once we're at X messages being sent from SocketIO per second, reject new connections". I'd say a safe limit is somewhere in the region of ~5k, and this could be modified/adjusted by an admin...
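A minimal sketch of that idea (not actual Etherpad code; the wiring and counter are assumptions for illustration):

```js
const http = require('http');
const io = require('socket.io')(http.createServer().listen(9001));

const MAX_MESSAGES_PER_SECOND = 5000; // suggested safe default, admin-adjustable

let sentThisSecond = 0;                 // incremented wherever messages are emitted
setInterval(() => { sentThisSecond = 0; }, 1000);

io.use((socket, next) => {
  if (sentThisSecond > MAX_MESSAGES_PER_SECOND) {
    // Already saturating SocketIO; turn new connections away.
    return next(new Error('Server is at capacity, please retry later'));
  }
  next();
});
```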

Another option is to look at replacing SocketIO with ws, which IMHO doesn't really solve the problem, as it's probable that 5k per SocketIO thread is fine.

TODO

  1. Look to see if we can get/set max socket messages per second value and reject new socketIO connections if we're at a set max value. -- Note that we have a PR in for rate limiting.
  2. Look to see if ws is an easy enough drop in.
  3. Consider the long-term solutions here. We recommend doing "sticky" per-pad sharding: if Etherpad Server 1 is getting X connections, spin up an additional VM and shard new pads onto that new server. The database can be shared; each shard would basically be there just to support the new sockets.
JohnMcLear commented 4 years ago

On my Windows machine:

Load Test Metrics -- Target Pad http://127.0.0.1:9001/p/ExUrRM3KNP

Local Clients Connected: 120
Authors Connected: 30
Lurkers Connected: 90
Sent Append messages: 1399
Commits accepted by server: 1383
Commits sent from Server to Client: 164489
Current rate per second of Commits sent from Server to Client: 0
Mean(per second) of # of Commits sent from Server to Client: 8354
Max(per second) of # of Messages (SocketIO has cap of 10k): **22515**
Number of commits not yet replied as ACCEPT_COMMIT from server 16
Seconds test has been running for: 21

This seems to be the maximum I can achieve as far as messages per second on Windows. Interestingly, I can accomplish ~1k lurkers to 1 author on this machine.

An update on TODO:

  1. Still to do.
  2. Not really.
  3. I think some sort of padId-based session <> load balancing setup will work (see the sketch below). If we did it for one load balancer software then in theory other people can port it to others. I think just one is within the scope of work here...
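As a hypothetical example of what padId-based stickiness could look like (not an existing Etherpad or load-balancer feature, just the routing idea): hash the pad name so every client of a given pad lands on the same backend instance.

```js
const crypto = require('crypto');

// Example shard list; in practice this would come from the load balancer config.
const backends = ['http://etherpad-1:9001', 'http://etherpad-2:9001'];

function backendForPad(padId) {
  const hash = crypto.createHash('sha1').update(padId).digest();
  return backends[hash.readUInt32BE(0) % backends.length];
}
```
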
JohnMcLear commented 4 years ago

https://github.com/hashrocket/websocket-shootout/blob/master/results/round-01.md is an interesting read,

as is https://www.hackdoor.io/articles/6xQkgQo4/differences-between-websockets-and-socketio