Nebukadneza closed this issue 5 years ago
Can we first look into possible performance lags in CodiMD itself? Can you tell us more about which situations cause the lag? For example, many people editing one document, many documents open at the same time, …
Besides that, just out of interest: do you use useCDN = true?
I would like to get some insight here to make CodiMD faster in general, which may make such a config option obsolete.
Hej there,
As always, thanks a lot for your quick reaction!
I have useCDN = false, but that is only relevant for the client loading static resources, right? Of course the requests for them come in within a short period of time, but they should be rather easy to fulfil compared to the actually functional stuff, right?
The situation is as simple as it is dire: a single user (for the first time) loading a single document or the frontpage, with the rest of the CodiMD instance idle. Honestly, CodiMD is not at fault here; the whole box just has far too high a load, especially on the disk. However, since the fix is so simple (increase that one variable, maxLag of toobusy-js), I thought I’d make an issue for this nevertheless.
However, if there’s anything I could do to gain some more insight for you, please advise me on some details, and I’ll gladly check ^_^.
Best & Thanks -Dario
I asked about useCDN = true for an actual performance reason, which matters especially when you have a disk latency problem: Node.js is single-threaded, which means the process has to spend its own time reading and writing your data. The static files used with useCDN = false are around 7MB, and they are read and sent out every time someone connects. With useCDN = true it's less than 3MB, without any decrease in security since we use SRI hashes.
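To illustrate why SRI hashes keep CDN-hosted assets safe, here is a minimal sketch of how such a hash can be computed with Node's built-in crypto module. The helper name sriHash and the sample content are assumptions made for this example; this is not CodiMD's build code.

```javascript
const crypto = require('crypto');

// Hypothetical helper: compute a Subresource Integrity value for some content.
// The browser recomputes this digest for the downloaded file and refuses to
// run the file if the result differs from the integrity attribute.
function sriHash(content, algorithm = 'sha384') {
  const digest = crypto.createHash(algorithm).update(content).digest('base64');
  return `${algorithm}-${digest}`;
}

// Example: the value a page would place in <script integrity="...">.
console.log(sriHash('console.log("hello");'));
```

Because the expected hash ships with the page itself, a tampered file on the CDN would simply fail to load.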
So yes, it can keep your node process busy and block it from doing more useful stuff. An alternative is to serve the ./public/ subdirectories using your reverse proxy (if you use nginx or Apache, which are able to do that).
I may provide you with a patch later that you can apply to collect some performance information, but I have to work that one out first. This should help us see where high load has an impact on the system.
In my case, I get toobusy-js 503 Service Unavailable errors every time after a restart. This makes sense, because at this point most of the code has not yet been read from disk by V8. This batch of reads from disk generally takes a lot more than 70 ms.
Also, I'm using useCDN = false (reasons are user privacy and availability).
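The startup 503s described above follow from how an event-loop lag check works. The sketch below is a simplified model of that idea, not toobusy-js's actual code: the class name LagMonitor and the 500 ms sampling interval are assumptions for illustration, and the 70 ms threshold is the default mentioned in this thread.

```javascript
// Simplified model of an event-loop lag check (not toobusy-js's real code).
class LagMonitor {
  constructor({ maxLagMs = 70, intervalMs = 500 } = {}) {
    this.maxLagMs = maxLagMs;     // threshold above which we report "too busy"
    this.intervalMs = intervalMs; // how often the timer is scheduled to fire
    this.lastTick = null;
    this.lagMs = 0;
  }

  // Record that the timer callback ran at `nowMs` and return the measured lag.
  tick(nowMs) {
    if (this.lastTick !== null) {
      // Time beyond the scheduled interval is treated as event-loop lag.
      this.lagMs = Math.max(0, nowMs - this.lastTick - this.intervalMs);
    }
    this.lastTick = nowMs;
    return this.lagMs;
  }

  isTooBusy() {
    return this.lagMs > this.maxLagMs;
  }

  // Sample the real event loop; unref() keeps the timer from holding the process open.
  start() {
    const timer = setInterval(() => this.tick(Date.now()), this.intervalMs);
    timer.unref();
    return timer;
  }
}
```

If a blocking batch of disk reads delays the timer callback by 200 ms, tick() reports a lag of 200 ms and isTooBusy() returns true until a later sample comes in on time.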
If one of you wants to give some more insight, it would be awesome if you could start the node process with the flag --prof. This will generate some files within the working directory, and their output can be processed using node --prof-process isolate-0xXXXXXXXXXXXX-v8.log
It provides various statistics about how much time node spends in which part of the code.
Some more details: https://nodejs.org/en/docs/guides/simple-profiling/
This profile was recorded over only about 3 page loads.
[Summary]:
   ticks  total  nonlib   name
     211  15.0%   16.1%   JavaScript
    1026  73.0%   78.2%   C++
      91   6.5%    6.9%   GC
      94   6.7%           Shared libraries
      75   5.3%           Unaccounted
Looks like our codebase is quite efficient. It only appears 5 times in the profile, and each of those is done within one CPU tick. So our codebase is not the problem. Yay, good news.
So it seems like all we really need to do is add an option. I'll propose a PR. Keep in mind this will go into 1.4.0.
Hi,
Sorry, it seems I’m a little late now. One thing: indeed, serving the static assets with Apache helped a lot. The busy notifications have become much less frequent, but they still occur.
As for the profiling, I let it run for a whole day, with the result that nodejs segfaults(!) when parsing it. A shorter run with a few users and a few page loads looks very similar to @dsprenkels’ results, so I’ll save you the hassle ….
Indeed, I’ll be gladly waiting for a PR. Thanks for taking this corner-case issue so seriously, @SISheogorath !
Best and thanks a lot! -Dario
One observation: notice that ~25% of the ticks are in node::(anonymous namespace)::ContextifyScript::New(v8::FunctionCallbackInfo<v8::Value> const&), which I think is basically just JIT compilation.
The maxLag of toobusy is now configurable thanks to PR https://github.com/hackmdio/codimd/pull/1239
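As a rough illustration of what a configurable threshold can look like, the sketch below reads the value from the environment and falls back to the 70 ms default mentioned in this thread. The variable name TOOBUSY_MAX_LAG_MS and the helper resolveMaxLag are hypothetical and are not taken from the PR; check the PR itself for the actual option name.

```javascript
const DEFAULT_MAX_LAG_MS = 70; // default threshold mentioned in this thread

// Hypothetical helper: resolve the lag threshold from an environment-like object.
function resolveMaxLag(env = process.env) {
  const raw = env.TOOBUSY_MAX_LAG_MS; // hypothetical variable name
  if (raw === undefined || raw === '') return DEFAULT_MAX_LAG_MS;
  const parsed = Number(raw);
  // Fall back to the default on non-numeric or non-positive input.
  if (!Number.isFinite(parsed) || parsed <= 0) return DEFAULT_MAX_LAG_MS;
  return Math.round(parsed);
}

console.log(resolveMaxLag({ TOOBUSY_MAX_LAG_MS: '200' }));
```

The resolved value could then be handed to toobusy-js through its maxLag() setter during application startup.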
Thanks everyone!
Hey there,
I’m running CodiMD happily on a slightly aged server … happily? Well, almost! Sometimes, especially when some other services decide they need to thrash the poor little DB hard, CodiMD reacts somewhat sluggishly. In fact so sluggishly that the default maxLag of 70ms (?) of toobusy-js often kicks in. This is slightly unnerving to users, especially on mobile, where some browsers inhibit force-reload for some time after a failed load.
It would be great if maxLag were configurable in some way for users like me. Of course, I guess we’ll die out eventually, replacing our old, steam-coughing servers with shiny new serverless-hybrid-hyperconverged-cloudnative-supercomputing instance-containers, but … until then, it’d be great if we could help ourselves via configuration instead of code changes ^_^.
Best, and thanks for the great project! -Dario