Open mmeeks opened 1 year ago
Profile seems to suggest that we are serializing some vast area - quite probably a whole spreadsheet selection - and that this is creating lots of columns and so on - which is - far from great.
Lets get the document from Pedro & can you look at this @eszkadev ?
I cannot reproduce case where getTextSelection takes significant piece of a profile. Tested with large selections and others modifying the content.
I think in this ticket we see long-term profile where most of the operations were selection changes, people were clicking here and there without typing or any other actions - so that is why we see it bigger.
In my profiles where editing happens bigger problem is rendering of the tiles or full column selection.
Hmm - I strongly suspect the memory blow-out and swapping which matches that profile and the very significant memory use can only come from that. I imagine that it is specific to Pedro's document which oddly has all 16k columns allocated for me too (causing all manner of problems =)
wrt:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 32675 cool 20 0 20.003t 2.904g 343088 R 103.0 9.263 6:31.37 kitbroker_012
the startling virt num indicates an asan build, for a fresh blank spreadsheet under asan I get
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 11532 cool 20 0 20.003t 2.763g 318000 S 0.000 8.816 0:05.36 kitbroker_003
before we do anything.
I think this was it :
This weeks calc test looks like:
I suspect we should break out the performance problems here that can't be seen in the profile into their own tickets:
And bits we can see in the profile:
I guess the most interesting bits are the user interaction / invalidation problems - we should get a machine setup with no asan for profiling I think.
When I do:
But I also see:
This weeks flamegraph for calc
We should really look at that view switching cost: doc_setView should be the moral equivalent of a single pointer change it's strobed at very high frequency =) Instead in this profile it unzips, and de-compresses a (small) PNG - worth checking if it does it every time - but we should be able to save some 30%+ of our paintPartTile cost there I think :-)
It is also interesting how little the thread pool does to help with the compression and delta generation; I'd like to hope this is because calc is good at only invalidating one tile as we type into a cell - so that is done in the main-thread but ... quite possibly there's a snafu there too.
This weeks flamegraph for calc
This weeks calc's perf flamegraph
https://github.com/CollaboraOnline/online/issues/7337 seems to have worked and at least that has shrunk
Clearly the Calc navigator is out of control in multiview documents wrt doc_setView triggering such much updates. I'll look into that.
Two framegraphs this week: The first for the mostly inactive session until the crash which is visible in profiling
The second one on the restart to end:
Interesting: the mouse event processing seems to have an un-wanted inferior main-loop from chart::ChartControlller::executeDlg_ObjectProperties_withoutUndoGuard - which needs to be async'd:
And of course the amount of work done in doc_setView - to update the UI / notebookbar is visible everywhere; I'll file a separate ticket for that.
@grandinj This ticket is probably interesting for you with its weekly feed of profiling porn :-)
related to @caolanm flamegraphs at comment https://github.com/CollaboraOnline/online/issues/6893#issuecomment-1759411348_ I have created https://github.com/CollaboraOnline/online/pull/7430 to fix a small perf problem.
This weeks calc flamegraph. We're skewed by the threaded formula thing but expanding kitbroker_006 to look at that alone at least we don't have the ridiculous constant creation of a WeldedNotebookbar of last week noted as https://github.com/CollaboraOnline/online/issues/7428 though still too much going on in doc_setView
pre-crash calc profile:
looks like:
a) MergeCells is not an async dialog b) crash is visible as coming from ScDocShell::MoveTable
post-crash calc profile:
The problem of https://github.com/CollaboraOnline/online/issues/7428 is not totally gone, but the cost of it is less now without the notebookbar torn down and recreated for dark mode switching so there seems likely a relation to that in the earlier profiling
Got an apparent modal SvxCharacterMap dialog, some hefty CalendarWrapper work and spell check work.
(1) I note that hunspell is spending a lot of time just calling clock(), which seems odd. Perhaps it should be using clock_gettime(CLOCK_MONOTONIC) instead, to use the fancy mapping trick that Linux does and avoid a syscall.
(2) we seem to be spending a lot of time just figuring out which dictionary to use, not even doing actual dictionary loading or actual spell-checking. Surely that should be cached somewhere?
possible fix for the hunspell issue is here: https://gerrit.libreoffice.org/c/core/+/158809
https://gerrit.libreoffice.org/c/core/+/158830 for the i18npool::Calendar_jewish::mapFromGregorian issue, its nearly always 1999-12-31 so we can just add a fast path for that
This weeks calc profile:
For reference in today's bespoke tests: First session:
@hcvcastro The ScDBData::ExtendBackColorArea is a pretty recent new method I believe, since https://cgit.freedesktop.org/libreoffice/core/commit/?id=768433f07873eb608837630f85e7e1b375239fca Seems to take up quite a bit of time.
Second session:
This week
IRC log in case someone wants to look at the ScPositionHelper thing, I dont know what is going on there:
This weeks. No crashes, so 1h. ScTabViewShell::ExecuteTable has some non-async dialogs still
The calloc in the graph is now at 5% and is unnecessary - no need to memset() there, the data will be completely override during the delta generation process.
If we can safely rid ourselves of the calloc that's great of course =)
Todays little bespoke test, I wonder about ExtendBackColorArea, I only clicked that AutoFilter button two or three times, but it takes up quite a wide part
Just for the record, but poor utility for perf, a hang on save bug dominates
Today's "share" deployment profile
Not sure if https://gerrit.libreoffice.org/c/core/+/160784 is a legitimate optimization for the ScColContainer::resize seen from ScDocument::GetDocStat
Todays profile as far as crash
After:
Crash in 1st profile can be seen as chart::ChartModel::lockController
FWIW SalLayoutGlyphsCache::self has 20000 as the max size of this std::list. While I would expect this to be c++11 std::list I wonder if because these binaries are built with "Red Hat Developer Toolset 10" given /opt/rh/devtoolset-10 in the binaries if they end up built in some C++03 compatible ABI mode.
https://gerrit.libreoffice.org/c/core/+/161105 for that thought
Wow, std::list is an horribly efficient pointer-chasing nightmare surely that shouldn't be used on modern machines =) @grandinj have we looked std::vector there or something else more efficient; I assume it was never intended for 20k items (?) or our access pattern. Seems rather likely that we could have a monotonic 'generation' id on each item, and a high/low watermark for scanning & pruning and removing the oldest things (?) anyhow a fertile place at 25% of the cost of calc to optimize I think =)
Probably not particularly useful as only the tail end of session with 3 participants, but I'll attach anyway.
Well, half of the time in the DeltaGenerator again is perhaps a good sign that nothing more silly is going on (or we get way too many invalidations ;-) - although its really quite surprising the performance of that compared to the (much more complex) rendering. I suspect we should track invalidations vs. tiles and copy across unmodified horizontal lines wherever we can to improve life here without RLE'ing them again =)
Todays calc profile
We are spending an awful lot of time in memset when doing ScGridWindow::PaintTile. Looking at the code, I think that this
ScTableInfo aTabInfo(nEndRow + 3);
is likely to be the problem, especially when scrolling down in documents with lots of rows, because it will an array of RowInfo objects of size nEndRow + 3.
We can probably split that into two structures - one set of RowInfo objects for the rows before the tile starts, and one set of rows for the tile itself. For the rows before the tile starts, we likely need much less info.
ah, makes more sense than my vague thought about the possible RTL bitmap swap at the end of ScGridWindow::PaintTile
Wow - that does look silly, if nEndRow is one million with a large sheet area (nTiledRenderingAreaEndRow) - or a single bad cell at the bottom, that's quite an expense if you're looking at the bottom of the sheet; Surely we only really need row information close to hand for the range of cells we are rendering for, and not this =)
Todays calc which effectively hung, with a different spreadsheet than we typically use
This weeks calc session including crash
Interesting protocol thrashing here: https://github.com/CollaboraOnline/online/issues/8159 from some of this session.
For ease of finding later, I'll add a link to these Calc flamegraphs too:
and some Calc/performance notes too:
first part of the calc session
Editing just now on staging with a larger group we got a number of problems with calc performance that should be improved.
Oddly we were getting large memory consumption for a smallish spreadsheet too - perhaps related to clipboads:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 32675 cool 20 0 20.003t 2.904g 343088 R 103.0 9.263 6:31.37 kitbroker_012
which is really unexpected.