CollaboraOnline / online

Collabora Online is a collaborative online office suite based on LibreOffice technology. This is also the source for the Collabora Office apps for iOS and Android.
https://collaboraonline.com
Other
1.77k stars 679 forks source link

Generic Calc performance ticket #6893

Open mmeeks opened 1 year ago

mmeeks commented 1 year ago

Editing just now on staging with a larger group we got a number of problems with calc performance that should be improved.

perf-23 05-calc

Oddly we were getting large memory consumption for a smallish spreadsheet too - perhaps related to clipboads:

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 32675 cool 20 0 20.003t 2.904g 343088 R 103.0 9.263 6:31.37 kitbroker_012

which is really unexpected.

mmeeks commented 1 year ago

Profile seems to suggest that we are serializing some vast area - quite probably a whole spreadsheet selection - and that this is creating lots of columns and so on - which is - far from great.

mmeeks commented 1 year ago

Lets get the document from Pedro & can you look at this @eszkadev ?

eszkadev commented 1 year ago

I cannot reproduce case where getTextSelection takes significant piece of a profile. Tested with large selections and others modifying the content.

I think in this ticket we see long-term profile where most of the operations were selection changes, people were clicking here and there without typing or any other actions - so that is why we see it bigger.

In my profiles where editing happens bigger problem is rendering of the tiles or full column selection.

mmeeks commented 1 year ago

Hmm - I strongly suspect the memory blow-out and swapping which matches that profile and the very significant memory use can only come from that. I imagine that it is specific to Pedro's document which oddly has all 16k columns allocated for me too (causing all manner of problems =)

caolanm commented 1 year ago

wrt:

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 32675 cool 20 0 20.003t 2.904g 343088 R 103.0 9.263 6:31.37 kitbroker_012

the startling virt num indicates an asan build, for a fresh blank spreadsheet under asan I get

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 11532 cool 20 0 20.003t 2.763g 318000 S 0.000 8.816 0:05.36 kitbroker_003

before we do anything.

pedropintosilva commented 1 year ago

I think this was it :

caolanm commented 1 year ago

This weeks calc test looks like: perf-calc

mmeeks commented 1 year ago

I suspect we should break out the performance problems here that can't be seen in the profile into their own tickets:

And bits we can see in the profile:

I guess the most interesting bits are the user interaction / invalidation problems - we should get a machine setup with no asan for profiling I think.

caolanm commented 1 year ago

When I do:

But I also see:

caolanm commented 1 year ago

This weeks flamegraph for calc perf-staging-calc-2023-08-24

mmeeks commented 1 year ago

We should really look at that view switching cost: doc_setView should be the moral equivalent of a single pointer change it's strobed at very high frequency =) Instead in this profile it unzips, and de-compresses a (small) PNG - worth checking if it does it every time - but we should be able to save some 30%+ of our paintPartTile cost there I think :-)

It is also interesting how little the thread pool does to help with the compression and delta generation; I'd like to hope this is because calc is good at only invalidating one tile as we type into a cell - so that is done in the main-thread but ... quite possibly there's a snafu there too.

caolanm commented 11 months ago

This weeks flamegraph for calc

perf-8556

caolanm commented 11 months ago

This weeks calc's perf flamegraph

perf-22435

https://github.com/CollaboraOnline/online/issues/7337 seems to have worked and at least that has shrunk

caolanm commented 11 months ago

Clearly the Calc navigator is out of control in multiview documents wrt doc_setView triggering such much updates. I'll look into that.

caolanm commented 11 months ago

Two framegraphs this week: The first for the mostly inactive session until the crash which is visible in profiling perf-19099

The second one on the restart to end: perf-21240

mmeeks commented 11 months ago

Interesting: the mouse event processing seems to have an un-wanted inferior main-loop from chart::ChartControlller::executeDlg_ObjectProperties_withoutUndoGuard - which needs to be async'd:

image

And of course the amount of work done in doc_setView - to update the UI / notebookbar is visible everywhere; I'll file a separate ticket for that.

mmeeks commented 11 months ago

@grandinj This ticket is probably interesting for you with its weekly feed of profiling porn :-)

grandinj commented 11 months ago

related to @caolanm flamegraphs at comment https://github.com/CollaboraOnline/online/issues/6893#issuecomment-1759411348_ I have created https://github.com/CollaboraOnline/online/pull/7430 to fix a small perf problem.

caolanm commented 11 months ago

This weeks calc flamegraph. We're skewed by the threaded formula thing but expanding kitbroker_006 to look at that alone at least we don't have the ridiculous constant creation of a WeldedNotebookbar of last week noted as https://github.com/CollaboraOnline/online/issues/7428 though still too much going on in doc_setView

perf-14321

caolanm commented 10 months ago

pre-crash calc profile:

perf-19984

looks like:

a) MergeCells is not an async dialog b) crash is visible as coming from ScDocShell::MoveTable

post-crash calc profile:

perf-21534

The problem of https://github.com/CollaboraOnline/online/issues/7428 is not totally gone, but the cost of it is less now without the notebookbar torn down and recreated for dark mode switching so there seems likely a relation to that in the earlier profiling

caolanm commented 10 months ago

perf-15373

Got an apparent modal SvxCharacterMap dialog, some hefty CalendarWrapper work and spell check work.

grandinj commented 10 months ago

(1) I note that hunspell is spending a lot of time just calling clock(), which seems odd. Perhaps it should be using clock_gettime(CLOCK_MONOTONIC) instead, to use the fancy mapping trick that Linux does and avoid a syscall.

(2) we seem to be spending a lot of time just figuring out which dictionary to use, not even doing actual dictionary loading or actual spell-checking. Surely that should be cached somewhere?

grandinj commented 10 months ago

possible fix for the hunspell issue is here: https://gerrit.libreoffice.org/c/core/+/158809

caolanm commented 10 months ago

https://gerrit.libreoffice.org/c/core/+/158830 for the i18npool::Calendar_jewish::mapFromGregorian issue, its nearly always 1999-12-31 so we can just add a fast path for that

caolanm commented 10 months ago

This weeks calc profile: perf-6686

caolanm commented 10 months ago

For reference in today's bespoke tests: First session: perf-27903

@hcvcastro The ScDBData::ExtendBackColorArea is a pretty recent new method I believe, since https://cgit.freedesktop.org/libreoffice/core/commit/?id=768433f07873eb608837630f85e7e1b375239fca Seems to take up quite a bit of time.

Second session: perf-29178

caolanm commented 9 months ago

perf-18345

This week

grandinj commented 9 months ago

IRC log in case someone wants to look at the ScPositionHelper thing, I dont know what is going on there:

void ScGridWindow::PaintTile( VirtualDevice& rDevice, - doing a big/slow __memset is unexpected - particularly that the size is larger than the pixel-size of the tile - which is quite large ... * mmeeks wonders if some big hash-table or STL type is fiddled with there. @noel_grandin: seems there is a ton of work done on HandleMouseEvent =) 12.5% of total time handling mouse events seems extraordinary =) a greath swath of GetRowHeight going on there - from the drawing-layer =) weird, if we are hitting GetRowHeight that means that the ScPositionHelper cache is not doing its job
caolanm commented 9 months ago

perf-15753

This weeks. No crashes, so 1h. ScTabViewShell::ExecuteTable has some non-async dialogs still

grandinj commented 9 months ago

The calloc in the graph is now at 5% and is unnecessary - no need to memset() there, the data will be completely override during the delta generation process.

mmeeks commented 9 months ago

If we can safely rid ourselves of the calloc that's great of course =)

caolanm commented 9 months ago

perf-10543

Todays little bespoke test, I wonder about ExtendBackColorArea, I only clicked that AutoFilter button two or three times, but it takes up quite a wide part

caolanm commented 9 months ago

Just for the record, but poor utility for perf, a hang on save bug dominates

perf-1543

caolanm commented 9 months ago

Today's "share" deployment profile

perf-71519

caolanm commented 9 months ago

Not sure if https://gerrit.libreoffice.org/c/core/+/160784 is a legitimate optimization for the ScColContainer::resize seen from ScDocument::GetDocStat

caolanm commented 8 months ago

Todays profile as far as crash

perf-170021

caolanm commented 8 months ago

After: perf-172885

Crash in 1st profile can be seen as chart::ChartModel::lockController

caolanm commented 8 months ago

hotspot

hotspot2

FWIW SalLayoutGlyphsCache::self has 20000 as the max size of this std::list. While I would expect this to be c++11 std::list I wonder if because these binaries are built with "Red Hat Developer Toolset 10" given /opt/rh/devtoolset-10 in the binaries if they end up built in some C++03 compatible ABI mode.

https://gerrit.libreoffice.org/c/core/+/161105 for that thought

mmeeks commented 8 months ago

Wow, std::list is an horribly efficient pointer-chasing nightmare surely that shouldn't be used on modern machines =) @grandinj have we looked std::vector there or something else more efficient; I assume it was never intended for 20k items (?) or our access pattern. Seems rather likely that we could have a monotonic 'generation' id on each item, and a high/low watermark for scanning & pruning and removing the oldest things (?) anyhow a fertile place at 25% of the cost of calc to optimize I think =)

caolanm commented 8 months ago

perf-441885

Probably not particularly useful as only the tail end of session with 3 participants, but I'll attach anyway.

mmeeks commented 8 months ago

Well, half of the time in the DeltaGenerator again is perhaps a good sign that nothing more silly is going on (or we get way too many invalidations ;-) - although its really quite surprising the performance of that compared to the (much more complex) rendering. I suspect we should track invalidations vs. tiles and copy across unmodified horizontal lines wherever we can to improve life here without RLE'ing them again =)

caolanm commented 7 months ago

Todays calc profile

perf-680008

grandinj commented 7 months ago

We are spending an awful lot of time in memset when doing ScGridWindow::PaintTile. Looking at the code, I think that this

ScTableInfo aTabInfo(nEndRow + 3);

is likely to be the problem, especially when scrolling down in documents with lots of rows, because it will an array of RowInfo objects of size nEndRow + 3.

We can probably split that into two structures - one set of RowInfo objects for the rows before the tile starts, and one set of rows for the tile itself. For the rows before the tile starts, we likely need much less info.

caolanm commented 7 months ago

ah, makes more sense than my vague thought about the possible RTL bitmap swap at the end of ScGridWindow::PaintTile

mmeeks commented 7 months ago

Wow - that does look silly, if nEndRow is one million with a large sheet area (nTiledRenderingAreaEndRow) - or a single bad cell at the bottom, that's quite an expense if you're looking at the bottom of the sheet; Surely we only really need row information close to hand for the range of cells we are rendering for, and not this =)

caolanm commented 7 months ago

perf-128979

Todays calc which effectively hung, with a different spreadsheet than we typically use

caolanm commented 7 months ago

perf-107896

This weeks calc session including crash

mmeeks commented 7 months ago

Interesting protocol thrashing here: https://github.com/CollaboraOnline/online/issues/8159 from some of this session.

Tex2002ans commented 7 months ago

For ease of finding later, I'll add a link to these Calc flamegraphs too:

and some Calc/performance notes too:

caolanm commented 7 months ago

perf-183388

first part of the calc session