UnitPerf CSV generation

mmeeks commented 5 months ago

We have a rudimentary performance unit test in tests/UnitPerf.cpp

It generates some numbers - a histogram and so on - which are pretty;

But we want to start capturing and logging that data in a form we can easily re-use, consume and chart - and/or at least monitor.

If you make 'testPerf' fail - by crashing it with 'assert(false);' or some such - then the Makefile will kindly remind you of how to run just this one test, which is what you want.

Then it would be ideal to have a set of CSV files; prolly we should do this in the normal unit tests anyway for good measure. In each case we should use the git hash as the primary key / first item.
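A minimal sketch of what "git hash as the first item" could look like when building a CSV row. The helper names `csvEscape` and `makeCsvRow` are hypothetical and not taken from UnitPerf.cpp, and the hash is assumed to be handed in by the caller (for example from `git rev-parse HEAD` at build time).

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Quote a field only when it contains a comma, quote or line break,
// doubling any embedded quotes (the usual CSV convention).
std::string csvEscape(const std::string& field)
{
    if (field.find_first_of(",\"\n\r") == std::string::npos)
        return field;
    std::string out = "\"";
    for (char c : field)
    {
        if (c == '"')
            out += '"'; // double the embedded quote
        out += c;
    }
    out += '"';
    return out;
}

// Build one CSV row whose first column is the git hash (the primary key).
// Hypothetical helper: the real test may assemble its rows differently.
std::string makeCsvRow(const std::string& gitHash, const std::vector<std::string>& values)
{
    assert(!gitHash.empty() && "the git hash is the primary key and must be present");
    std::ostringstream row;
    row << csvEscape(gitHash);
    for (const std::string& value : values)
        row << ',' << csvEscape(value);
    return row.str();
}
```

Keeping the hash in the first column means rows from different runs can be appended to one file and later grouped or joined per commit.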

We should prolly split CPU vs. Latency vs. Network - and generate 3x separate CSVs.

CPU should have the run-time in it, and as we go forward - more and more accurate CPU metrics - ideally from the libpfm API not the SysStopwatch class. But for now just getting something we can graph is key.
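As a stand-in until libpfm-based metrics exist, run-time and CPU time can be captured with the standard library alone. This is only a sketch under that assumption: `Timing` and `timeWorkload` are hypothetical names, and `std::clock` is a coarse substitute, not the libpfm API or the SysStopwatch class mentioned above.

```cpp
#include <cassert>
#include <chrono>
#include <ctime>
#include <functional>

struct Timing
{
    double wallSeconds; // elapsed real time
    double cpuSeconds;  // processor time consumed by this process
};

// Run the workload once and report both wall-clock and CPU time.
// Hypothetical helper using only the C++ standard library.
Timing timeWorkload(const std::function<void()>& workload)
{
    assert(workload && "a callable workload is required");
    const auto wallStart = std::chrono::steady_clock::now();
    const std::clock_t cpuStart = std::clock();
    workload();
    const std::clock_t cpuEnd = std::clock();
    const auto wallEnd = std::chrono::steady_clock::now();
    return { std::chrono::duration<double>(wallEnd - wallStart).count(),
             static_cast<double>(cpuEnd - cpuStart) / CLOCKS_PER_SEC };
}
```

Both values could then go into the CPU CSV as columns after the git hash, which is enough to start graphing.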

For Latency - we have a histogram we should horizontal-ize into CSV
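One way to "horizontal-ize" a histogram is to give each bucket its own column, so every run becomes a single row. The sketch below assumes fixed-width buckets; the `LatencyHistogram` struct and both functions are hypothetical and do not reflect how UnitPerf.cpp actually stores its histogram.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical histogram: counts[i] is the number of samples whose latency
// falls in [i * bucketWidthMs, (i + 1) * bucketWidthMs).
struct LatencyHistogram
{
    unsigned bucketWidthMs;
    std::vector<unsigned long> counts;
};

// Header row: the git hash column followed by one column per bucket.
std::string latencyCsvHeader(const LatencyHistogram& hist)
{
    std::ostringstream out;
    out << "git_hash";
    for (std::size_t i = 0; i < hist.counts.size(); ++i)
        out << ',' << i * hist.bucketWidthMs << '-' << (i + 1) * hist.bucketWidthMs << "ms";
    return out.str();
}

// Data row: the git hash followed by the count in each bucket, in order.
std::string latencyCsvRow(const std::string& gitHash, const LatencyHistogram& hist)
{
    assert(!gitHash.empty());
    std::ostringstream out;
    out << gitHash;
    for (unsigned long count : hist.counts)
        out << ',' << count;
    return out.str();
}
```

This only lines up across runs if the bucket width and bucket count stay fixed; otherwise the columns of older rows would no longer match the header.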

And for Network - we should dump incoming & outgoing bandwidth, and then have some defined column headers for each type of thing, and dump the breakdown there so we can see it over time. I expect bandwidth to be the most reliable indicator here - and the others to jitter unhelpfully between runs =)
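A sketch of the Network CSV with total bandwidth first and a fixed, predefined list of breakdown columns after it. `NetworkStats`, the function names and the category names are all hypothetical placeholders; the real breakdown categories would come from whatever the test already counts.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical container for one run's network figures.
struct NetworkStats
{
    unsigned long incomingBytes;
    unsigned long outgoingBytes;
    std::map<std::string, unsigned long> bytesByCategory; // breakdown per message type
};

// Header row built from a fixed category list so columns stay stable over time.
std::string networkCsvHeader(const std::vector<std::string>& categories)
{
    std::ostringstream out;
    out << "git_hash,incoming_bytes,outgoing_bytes";
    for (const std::string& name : categories)
        out << ',' << name << "_bytes";
    return out.str();
}

// Data row: categories missing from this run are written as 0 rather than skipped.
std::string networkCsvRow(const std::string& gitHash, const NetworkStats& stats,
                          const std::vector<std::string>& categories)
{
    assert(!gitHash.empty());
    std::ostringstream out;
    out << gitHash << ',' << stats.incomingBytes << ',' << stats.outgoingBytes;
    for (const std::string& name : categories)
    {
        const auto it = stats.bytesByCategory.find(name);
        out << ',' << (it == stats.bytesByCategory.end() ? 0UL : it->second);
    }
    return out.str();
}
```

Driving both the header and the rows from the same category list is the design choice here: a run that never sends a given message type still produces a value in that column.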

@Minion3665 can help with code pointers I expect.

Thanks !

amkarn258 commented 5 months ago

Hi @mmeeks ,

Please assign this to me if no one else is assigned yet. I would like to contribute

mmeeks commented 5 months ago

Hi Mayank - thanks so much for getting involved! I filed this for an intern - Elliot over the summer - but as long as you commit your code to a branch regularly, no doubt you could work together with him to improve this :-) I don't expect Elliot to start looking at this for another week or so - so - go for it ! =)

mmeeks commented 4 months ago

Elliot's work merged here: https://github.com/CollaboraOnline/online/pull/9373

mmeeks commented 4 months ago

@elliotfreebairn1 so - some other thoughts for expansion:

- how jittery are the numbers for the same commit ? can you do some stats on that & build a nice spreadsheet & attach here ?
- can we record other interactive traces - and/or refresh the traces we have to make them re-playable ? I suspect our existing traces are rather out of date
- can we connect more traces into the performance testing framework ? ideally we'd generate different metrics for each of them. https://perf.libreoffice.org/ has some examples of how that might look - and we could re-use / build on that framework.
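For the jitter question, one simple summary is the mean, the sample standard deviation, and their ratio (coefficient of variation) over repeated runs of the same commit. This is a sketch of that calculation only; `JitterStats` and `computeJitter` are hypothetical names, and the choice of statistics is an assumption rather than something the thread specifies.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

struct JitterStats
{
    double mean;
    double stddev;           // sample standard deviation (n - 1 denominator)
    double coeffOfVariation; // stddev / mean, unitless; 0 when the mean is 0
};

// Summarise how much one metric varies across repeated runs of the same commit.
JitterStats computeJitter(const std::vector<double>& samples)
{
    assert(samples.size() >= 2 && "need at least two runs to estimate spread");
    double sum = 0.0;
    for (double s : samples)
        sum += s;
    const double mean = sum / samples.size();

    double squaredDeviations = 0.0;
    for (double s : samples)
        squaredDeviations += (s - mean) * (s - mean);
    const double stddev = std::sqrt(squaredDeviations / (samples.size() - 1));

    return { mean, stddev, mean != 0.0 ? stddev / mean : 0.0 };
}
```

Because the coefficient of variation is unitless, it allows a rough comparison of how noisy CPU, latency and bandwidth numbers are relative to each other.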

Thanks ! =)

elliotfreebairn1 commented 4 months ago

@mmeeks For the 1st point, do you mean running the same unit tests on this device a fair few times and analysing the variations in the data?

Minion3665 commented 4 months ago

> @mmeeks For the 1st point, do you mean running the same unit tests on this device a fair few times and analysing the variations in the data?

@elliotfreebairn1 correct, I think so - I recall you showing me that the results were pretty stable, but it would be nice to get a spreadsheet here too :)

elliotfreebairn1 commented 4 months ago

@Minion3665 @mmeeks Here are some graphs I created via matplotlib to show the stability/variation in the data:

[Image: Screenshot from 2024-07-02 15-05-31]

Also here is the repository where the script is located: https://github.com/elliotfreebairn1/CSVPerf

mmeeks commented 4 months ago

I would really prefer us to use our own tool & charting engine, share the spreadsheet & have something that can be interacted with =) Can you provide the results as a spreadsheet, ideally ODS ? Do we have a CPU usage graph ?

I'm also interested in whether we measure peak memory usage; do we have a metric for that ? If not, we need to think about adding one. Doing that at the malloc/free level, while possible, will cause performance angst - so we should prolly get the kernel's take on memory usage from /proc at some well-defined points: startup, and then per-document (which should subtract the startup figure). Measuring memory is tricky - PSS is generally a good metric to parse out, particularly for one process - can you add something there ?
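A rough sketch of reading PSS from /proc on Linux. It assumes a kernel that provides /proc/self/smaps_rollup, whose lines look like `Pss:   1234 kB`; the function names are hypothetical, and the parser takes a stream so it can be exercised without touching /proc.

```cpp
#include <cassert>
#include <fstream>
#include <istream>
#include <sstream>
#include <string>

// Return the value in kB from the first line starting with `key`
// (for example "Pss:"), or -1 if no such line parses cleanly.
long parseProcFieldKb(std::istream& in, const std::string& key)
{
    assert(!key.empty());
    std::string line;
    while (std::getline(in, line))
    {
        if (line.compare(0, key.size(), key) != 0)
            continue; // not the field we are looking for
        std::istringstream fields(line.substr(key.size()));
        long kb = -1;
        fields >> kb;
        return fields ? kb : -1;
    }
    return -1;
}

// Current PSS of this process in kB, or -1 if the file is unavailable.
long currentPssKb()
{
    std::ifstream rollup("/proc/self/smaps_rollup");
    return rollup ? parseProcFieldKb(rollup, "Pss:") : -1;
}
```

Sampling `currentPssKb()` once at startup and again after each document load, then subtracting the startup value, would give the per-document figure described above. Note that this is a point-in-time reading, not a true peak.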

Thanks!

elliotfreebairn1 commented 4 months ago

@mmeeks Yeah, I can definitely get that in a spreadsheet. I've just realised I've missed out the CPU usage measurements inside UnitPerf.cpp, so I will get that graphed as well.

I've had a look around, and there doesn't seem to be a peak memory usage metric, so I will try to implement that as soon as possible and hopefully get it into that spreadsheet. Thanks for giving some pointers :)

elliotfreebairn1 commented 4 months ago

Link to spreadsheet: PeformanceCharted.ods

CollaboraOnline / online

UnitPerf CSV generation #9276