Closed mhasel closed 3 weeks ago
As a side note, is this a good candidate for expanding our performance tests to detect potential regressions? That is, we could create a test case with many big aggregate types, all passed by value, and track their runtime behaviour in our dashboard.
Sounds good. This would also allow us to better test future front-end optimizations (e.g. more accurate byte-alignment for memset/memcpy calls, ...)
Aggregate `VAR_INPUT` args to function calls are now generated/passed as pointers and then `memcpy`d into a local variable instead of being passed by value and written with `store`. In order to achieve this, quite a bit of logic is moved from the `expression_generator` to the `pou_generator` - in other words, the caller will now only bitcast an aggregate argument to its pointer (if necessary) and the function will take care of correctly `memset`ing/`memcpy`ing. This results in significantly reduced allocations/IR in some cases, especially when passing member variables of `FUNCTION_BLOCK`/`PROGRAM` structs or when passing a `by-ref` arg on to a `by-val` parameter: where previously the caller had to allocate a local temporary variable and copy the value into it before passing it on to the callee, it is now sufficient to directly pass the pointer.

Using the same example as given in issue #1074, the `llc-14 --time-passes` benchmark improves significantly:

master/`store`:
memcpy:

`Pass execution timing` and `instruction selection and scheduling` improve by a factor of ~40000 and ~300000 respectively.

Resolves https://github.com/PLC-lang/rusty/issues/1074
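
For readers unfamiliar with the calling convention described above, here is a minimal Rust sketch of the idea. It is not the actual generated IR or any code from the `pou_generator`. The names `BigAggregate`, `callee` and `caller` are hypothetical, and `ptr::copy_nonoverlapping` stands in for the `memcpy` the function now emits. The caller hands over only a pointer, and the callee copies the aggregate into its own local, so by-value semantics are preserved without a caller-side temporary.

```rust
use std::mem::MaybeUninit;
use std::ptr;

/// Hypothetical stand-in for a large aggregate passed as `VAR_INPUT`.
#[repr(C)]
#[derive(Clone, Copy)]
struct BigAggregate {
    data: [i32; 1024],
}

/// Callee side: receives only a pointer and copies the aggregate into
/// its own local storage before touching it.
fn callee(arg: *const BigAggregate) -> i32 {
    let mut local = MaybeUninit::<BigAggregate>::uninit();
    // SAFETY: `arg` must point to a valid, initialized `BigAggregate`;
    // `local` is a separate allocation, so the regions cannot overlap.
    let mut local = unsafe {
        ptr::copy_nonoverlapping(arg, local.as_mut_ptr(), 1);
        local.assume_init()
    };
    // Mutating the local copy must not be visible to the caller.
    local.data[0] += 1;
    local.data[0]
}

/// Caller side: no temporary is allocated, the address of the existing
/// value is passed directly. Returns the callee's result together with
/// the caller's (unchanged) first element.
fn caller() -> (i32, i32) {
    let value = BigAggregate { data: [41; 1024] };
    let result = callee(&value);
    (result, value.data[0])
}
```

In this sketch `caller()` returns `(42, 41)`: the callee works on its own copy, while the caller's value stays untouched, as by-value semantics require.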