bmx-ng / bcc

A next-generation bcc parser for BlitzMax
zlib License

[Minor] Avoid "o->clas->func" indirection for function calls #579

Open GWRon opened 2 years ago

This weekend I made a simple test file measuring how long it takes to call some functions.

GCC (or Clang) can automatically optimize function calls (e.g. by inlining them). However, GCC currently does not optimize calls to BlitzMax functions defined inside types:

SuperStrict
Framework Brl.StandardIO

Type TTest
    Method CallInternal:Long(times:Int)
        Local result:Long
        For Local i:Int = 0 Until times
            result :+ DoIt(i)
        Next
        Return result
    End Method

    Method CallExternal:Long(times:Int)
        Local result:Long
        For Local i:Int = 0 Until times
            result :+ .DoIt(i)
        Next
        Return result
    End Method

    Function DoIt:Int(X:Int)
        Return X + X
    End Function
End Type

Function DoIt:Int(X:Int)
    Return X + X
End Function

Global t:Int
Global result:Long
Global iterations:Int = 1000000000

'direct
t = MilliSecs()
result = 0
For Local i:Int = 0 Until iterations
    result :+ (i*i)
Next
Print "direct: " + (MilliSecs() - t) + "ms."

'external
t = MilliSecs()
result = 0
For Local i:Int = 0 Until iterations
    result :+ DoIt(i)
Next
Print "external: " + (MilliSecs() - t) + "ms."

'type internal
t = MilliSecs()
result = New TTest.CallInternal(iterations)
Print "type internal: " + (MilliSecs() - t) + "ms."

'type external
t = MilliSecs()
result = New TTest.CallExternal(iterations)
Print "type external: " + (MilliSecs() - t) + "ms."

Results (just for reference):

--- release build ---
direct: 270ms.
external: 199ms.
type internal: 1323ms.
type external: 171ms.

--- debug build ---
direct: 31825ms.
external: 63973ms.
type internal: 64727ms.
type external: 65604ms.

So why is there such a big difference between the "type internal" and the "type external" calls? It comes down to how BCC generates the C code. The internal call:

        for(;(bbt_i<bbt_);bbt_i=(bbt_i+1)){
            bbt_result=(bbt_result+((BBLONG)o->clas->f_DoIt_i_i(bbt_i)));
        }

With the external call being:

        for(;(bbt_i<bbt_);bbt_i=(bbt_i+1)){
            bbt_result=(bbt_result+((BBLONG)_m_untitled2_DoIt(bbt_i)));
        }

OK, so why does BCC do that for the internal function? Because DoIt(x:Int) could be overridden in a type extending TTest, so BCC currently goes through this indirection to dispatch to the version belonging to the actual instance (which may be of an extending type). So would everything be fine if we could somehow state that no extended type is involved?

We could explicitly call TTest.DoIt():

    Method CallInternal:Long(times:Int)
        Local result:Long
        For Local i:Int = 0 Until times
            result :+ TTest.DoIt(i)
        Next
        Return result
    End Method

That way we explicitly state that we call TTest's "version" of the function, not that of some potential extension.

Yet BCC generates the very same code for this call:

        for(;(bbt_i<bbt_);bbt_i=(bbt_i+1)){
            bbt_result=(bbt_result+((BBLONG)o->clas->f_DoIt_i_i(bbt_i)));
        }

The important point here: this also affects "function collections", i.e. types used purely as a namespace to scope a set of functions:

Type Time
  Function Millisecs:Int()
    'a MilliSecs implementation that handles the overflow when reaching the max integer value
    Return 1234
  End Function
End Type

If BCC correctly identified calls such as:

print Time.Millisecs()

as being specific enough to no longer need the indirection, then compilers like GCC and Clang could optimize them.

What I am unsure about is how globals or constants inside of e.g. Type Time would behave, i.e. whether a function could still access them once it is called directly rather than through the indirection. I am also unsure whether Final could help: functions (or types) declared Final could let BCC identify what can be called directly instead of indirectly via o->clas->f_....

Any thoughts on it?

PS: Don't get me wrong, this is just a proposal, extracted from thoughts written down on our Discord server.