maxmods / brl.mod

BlitzMax's BRL modules, patched and updated.
4 stars 3 forks source link

Reflection: use TLowerString for FindMethod/Find*** #17

Closed GWRon closed 6 years ago

GWRon commented 6 years ago

Just a basic idea - as I try to improve the CPU consumption when having some heavy LUA-BlitzMax interaction:

Currently reflection.mod functions like FindMethod() call string.ToLower() on each comparison of "searched" vs "provided":

    Method FindMethod:TMethod( name$ )
        name=name.ToLower()
        For Local t:TMethod=EachIn _methods
            If t.Name().ToLower()=name Return t
        Next
        If _super Return _super.FindMethod( name )
    End Method

We could either use TLowerString there - or store two fields: the original _name:string and a _lowerStringName:string to at least save computation time in this scenario.

If we used Brucey's TLowerString we somehow need to make sure it is made available to all - without importing it multiple times. How to add support for TLowerString the most "compatible" way?

woollybah commented 6 years ago

Having a "lower" field would probably be the preferred way, but wouldn't help the initial name lowering per call. I don't think using another type here is a good idea.

GWRon commented 6 years ago

I modified my local "extendedReflection" class to use the lowerString - and "used times" by my lua class reduced by up to 40-50%. But of course this is mostly for function calls just returning simple values. The processing time of more complex functions wont get auto-optimized that way :-)

Will test it a bit and create a pull request for both modules (ng and vanilla). Also replacing name().compare(prev.name()) = 0 with the basic _nameLowerCase = prev._nameLowerCase to save another object-casting-function.

GWRon commented 6 years ago

Doing some "simple Lua stuff"-tests (calling 1million times a lua function which calls a BlitzMax function - and calling a lua function whichcalls 1 million times a blitzmax function) lead to 10% execution time reduction. This sounds way more realistic than execution time reductions of 40%.

But the exposed objects are way simpler - and especially the checks to find an entry in a list will be "harder" once I use an more complex object. Will test that later on.

GWRon commented 6 years ago

Ok, adding 25 methods before the method of interest almost doubled execution time. So the "findMethod" approach (TList) is kind of the culprit in that situation. I cleaned out "other stuff" (eg. calling things via Lua) and measured the pure "reflection based invokation".

Just check runtimes for this:

SuperStrict
Framework BRL.StandardIO
'Import "../../external/reflectionExtended/reflection.bmx"
Import Brl.Reflection

Type TMyObject
    Field prop:Int = 10
    Field arr:Int[] = [1,2,3]

rem
    Method OtherSimpleGetA:int();return 1;End Method
    Method OtherSimpleGetB:int();return 1;End Method
    Method OtherSimpleGetC:int();return 1;End Method
    Method OtherSimpleGetD:int();return 1;End Method
    Method OtherSimpleGetE:int();return 1;End Method
    Method OtherSimpleGetF:int();return 1;End Method
    Method OtherSimpleGetG:int();return 1;End Method
    Method OtherSimpleGetH:int();return 1;End Method
    Method OtherSimpleGetI:int();return 1;End Method
    Method OtherSimpleGetJ:int();return 1;End Method
    Method OtherSimpleGetK:int();return 1;End Method
    Method OtherSimpleGetL:int();return 1;End Method
    Method OtherSimpleGetM:int();return 1;End Method
    Method OtherSimpleGetN:int();return 1;End Method
    Method OtherSimpleGetO:int();return 1;End Method
    Method OtherSimpleGetP:int();return 1;End Method
    Method OtherSimpleGetQ:int();return 1;End Method
    Method OtherSimpleGetR:int();return 1;End Method
    Method OtherSimpleGetS:int();return 1;End Method
    Method OtherSimpleGetT:int();return 1;End Method
    Method OtherSimpleGetU:int();return 1;End Method
    Method OtherSimpleGetV:int();return 1;End Method
    Method OtherSimpleGetW:int();return 1;End Method
    Method OtherSimpleGetX:int();return 1;End Method
    Method OtherSimpleGetY:int();return 1;End Method
    Method OtherSimpleGetZ:int();return 1;End Method
endrem

    Method SimpleGet:int()
        return 1
    End Method

    Method SimpleGetA:int();return 1;End Method
    Method SimpleGetB:int();return 1;End Method
    Method SimpleGetC:int();return 1;End Method
    Method SimpleGetD:int();return 1;End Method
    Method SimpleGetE:int();return 1;End Method
    Method SimpleGetF:int();return 1;End Method
    Method SimpleGetG:int();return 1;End Method
    Method SimpleGetH:int();return 1;End Method
    Method SimpleGetI:int();return 1;End Method
    Method SimpleGetJ:int();return 1;End Method
    Method SimpleGetK:int();return 1;End Method
    Method SimpleGetL:int();return 1;End Method
    Method SimpleGetM:int();return 1;End Method
    Method SimpleGetN:int();return 1;End Method
    Method SimpleGetO:int();return 1;End Method
    Method SimpleGetP:int();return 1;End Method
    Method SimpleGetQ:int();return 1;End Method
    Method SimpleGetR:int();return 1;End Method
    Method SimpleGetS:int();return 1;End Method
    Method SimpleGetT:int();return 1;End Method
    Method SimpleGetU:int();return 1;End Method
    Method SimpleGetV:int();return 1;End Method
    Method SimpleGetW:int();return 1;End Method
    Method SimpleGetX:int();return 1;End Method
    Method SimpleGetY:int();return 1;End Method
    Method SimpleGetZ:int();return 1;End Method
End Type

Global MyObject:TMyObject = New TMyObject
Global functionToFind:string = "SimpleGet"
local startTime1:int = Millisecs()
For Local i:Int = 0 To 1000000
    Local typeId:TTypeId = TTypeId.ForObject(MyObject)
    Local callable:TMember = typeId.FindMethod(functionToFind)
    If Not callable Then callable = typeId.FindFunction(functionToFind)
    If callable
        if TFunction(callable)
            TFunction(callable).Invoke(MyObject)
        Elseif TMethod(callable)
            TMethod(callable).Invoke(MyObject)
        endif
    endif
Next
print "1.000.000 calls to find method/function took: " +(Millisecs() - startTime1)+"ms."

end

I got ~700ms if I commented out the methods before and ~3600ms when having them enabled. So it might be a good thing to check out faster "findMethod/Function/Constant" approaches.

Now using the _nameLowerCase approach it reduces this to 520ms (disabled methods) or 1100ms (enabled methods).

GWRon commented 6 years ago

~Minor question: There is "findMethod" only utilizing a simple ident. How do handle overloaded methods there?~

Just saw Brucey's issue at bmx-ng/brl.mod/issues/50 - which already tackles that question

woollybah commented 6 years ago

Yes...

1.000.000 calls to find method/function took: 12242ms.
TTypeId.ForObject   : 39 ms
typeId.FindMethod   : 12014 ms
typeId.FindFunction : 24 ms
Invoke              : 155 ms

On the scale of things, it could maybe do with some work...

Adding a NameLower() method to Member, and calling that where appropriate...

1.000.000 calls to find method/function took: 2549ms.
TTypeId.ForObject   : 76 ms
typeId.FindMethod   : 1948 ms
typeId.FindFunction : 28 ms
Invoke              : 470 ms

which appears to make a reasonable difference.

woollybah commented 6 years ago

And yes, a TList is not very efficient when the lists are long. But... doing string compares over Map is not hugely efficient either.

In an ideal world, strings might have a hash that could be used when doing compares - which ends up more like comparing two ints - which is faster. Obviously it adds some bytes to each String, but makes certain uses of Strings much, much more efficient...

woollybah commented 6 years ago

Replacing _methods:TList with _methods:TStringMap, we now get :

1.000.000 calls to find method/function took: 1454ms.
TTypeId.ForObject   : 39 ms
typeId.FindMethod   : 104 ms
typeId.FindFunction : 22 ms
Invoke(method)      : 1273 ms

(all tests on 32-bit NG Win32)

So, from 12000ms down to 104ms appears to be a fairly significant performance improvement.

But it's all data dependent either way - whether your method is at the start of a list... or at the mid-point of a map... or at the end of a list, or at the far node of a map - you'll still get a lot of variation with your testing.

For example, in the above test, the "special" method is probably the first one it finds in the map. If you take out the block of methods that precede it alphabetically, times go up to over 1000ms...

It's easy to tweak your data to make your code look good ;-)

GWRon commented 6 years ago

Interesting stuff...invoke increased a big bit in your latest benchmark.

GC kicking in?

@ stringmap Hmm couldn't there be some StringHashMap? ...

woollybah commented 6 years ago

This test is a bit more thorough, running over all the variations... (run from a script)

SuperStrict
Framework BRL.StandardIO
'Import "../../external/reflectionExtended/reflection.bmx"
Import Brl.Reflection

Local meth:Int

If AppArgs.length > 1 Then
    meth = Int(AppArgs[1])
End If

Type TMyObject
    Field prop:Int = 10
    Field arr:Int[] = [1,2,3]
    Field count:Int

    Method SimpleGet0:Int()
        count :+ 1
        Return count
    End Method

    Method SimpleGet1:Int()
        count :+ 1
        Return count
    End Method

    Method SimpleGet2:Int()
        count :+ 1
        Return count
    End Method

    Method SimpleGet3:Int()
        count :+ 1
        Return count
    End Method

    Method SimpleGet4:Int()
        count :+ 1
        Return count
    End Method

    Method SimpleGet5:Int()
        count :+ 1
        Return count
    End Method

    Method SimpleGet6:Int()
        count :+ 1
        Return count
    End Method

    Method SimpleGet7:Int()
        count :+ 1
        Return count
    End Method

    Method SimpleGet8:Int()
        count :+ 1
        Return count
    End Method

    Method SimpleGet9:Int()
        count :+ 1
        Return count
    End Method

    Method SimpleGet10:Int()
        count :+ 1
        Return count
    End Method

    Method SimpleGet11:Int()
        count :+ 1
        Return count
    End Method

    Method SimpleGet12:Int()
        count :+ 1
        Return count
    End Method

    Method SimpleGet13:Int()
        count :+ 1
        Return count
    End Method

    Method SimpleGet14:Int()
        count :+ 1
        Return count
    End Method

    Method SimpleGet15:Int()
        count :+ 1
        Return count
    End Method

    Method SimpleGet16:Int()
        count :+ 1
        Return count
    End Method

    Method SimpleGet17:Int()
        count :+ 1
        Return count
    End Method

    Method SimpleGet18:Int()
        count :+ 1
        Return count
    End Method

    Method SimpleGet19:Int()
        count :+ 1
        Return count
    End Method

    Method SimpleGet20:Int()
        count :+ 1
        Return count
    End Method

End Type

Global MyObject:TMyObject = New TMyObject
Global functionToFind:String = "SimpleGet" + meth
Local loops:Int = 5
Local total:Int

For Local n:Int = 0 Until loops

Local startTime1:Int = MilliSecs()
For Local i:Int = 0 To 1000000
    Local typeId:TTypeId = TTypeId.ForObject(MyObject)
    Local callable:TMember = typeId.FindMethod(functionToFind)
    If Not callable Then callable = typeId.FindFunction(functionToFind)
    If callable
        If TFunction(callable)
            TFunction(callable).Invoke(MyObject)
        ElseIf TMethod(callable)
            TMethod(callable).Invoke(MyObject)
        EndIf
    EndIf
Next
    total :+ (MilliSecs() - startTime1)
Next

Print "1.000.000 calls to find '" + functionToFind + "' took: " +(total / loops)+"ms (avg over " + loops + " iterations)."

End

Average timings for 5 iterations of a million calls :

Using TList _methods :

1.000.000 calls to find 'SimpleGet0' took: 1845ms (avg over 5 iterations).
1.000.000 calls to find 'SimpleGet1' took: 1633ms (avg over 5 iterations).
1.000.000 calls to find 'SimpleGet2' took: 1885ms (avg over 5 iterations).
1.000.000 calls to find 'SimpleGet3' took: 1641ms (avg over 5 iterations).
1.000.000 calls to find 'SimpleGet4' took: 1664ms (avg over 5 iterations).
1.000.000 calls to find 'SimpleGet5' took: 1722ms (avg over 5 iterations).
1.000.000 calls to find 'SimpleGet6' took: 1743ms (avg over 5 iterations).
1.000.000 calls to find 'SimpleGet7' took: 1790ms (avg over 5 iterations).
1.000.000 calls to find 'SimpleGet8' took: 1989ms (avg over 5 iterations).
1.000.000 calls to find 'SimpleGet9' took: 1692ms (avg over 5 iterations).
1.000.000 calls to find 'SimpleGet10' took: 2044ms (avg over 5 iterations).
1.000.000 calls to find 'SimpleGet11' took: 1837ms (avg over 5 iterations).
1.000.000 calls to find 'SimpleGet12' took: 2071ms (avg over 5 iterations).
1.000.000 calls to find 'SimpleGet13' took: 1936ms (avg over 5 iterations).
1.000.000 calls to find 'SimpleGet14' took: 1998ms (avg over 5 iterations).
1.000.000 calls to find 'SimpleGet15' took: 1788ms (avg over 5 iterations).
1.000.000 calls to find 'SimpleGet16' took: 2015ms (avg over 5 iterations).
1.000.000 calls to find 'SimpleGet17' took: 2016ms (avg over 5 iterations).
1.000.000 calls to find 'SimpleGet18' took: 2331ms (avg over 5 iterations).
1.000.000 calls to find 'SimpleGet19' took: 2081ms (avg over 5 iterations).
1.000.000 calls to find 'SimpleGet20' took: 2260ms (avg over 5 iterations).

As you can see, there's a general trend towards higher times as the search has to test more names.

Using TStringMap _methods :

1.000.000 calls to find 'SimpleGet0' took: 1315ms (avg over 5 iterations).
1.000.000 calls to find 'SimpleGet1' took: 1404ms (avg over 5 iterations).
1.000.000 calls to find 'SimpleGet2' took: 1298ms (avg over 5 iterations).
1.000.000 calls to find 'SimpleGet3' took: 1355ms (avg over 5 iterations).
1.000.000 calls to find 'SimpleGet4' took: 1324ms (avg over 5 iterations).
1.000.000 calls to find 'SimpleGet5' took: 1333ms (avg over 5 iterations).
1.000.000 calls to find 'SimpleGet6' took: 1331ms (avg over 5 iterations).
1.000.000 calls to find 'SimpleGet7' took: 1305ms (avg over 5 iterations).
1.000.000 calls to find 'SimpleGet8' took: 1337ms (avg over 5 iterations).
1.000.000 calls to find 'SimpleGet9' took: 1302ms (avg over 5 iterations).
1.000.000 calls to find 'SimpleGet10' took: 1335ms (avg over 5 iterations).
1.000.000 calls to find 'SimpleGet11' took: 1388ms (avg over 5 iterations).
1.000.000 calls to find 'SimpleGet12' took: 1304ms (avg over 5 iterations).
1.000.000 calls to find 'SimpleGet13' took: 1336ms (avg over 5 iterations).
1.000.000 calls to find 'SimpleGet14' took: 1310ms (avg over 5 iterations).
1.000.000 calls to find 'SimpleGet15' took: 1336ms (avg over 5 iterations).
1.000.000 calls to find 'SimpleGet16' took: 1322ms (avg over 5 iterations).
1.000.000 calls to find 'SimpleGet17' took: 1345ms (avg over 5 iterations).
1.000.000 calls to find 'SimpleGet18' took: 1310ms (avg over 5 iterations).
1.000.000 calls to find 'SimpleGet19' took: 1341ms (avg over 5 iterations).
1.000.000 calls to find 'SimpleGet20' took: 1377ms (avg over 5 iterations).

Now things look much more stable and consistent throughout, as you'd expect from a balanced map.

The general slowness of TList may be down to the implementation itself.

GWRon commented 6 years ago

Just saw your edits (only read the mails - which are of course unedited :-)). Always thought searching TMap is faster as it uses a red/black balanced tree ..

OK. There came your longer post - ignore above.

Think nonetheless that most speedup is done by getting rid of the "string.toLower()" in most cases.

@ TStringHashMap Would this help ? And - is it difficult to implement? Sounds at least as something which my TRegistry would like to use - as I use a "stringKey"=>"value" pair there.

Edit: @ slowness of TList TMap was always faster, dunno if the memory footprint is much higher (for each TTypeID.ForObject(...) the maps/lists are created and filled) or can be ignored. TMap might be better suited for the needed extension when it comes to "overload" support.