Open tilkinsc opened 5 years ago
Sorry that I missed this! From what I can tell, these flag setting functions are being inlined. If we build the gameboy
package with the following command:
go build -gcflags -m
We can see messages like:
./state.go:169:6: can inline (*State).setHalfCarryFlag
I like your thinking though and I've also been on the lookout for function call overhead. Thanks for looking into this!
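For anyone following along, here is a minimal sketch of the kind of setter that tends to fit within Go's inlining budget. The `State` struct, the `regF` field, the `setHalfCarryFlag(on bool)` signature and the `addHalfCarry` helper are assumptions made for illustration, not the actual code in state.go. The only hardware fact relied on is that the half-carry flag is bit 5 of the Game Boy F register.

```go
package main

import "fmt"

// halfCarryMask selects bit 5 of the F register (the H flag).
const halfCarryMask uint8 = 1 << 5

// State is a hypothetical, stripped-down CPU state used only for this sketch.
type State struct {
	regF uint8
}

// setHalfCarryFlag sets or clears the H flag. It is a small leaf method,
// which is the shape of function the compiler usually reports as inlinable.
func (s *State) setHalfCarryFlag(on bool) {
	if on {
		s.regF |= halfCarryMask
	} else {
		s.regF &^= halfCarryMask
	}
}

// addHalfCarry reports whether a + b carries out of bit 3.
func addHalfCarry(a, b uint8) bool {
	return (a&0x0F)+(b&0x0F) > 0x0F
}

func main() {
	s := &State{}
	s.setHalfCarryFlag(addHalfCarry(0x0F, 0x01))
	fmt.Printf("F=%02X\n", s.regF) // prints "F=20"
}
```

Building this file with `go build -gcflags -m` should show whether the compiler considers the method inlinable on your toolchain.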
It was just something that came to mind. I fear that you shouldn't use function calls for this and should instead set the flags manually, since the carry flag and the other flags have a definite setting per instruction. The trade-off is perhaps maintainability. I am not sure about the quality of the inlining that goes on, to be honest.
Poor inlining would be literally pasting the function body in place. Good inlining is LTO-ish optimization applied after the inline.
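To make the "do it yourself per instruction" idea concrete, here is a rough sketch for an 8-bit ADD, where the whole F register is computed in one place instead of through four setter calls. The `addA` function and the flag constant names are made up for this example and do not come from either emulator. The flag rules used are the standard ones for ADD: Z if the result is zero, N reset, H on carry out of bit 3, C on carry out of bit 7.

```go
package main

import "fmt"

// Flag bit positions in the Game Boy F register.
const (
	flagZ uint8 = 1 << 7 // zero
	flagN uint8 = 1 << 6 // subtract (always reset by ADD, so unused below)
	flagH uint8 = 1 << 5 // half carry
	flagC uint8 = 1 << 4 // carry
)

// addA is a hypothetical helper: it adds v to a and returns the result
// together with the complete new F register for an 8-bit ADD.
func addA(a, v uint8) (result, f uint8) {
	sum := uint16(a) + uint16(v)
	result = uint8(sum)
	if result == 0 {
		f |= flagZ
	}
	if (a&0x0F)+(v&0x0F) > 0x0F {
		f |= flagH
	}
	if sum > 0xFF {
		f |= flagC
	}
	return result, f
}

func main() {
	a, f := addA(0x3A, 0xC6)
	fmt.Printf("A=%02X F=%02X\n", a, f) // prints "A=00 F=B0"
}
```

The cost is the maintainability trade-off mentioned above: every instruction carries its own copy of the flag logic.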
In my gameboy emulator written in C, I always count on LTO and -O2 to do their jobs. Although, yours is more developed.
One giant advantage of how I built my emulator is that it is modular, to the point that optimization is kept separate from how it works. This is like what PPSSPP does with JIT and non-JIT runtimes. The other is my legendary source viewer. I have the best GB ROM viewer in history, all made by me.
Possibly, I can't speak to the quality of the inlining either. This would be fairly easy to benchmark though, I'll take a look!
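A benchmark along these lines could look like the sketch below. It compares the same setter compiled normally against a copy marked `//go:noinline`, using `testing.Benchmark` so it runs as an ordinary program. The `cpu` type and all function names are hypothetical, and the numbers will vary by machine and Go version, so treat this as a starting point rather than a verdict.

```go
package main

import (
	"fmt"
	"testing"
)

const hMask uint8 = 1 << 5

// cpu is a hypothetical stand-in for the emulator's CPU state.
type cpu struct{ f uint8 }

// setHalfCarry is left to the compiler, which may inline it.
func (c *cpu) setHalfCarry(on bool) {
	if on {
		c.f |= hMask
	} else {
		c.f &^= hMask
	}
}

// setHalfCarryNoInline has the same body but is never inlined.
//
//go:noinline
func (c *cpu) setHalfCarryNoInline(on bool) {
	if on {
		c.f |= hMask
	} else {
		c.f &^= hMask
	}
}

// runInlinable toggles the flag n times through the inlinable setter.
func runInlinable(n int) uint8 {
	c := &cpu{}
	for i := 0; i < n; i++ {
		c.setHalfCarry(i&1 == 0)
	}
	return c.f
}

// runNoInline does the same work through the non-inlined setter.
func runNoInline(n int) uint8 {
	c := &cpu{}
	for i := 0; i < n; i++ {
		c.setHalfCarryNoInline(i&1 == 0)
	}
	return c.f
}

// sink keeps the results alive so the loops are not optimized away entirely.
var sink uint8

func main() {
	inl := testing.Benchmark(func(b *testing.B) { sink = runInlinable(b.N) })
	non := testing.Benchmark(func(b *testing.B) { sink = runNoInline(b.N) })
	fmt.Println("inlinable:", inl)
	fmt.Println("noinline: ", non)
}
```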
I'm curious about this modular design! Is your code publicly available yet? I'd love to take a look!
No the code isn't publicly available nor is it finished.
The modular design just allows me to switch between code parsing targets. There is room for a JIT, but I use my stupid optimizer. When I said it was a great advantage, I meant it is like software rendering versus hardware rendering. Software rendering will get it right every time, but hardware rendering may need some hacks to get around differences.
I'm not sure if I entirely understand. Are you parsing the Game Boy instructions into an intermediate representation that can be interpreted by a JIT or non-JIT run time?
Yes. Due to the scheme, there should be no problem passing instructions through JIT processing. However, I can swap them out at leisure. I haven't implemented a JIT, just some optimizations I threw on top of it, such as cached functions that always return the same values. They will continue to do so until some associated memory has changed, at which point I invalidate the cache. The way I designed it, I don't actually check for memory changes each frame; rather, the cache gets popped on set. Thank function pointers.
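One way to read the "cache gets popped on set" idea is sketched below in Go, with function values standing in for C function pointers. This is only a guess at the general shape, since the emulator described here is not public; `cachedOp`, `memory`, `watch`, `set` and `eval` are all invented names, and the `runs` counter exists only to show when a recompute happens.

```go
package main

import "fmt"

// cachedOp is a hypothetical cached computation over emulator memory.
type cachedOp struct {
	compute func(data []uint8) uint8 // the function value that does the real work
	value   uint8                    // last computed result
	valid   bool                     // false until computed, or after invalidation
	runs    int                      // how many times compute actually ran
}

// memory holds the address space plus the ops that depend on each address.
type memory struct {
	data     []uint8
	watchers map[uint16][]*cachedOp
}

func newMemory() *memory {
	return &memory{
		data:     make([]uint8, 0x10000),
		watchers: make(map[uint16][]*cachedOp),
	}
}

// watch registers op as depending on addr.
func (m *memory) watch(addr uint16, op *cachedOp) {
	m.watchers[addr] = append(m.watchers[addr], op)
}

// set writes a byte and invalidates dependent caches at write time,
// so nothing has to scan memory for changes every frame.
func (m *memory) set(addr uint16, v uint8) {
	m.data[addr] = v
	for _, op := range m.watchers[addr] {
		op.valid = false
	}
}

// eval returns the cached value, recomputing only if it was invalidated.
func (m *memory) eval(op *cachedOp) uint8 {
	if !op.valid {
		op.value = op.compute(m.data)
		op.valid = true
		op.runs++
	}
	return op.value
}

func main() {
	m := newMemory()
	op := &cachedOp{compute: func(d []uint8) uint8 { return d[0xC000] + 1 }}
	m.watch(0xC000, op)
	m.eval(op)
	m.eval(op) // served from cache, compute does not run again
	m.set(0xC000, 41)
	fmt.Println(m.eval(op), op.runs) // prints "42 2"
}
```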
Does Go inline function calls for things like the 'setHalfCarryFlag' functions?
I wonder if an optimization could come from having a leaf function for each.
Relevant:
https://lemire.me/blog/2017/09/05/go-does-not-inline-functions-when-it-should/
https://www.reddit.com/r/golang/comments/6ypwui/go_does_not_inline_functions_when_it_should/
https://groups.google.com/forum/#!topic/golang-nuts/V_xI29FGDZM
As per the last one, the cause might be obvious: branching inside a function written in a one-switch-handles-all style may keep it from being inlined. As the lemire.me blog said, you can easily double your speed by manually inlining some things that aren't inlined.