markus-wa / demoinfocs-golang

A Counter-Strike 2 & CS:GO demo parser for Go (demoinfo)
https://pkg.go.dev/github.com/markus-wa/demoinfocs-golang/v4/pkg/demoinfocs?tab=doc
MIT License

Request for CS2 parser optimisation suggestions #425

Open Ektaros opened 1 year ago

Ektaros commented 1 year ago

Hello!

I am currently working on optimising the CS2 parser. The CS:GO parser is fast, but the CS2 parser is several times slower, and I could not figure out why myself :( Maybe you have an idea which parts of the code could be responsible for this difference in speed between CS:GO and CS2?

Thanks in advance!

markus-wa commented 1 year ago

Hey @Ektaros - thanks for opening this issue. I have noticed the same thing and would definitely like to improve the performance.

Unfortunately, with the CS2 demo format being quite different from CS:GO's, this is not super easy and will take some time.

It's also unlikely it will ever be as fast as CS:GO, but hopefully we can at least make it a bit faster over time.

Ektaros commented 1 year ago

Hello @markus-wa, I'm continuing to look into the parser performance and I noticed something, please give your comments. In this function you do a check to see if the provided name is valid:

[screenshot: the property-name validation check in question]

This check always seems to pass on my demo pool (although it is not particularly big). Removing it provides a noticeable performance increase. Here are the measurements from my machine:

| File size | Before | After |
|-----------|--------|-------|
| 81 MB     | 2.73s  | 2.34s |
| 101 MB    | 3.75s  | 3.38s |
| 159 MB    | 7.12s  | 6.03s |
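A simple harness along these lines is enough to reproduce this kind of measurement (the demo path is a placeholder; ParseToEnd is the library's standard full-parse entry point):

```go
// Minimal timing harness (sketch) - the demo path is a placeholder.
package main

import (
	"fmt"
	"os"
	"time"

	dem "github.com/markus-wa/demoinfocs-golang/v4/pkg/demoinfocs"
)

func main() {
	f, err := os.Open("demo.dem") // placeholder path
	if err != nil {
		panic(err)
	}
	defer f.Close()

	p := dem.NewParser(f)
	defer p.Close()

	start := time.Now()
	if err := p.ParseToEnd(); err != nil {
		panic(err)
	}
	fmt.Printf("Elapsed: %s\n", time.Since(start))
}
```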

Are you sure this check is still necessary?

markus-wa commented 1 year ago

yeah that check is definitely necessary - but we can probably optimise it by caching this on *Entity.

I'm pretty confident that caching will be comparable in performance to removing the check.
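Something along these lines, maybe (illustrative names only, not the actual sendtables internals):

```go
// Illustrative sketch of per-entity caching - not the actual internals.
package example

type property struct{ /* resolved property data */ }

type entity struct {
	// propCache memoises the result of the expensive property-name
	// validation/lookup, keyed by property name.
	propCache map[string]property
}

func (e *entity) propertyByName(name string) property {
	if p, ok := e.propCache[name]; ok {
		return p // cache hit: the check only runs once per name per entity
	}
	p := e.lookupAndValidate(name) // the existing, expensive check
	if e.propCache == nil {
		e.propCache = make(map[string]property)
	}
	e.propCache[name] = p
	return p
}

// lookupAndValidate stands in for the current validation logic.
func (e *entity) lookupAndValidate(name string) property {
	return property{}
}
```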

nice find!

Ektaros commented 1 year ago

Ok thanks! I will look into this

esbengc commented 1 year ago

If I may, I actually find it rather strange that field paths are managed in a pool. They do not look at all like heavyweight objects to me. I suspect that the overhead from the synchronization built into the pool may be much higher than if we simply created new instances as needed.
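For context, the two strategies being compared look roughly like this (a simplified field path type, not the parser's actual one):

```go
package fieldpath

import "sync"

// Simplified stand-in for the parser's field path type.
type fieldPath struct {
	path [7]int
	last int
}

// Pooled variant: reuses instances via sync.Pool, avoiding most heap
// allocations at the cost of Get/Put synchronization on every use.
var fpPool = sync.Pool{
	New: func() any { return new(fieldPath) },
}

func newFieldPathPooled() *fieldPath {
	fp := fpPool.Get().(*fieldPath)
	*fp = fieldPath{} // reset any reused state
	return fp
}

func releaseFieldPath(fp *fieldPath) { fpPool.Put(fp) }

// Unpooled variant: just allocate and let the GC handle it.
func newFieldPath() *fieldPath { return &fieldPath{} }
```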

markus-wa commented 1 year ago

I think the issue was GC overhead as the field paths are allocated on the heap.

If we add caching this may no longer be necessary.

Probably best to do some A/B testing with pprof and look at both speed and alloc_objects
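e.g. a benchmark like this, run once with the pool and once without it (the demo path is illustrative):

```go
package demoinfocs_test

import (
	"os"
	"testing"

	dem "github.com/markus-wa/demoinfocs-golang/v4/pkg/demoinfocs"
)

// Full-parse benchmark; run against both the pooled and the pool-free
// build and compare time/op and allocs/op.
func BenchmarkParseToEnd(b *testing.B) {
	for i := 0; i < b.N; i++ {
		f, err := os.Open("testdata/demo.dem") // illustrative path
		if err != nil {
			b.Fatal(err)
		}
		p := dem.NewParser(f)
		if err := p.ParseToEnd(); err != nil {
			b.Fatal(err)
		}
		p.Close()
		f.Close()
	}
}
```

Running it with `go test -run '^$' -bench ParseToEnd -benchmem -memprofile mem.out` and then inspecting `go tool pprof -alloc_objects mem.out` gives both the speed numbers and the allocation breakdown for each variant.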

markus-wa commented 1 year ago

Or maybe we can make sure they are allocated on the stack instead

Ektaros commented 1 year ago

Thanks for the suggestion @esbengc

I tried that: removing the pool on the current master provides a ~5-7% increase in speed, but there is ~10-15% more memory usage and allocated objects. Which I think is a fine trade-off.

I do not have any idea how to make sure that the fp objects are allocated on the stack, because they get passed around a lot. Maybe you have a suggestion? @markus-wa

markus-wa commented 1 year ago

yeah that indeed sounds like a good tradeoff.

as for stack alloc we'd need to remove all pointers to them (and then pray to the Go gods) - e.g. by returning the value instead of passing a pointer through the whole call chain.

but I think your suggested changes are good enough as is, and with caching I doubt this would actually make much of a difference. so let's leave it as is unless profiling shows we still have issues in this area.
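for illustration, the difference between the two shapes would look something like this (simplified type, not the real one); `go build -gcflags='-m'` shows the escape-analysis decisions:

```go
// Simplified illustration of the value-vs-pointer trade-off.
package fieldpath

type fieldPath struct {
	path [7]int
	last int
}

// Returning a pointer usually makes the struct escape to the heap
// (go build -gcflags='-m' will typically report the escape).
func newFieldPathPtr() *fieldPath {
	return &fieldPath{last: -1}
}

// Returning by value gives the compiler a chance to keep it on the
// caller's stack, at the cost of copying the struct through the call chain.
func newFieldPathVal() fieldPath {
	return fieldPath{last: -1}
}
```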