ianlancetaylor / demangle

C++ symbol name demangler written in Go
BSD 3-Clause "New" or "Revised" License
165 stars 24 forks source link

perf: Reduce allocations #20

Open brancz opened 11 months ago

brancz commented 11 months ago

We have some code that uses this library, and we noticed a fairly significant amount of allocations being done (~30% overall and up to 70% in C++ demangling). Would you be up for adding the ability to allow reusing objects and slices? I do think it would require some API changes, or at the very least some new APIs.

Here's profiling data with the stacks up until the usage of this library anonymized: https://pprof.me/892d0e0/

ianlancetaylor commented 11 months ago

The approach taken for C++ demangling is definitely allocation heavy. It converts the name into an AST, and then generates the demangled name using the AST. This is done using a separate allocation for each AST node. So, yes, it adds up. It's also not trivial to fix, because each AST node is different. What is your overall use case?

That said, sure, I don't object to reusing objects and slices. Did you have a specific approach in mind?

brancz commented 11 months ago

What is your overall use case?

We use this library during symbolization in the Parca continuous profiling project. Demangling happens at query time (the idea being that the minimum demangling is used by default but more complex type parameters etc. can be requested on demand). Being in the query path we try to optimize everything as much as possible so query times are low.

Did you have a specific approach in mind?

I haven't looked at it deeper yet other than glancing over the profiling data above. My first thought just went to pooling, but I'd be happy to try any ideas you might have!

ianlancetaylor commented 11 months ago

The problem with pooling with the current design is that the AST is composed of many different kinds of nodes, currently I think 62 nodes. So we would potentially need 62 pools. That said I'm sure some node types are more common than others. It might be interesting to gather some stats on that.

Zxilly commented 6 months ago

An easier implement can be using string intern. For eaxmple, with go4.org/intern, after applying this the same string will only need to be allocated one time.