Closed acmorrow closed 4 years ago
The gperftools tcmalloc is quite widely used in the community; for example, it’s the default in Ceph. It’s packaged in Ubuntu, etc., and had a new release as recently as April 2018, so this is definitely confusing messaging, as it’s already very well established in the community.
Link to Ubuntu package for example: https://packages.ubuntu.com/eoan/libtcmalloc-minimal4
Comments in the linked gperftools issue suggest that the two projects are largely different (with distant relation) and at least not currently going to be interchangeable in various circumstances such as cross platform support etc.
I would suggest that this new tcmalloc either take a new name or at least be “tcmalloc3” or v3 (the gperftools version is currently v2.7), as I think it will otherwise cause quite a bit of community confusion.
+1 about renaming this project. No, don't use tcmalloc3 as this implies this is a newer version of gperftools, which it isn't.
+1 for renaming it. I would prefer naming it tcmalloc-ng :)
There are two projects on Github that are based on Google’s internal TCMalloc: This one and gperftools. Both are fast C/C++ memory allocators designed around a fast path that avoids synchronizing with other threads for most allocations.
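To make the "fast path" idea concrete, here is a minimal sketch of a thread-caching allocator. It is not code from either project: the names `ToyAlloc`, `ToyFree`, `CentralList`, `kBlockSize`, and `kBatch` are hypothetical, and it assumes a single fixed block size. Each thread keeps its own cache, so most calls touch no lock; only refills and overflows go through the shared, mutex-protected list.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <new>
#include <vector>

// Hypothetical toy allocator with one fixed block size. Not TCMalloc's code.
constexpr std::size_t kBlockSize = 64;  // single size class, for brevity
constexpr std::size_t kBatch = 8;       // blocks moved per slow-path call

// Shared central free list. Every access takes the lock (the slow path).
class CentralList {
 public:
  // Appends n blocks to `out`, reusing returned blocks before allocating.
  void Fetch(std::vector<void*>& out, std::size_t n) {
    std::lock_guard<std::mutex> lock(mu_);
    ++slow_path_calls_;
    for (std::size_t i = 0; i < n; ++i) {
      if (free_.empty()) {
        out.push_back(::operator new(kBlockSize));
      } else {
        out.push_back(free_.back());
        free_.pop_back();
      }
    }
  }

  // Moves up to n blocks from the back of `in` to the central list.
  void Return(std::vector<void*>& in, std::size_t n) {
    std::lock_guard<std::mutex> lock(mu_);
    ++slow_path_calls_;
    for (std::size_t i = 0; i < n && !in.empty(); ++i) {
      free_.push_back(in.back());
      in.pop_back();
    }
  }

  // Number of times any thread had to take the lock.
  std::size_t slow_path_calls() {
    std::lock_guard<std::mutex> lock(mu_);
    return slow_path_calls_;
  }

 private:
  std::mutex mu_;
  std::vector<void*> free_;
  std::size_t slow_path_calls_ = 0;
};

inline CentralList& Central() {
  static CentralList central;
  return central;
}

// Per-thread cache: only its owning thread touches it, so no lock is needed.
// Simplification: blocks still cached when a thread exits are leaked.
inline std::vector<void*>& ThreadCache() {
  thread_local std::vector<void*> cache;
  return cache;
}

void* ToyAlloc() {
  std::vector<void*>& cache = ThreadCache();
  if (cache.empty()) Central().Fetch(cache, kBatch);  // slow path: locks
  assert(!cache.empty());
  void* p = cache.back();  // fast path: no synchronization
  cache.pop_back();
  return p;
}

void ToyFree(void* p) {
  std::vector<void*>& cache = ThreadCache();
  cache.push_back(p);  // fast path: no synchronization
  if (cache.size() > 2 * kBatch) Central().Return(cache, kBatch);  // slow path
}
```

In this sketch, the first `ToyAlloc()` on a thread takes the lock once to fetch a batch, and the following calls are served from the thread's own cache. Real allocators add many size classes, memory release, and (in this project) per-CPU rather than per-thread caches.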
Google open-sourced its memory allocator as part of “Google Performance Tools” in 2005. As discussed by Titus Winters in his 2017 CppCon Talk and the “Software Engineering at Google” book, it was easy for us to externalize code, but more difficult to keep it in sync with our internal usage at that time. Subsequently, our internal implementation diverged from the code externally. This project eventually was adopted by the community as “gperftools.”
This repository is Google’s current implementation of TCMalloc, used by ~all of our C++ programs in production. The code is limited to the memory allocator implementation itself. Since “Profiling a Warehouse-Scale Computer” (Kanev 2015), we have invested in improving application productivity via optimizations to the implementation (per-CPU caches, sized delete, fast/slow path improvements, hugepage-aware backend).
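For readers unfamiliar with "sized delete" from the list above: it means the deallocation function receives the size of the object being freed, so an allocator does not have to look the size up from the pointer. The sketch below illustrates the language mechanism with a class-specific `operator delete`; `Widget` and `g_last_deleted_size` are hypothetical names, and this shows only the calling convention, not how this project uses the size internally.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical global used only to observe what the compiler passed in.
std::size_t g_last_deleted_size = 0;

struct Widget {
  double payload[4];

  static void* operator new(std::size_t size) { return ::operator new(size); }

  // Sized form: the compiler supplies the object size at the delete site.
  // An allocator could use `size` to pick the right free list directly.
  static void operator delete(void* p, std::size_t size) {
    assert(size == sizeof(Widget));
    g_last_deleted_size = size;
    ::operator delete(p);
  }
};
```

A plain `delete w;` on a `Widget*` calls the two-argument form with `sizeof(Widget)`. The global equivalent, `::operator delete(void*, std::size_t)`, has been part of the language since C++14, though some compilers need a flag such as `-fsized-deallocation` before they will call it.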
Because this repository reflects our day-to-day usage, we’ve focused on the platforms we regularly use and where we can see extensive testing and optimization.
The configuration on Github mirrors our production defaults, with two notable exceptions:
[...] tcmalloc::MallocExtension::ReleaseMemoryToSystem, while others never release memory in favor of better CPU performance. These tradeoffs are discussed in our tuning page.
[...] MallocExtension.
Over time, we have found that configurability carries a maintenance burden. While a knob can provide immediate flexibility, the increased complexity can cause subtle problems for more rarely used combinations.
Like Abseil, we do not attempt to provide ABI stability. Providing a stable ABI could require compromising performance or adding otherwise unneeded complexity to maintain stability.
In addition to a memory allocator, the gperftools project contains a number of other tools:
[...] malloc_extension.h.
[...] perf tool is decreasing our internal need for signal-based profiling. Additionally, with restartable sequences, signals interrupt the fastpath, leading to skew between the observed instruction pointer and where we actually spend CPU time.
[...] pprof tool: This project is now developed in Go and is available on Github.
alk@, the current maintainer of gperftools, plans to continue to work on that project. gperftools covers use cases this project does not support (stable ABIs, various platforms/OS's, etc.).
I'm really excited to see this new version of tcmalloc becoming available. In particular, the per-cpu support has long been an idea of interest. However, it is currently quite unclear how this new project compares with the existing gperftools/gperftools project. I think it would be helpful if this project contained some documentation that provided a direct comparison with the other (soon to be legacy?) project. Some roadmap and future directions content would be welcome as well. In no particular order:
The old gperftools supported a wider array of platforms. On the OS side, Windows and macOS, at least to some degree. This project looks to currently be Linux only. Is support for those other operating systems planned? Explicitly out of scope? Similar questions regarding CPU. I note that ppc (presumably ppc64le?) is supported. But s390x (not surprising) and arm64 (quite surprising?) are absent. Are they on the horizon? Is work from the community to support those other platforms welcome?
What exactly has changed regarding support for CPU and heap profiling? It looks like they are more or less gone? Which is fine, at least for my use, I'd just like to know for sure either way.
Similar question regarding debugallocation. It seems that some of the classic debug allocator features that were part of gperftools may no longer be included. But at least use-after-free detection seems like it is still present, per some references to 0xcd? It probably makes sense to de-emphasize these sorts of features in a world with ASAN, but some more information here would be welcome. And are there new interesting debugging features added?
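For context on the 0xcd question, here is a minimal sketch of the general poison-on-free technique that debug allocators commonly use. It is not taken from either project: `PoisoningHeap` and its methods are hypothetical, and it assumes the caller passes the block size to `Free`. Freed blocks are filled with 0xCD and held in quarantine, so a later write through a stale pointer shows up as a broken pattern.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <vector>

constexpr unsigned char kPoison = 0xCD;  // fill byte for freed memory

// Hypothetical debug heap: poisons freed blocks and keeps them quarantined.
class PoisoningHeap {
 public:
  PoisoningHeap() = default;
  PoisoningHeap(const PoisoningHeap&) = delete;
  PoisoningHeap& operator=(const PoisoningHeap&) = delete;

  // Quarantined blocks go back to the system only when the heap is destroyed.
  ~PoisoningHeap() {
    for (const Block& b : quarantine_) std::free(b.ptr);
  }

  void* Alloc(std::size_t size) { return std::malloc(size); }

  // Instead of releasing the block, overwrite it with the poison byte and
  // hold on to it so later writes through stale pointers can be detected.
  void Free(void* p, std::size_t size) {
    assert(p != nullptr);
    std::memset(p, kPoison, size);
    quarantine_.push_back({static_cast<unsigned char*>(p), size});
  }

  // Returns true if any quarantined block no longer holds the full poison
  // pattern, meaning something wrote to it after it was freed.
  bool DetectWriteAfterFree() const {
    for (const Block& b : quarantine_) {
      for (std::size_t i = 0; i < b.size; ++i) {
        if (b.ptr[i] != kPoison) return true;
      }
    }
    return false;
  }

 private:
  struct Block {
    unsigned char* ptr;
    std::size_t size;
  };
  std::vector<Block> quarantine_;
};
```

Stale reads are caught less directly: they return 0xCD bytes, which tend to produce recognizable garbage values. Tools such as ASAN detect both reads and writes at the point of access, which is one reason this style of checking matters less than it used to.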
What previously offered tunings or configurations have been removed or added?
What is the degree of stability of the code at this point? Should projects that have longstanding integrations with gperftools be looking to switch now? If not, what are the gating changes?
Is there a release/tag/branch strategy? ABI stability goals? What should happen with packaging, especially for systems where the OS provides a "tcmalloc" package that derives from the old gperftools project?
What is the plan regarding synchronization between this project and the internal Google tcmalloc implementation? How open is the project to community contributions? Will those contributions be synced back to Google, or will this eventually become another fork, as somewhat happened to gperftools?
I know that is a lot of questions, but I'm hopeful that putting some of the answers down in writing will help everyone who currently uses gperftools in their projects to understand how this new project should be approached.
I'd also like to thank you in advance for all the work that I am certain went into getting this new version of tcmalloc out into the world. Please don't take my long list of questions and concerns as anything other than deriving from a keen interest in the success of this new project.