Closed NeilZhy closed 6 months ago
TCMalloc isn't the C++ compiler. TCMalloc is a memory allocator.
@junyer Yes, I know TCMalloc is a memory allocator. Please take a closer look at my question. My question is that after using tcmalloc to run the program, there are still a lot of dynamic memory allocations in my program caused by using std::vector.
Please clarify (a) what a memory allocator should have done, (b) how that could have been done and (c) which memory allocators would have done that. At the risk of stating the obvious, a memory allocator is a library that allocates memory; expecting it to "optimise" how it gets called is like expecting libc to "optimise" how it gets called to, say, fopen(3)
and fclose(3)
FILE
s. There's nothing here beyond wishful thinking.
a) these are not tcmalloc function names. are you sure that program using tcmalloc? b) tcmalloc has limited cpu cache bounds, so it's not true that there won't be any allocations after a few minutes
Indeed, various caches are bounded in various ways. Moreover, it's conceivable that at least some allocations are larger than kMaxSize
.
a) these are not tcmalloc function names. are you sure that program using tcmalloc? b) tcmalloc has limited cpu cache bounds, so it's not true that there won't be any allocations after a few minutes
a. I use tcmalloc to load my program in the following way.
LD_PRELOAD=path/to/gperftools/lib/libtcmalloc.so ./my_test_program
b. I noticed that github:tcmalloc has the MallocExtension::instance()->SetMaxPerCpuCacheSize
interface, but these interfaces are not in github:gperftools. I did some configuration by using the github:gperftools's MallocExtension::instance()->SetNumericProperty( "tcmalloc.max_total_thread_cache_bytes", 100 * 1024 * 1024);
interface, but it had no effect.
Is there some reason why this issue wasn't filed against https://github.com/gperftools/gperftools?
If you aren't using TCMalloc as provided by this project, is there some reason to seek support from this project? gperftools is a separate (older) project with a different (older) implementation of TCMalloc that, in particular, doesn't offer per-CPU caching.
I see you filed https://github.com/gperftools/gperftools/issues/1517, so I think we are done here.
Spamming unrelated projects is bad open source citizenship, IMO, but good luck with that.
It's also worth noting that even the issue title ("tcmalloc cannot optimize dynamic memory allocation caused by std::vector.") is at best a miscommunication and at worst a misrepresentation. It's unclear that there's any problem here beyond your misunderstanding of how to behave towards open source projects and how to respect people's time and attention.
I run my application with LD_PRELOAD=tcmalloc . My program uses the third-party library draco when running.The calling code is similar to the sample in draco.
I captured the dynamic memory allocation flame graph using the bcc tool.Combining perf, objdump, gdb and other tools, we found that a hotspot function with high memory usage is inside draco.The final draco code is as follows:
where data_ is defined as std::vector data。
Further analysis shows that every time the application calls the draco code, the entire draco will go through multiple nested calls to new to complete the definition of member variables. Therefore, every time data is used, it is a newly defined variable, which can actually be understood as a temporary variable.
My question is, why can't tcmalloc optimize away this part of dynamic memory allocation?