microsoft / STL

MSVC's implementation of the C++ Standard Library.

Optimize `stacktrace` by avoiding a large allocation #3859

Open AlexGuteniev opened 1 year ago

AlexGuteniev commented 1 year ago

@achabense observed that pre-initializing the internal vector in stacktrace to the maximum possible depth has a noticeable performance impact:

https://github.com/microsoft/STL/blob/2261f7edb760eb3fe0726187c818b796dc7ea798/stl/inc/stacktrace#L142 https://github.com/microsoft/STL/blob/2261f7edb760eb3fe0726187c818b796dc7ea798/stl/inc/stacktrace#L303

The CaptureStackBackTrace API provides no way to determine the required number of frames in advance.

Currently we don't maintain our own array management in stacktrace; we use vector to avoid dealing in one more place with:

What we could do:

  • @StephanTLavavej suggested that we first use a smaller buffer on the stack. If the trace fits, copy the data out of the stack buffer without making the large allocation; only if the smaller buffer overflows, retry with the maximum size. The smaller buffer could hold 32 entries, which is 32*sizeof(void*) bytes.

frederick-vs-ja commented 1 year ago

See also https://github.com/microsoft/STL/pull/3850#discussion_r1255269678. It seems desirable to introduce an internal small_vector-like container.
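The small-buffer-first idea could look roughly like the sketch below. This is only an illustration, not the STL's code: `capture_frames`, `small_capacity`, and `max_capacity` are hypothetical names, the `0xFFFF` limit is an assumed maximum depth, and `capture` is a stand-in callable for a CaptureStackBackTrace-style function that writes at most `capacity` frames and returns how many it wrote.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical constants: 32 entries on the stack, as suggested in the thread,
// and an assumed upper bound on stack depth for the fallback path.
constexpr std::size_t small_capacity = 32;
constexpr std::size_t max_capacity   = 0xFFFF;

// `capture(out, capacity)` is an assumed stand-in for the real capture API:
// it writes up to `capacity` frame addresses to `out` and returns the count.
template <class Capture>
std::vector<void*> capture_frames(Capture capture) {
    void* small[small_capacity]; // 32 * sizeof(void*) bytes of stack

    const std::size_t n = capture(small, small_capacity);
    if (n < small_capacity) {
        // The whole trace fit: allocate exactly n entries and copy them.
        return std::vector<void*>(small, small + n);
    }

    // The small buffer was filled, so the trace may have been truncated.
    // Fall back to the maximum-sized allocation and capture again.
    std::vector<void*> frames(max_capacity);
    frames.resize(capture(frames.data(), max_capacity));
    return frames;
}
```

With this shape, a shallow trace costs one small allocation of exactly the needed size, and only traces of 32 or more frames pay for the large buffer and a second capture. A small_vector-like container would fold the same logic into the storage type itself.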

AlexGuteniev commented 2 weeks ago

A user suggested that we could use a thread_local buffer as temporary storage. Can we?

StephanTLavavej commented 2 weeks ago

On Discord, I mentioned that thread_local drags in TLS, which is problematic for users like Windows (which is why we avoid Magic Statics throughout the STL). Putting that into the import lib would be a recipe for headaches, so I think we need to avoid that.