Could you please open a follow-up PR that expands either our "build from source" documentation or our "how to debug/profile" documentation to explain how to switch between "release" and "debugoptimized" builds using a pip command-line flag, and to explain the expected impact (in terms of binary size and the ability to use a debugger or profiler on native code)?
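To make the binary-size part of that comparison concrete, here is a minimal sketch that sums the on-disk size of compiled extension modules under a directory. It assumes the build type is selected through meson-python's `setup-args` config setting (for example `pip install --editable . --no-build-isolation --config-settings=setup-args=-Dbuildtype=debugoptimized`); that exact invocation is an assumption to verify against the build docs. The helper name `native_extension_size` is hypothetical. With an editable install, the compiled files may live in a separate build directory rather than in the source tree, so pass whichever directory actually holds them.

```python
import sys
from importlib.machinery import EXTENSION_SUFFIXES
from pathlib import Path


def native_extension_size(root):
    """Return (file_count, total_bytes) for compiled extension modules under root."""
    # EXTENSION_SUFFIXES lists the platform's extension-module suffixes,
    # e.g. ".so" on Linux or ".pyd" on Windows.
    suffixes = tuple(EXTENSION_SUFFIXES)
    count = 0
    total = 0
    for path in Path(root).rglob("*"):
        if path.is_file() and path.name.endswith(suffixes):
            count += 1
            total += path.stat().st_size
    return count, total


if __name__ == "__main__":
    # Pass one directory per build to compare, e.g. a release build
    # directory and a debugoptimized build directory.
    for root in sys.argv[1:]:
        count, total = native_extension_size(root)
        print(f"{root}: {count} extension modules, {total / 1e6:.1f} MB")
```

Running it once against each build gives a rough before/after number for the docs; it does not say how much of the difference is debug symbols versus code.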
In particular, it would be interesting to see the impact of this switch when using a profiler such as Linux perf (see this page for the Python 3.12-specific integration) on a Python script that relies heavily on native code (e.g. fitting HistGradientBoostingClassifier, which is mostly Cython).
Similarly, it would be good to check that this works as expected with py-spy's support for profiling native extensions.
From https://github.com/scikit-learn/scikit-learn/pull/29594#issuecomment-2260154987 and https://github.com/scikit-learn/scikit-learn/pull/29594#issuecomment-2260158387: