Closed lukas-w closed 7 years ago
Though I don't have a precise benchmark result, MemoryManager::alloc()
seems to take average >100μs per allocation(depends on how long have LMMS been run), and it gets worse when I keep using LMMS.
IMO new memory manager doesn't strictly need to be lock-free, but it should be fast enough and guarantee suitable worst-case performance.
Though I don't have a precise benchmark result, MemoryManager::alloc() seems to take average >100μs per allocation(depends on how long have LMMS been run), and it gets worse when I keep using LMMS.
I have a benchmark. Using the code in #3882 I exported the same file from the GUI using the default export settings, first with MemoryManager
:
Real Time: 32.93, User Time 67.99, System Time 15.32
Real Time: 35.36, User Time 75.25, System Time 15.18
Real Time: 39.34, User Time 81.92, System Time 16.12
Real Time: 43.00, User Time 88.04, System Time 16.53
See the slowdown on each successive run? Then a version using malloc & free instead:
Real Time: 19.81, User Time 43.14, System Time 14.27
Real Time: 19.73, User Time 42.96, System Time 14.49
Real Time: 19.86, User Time 43.95, System Time 14.73
I was so sure that my malloc/free hacking had left out something important that I listened to the exported wav file but no, it was fine... using MemoryManager the export took around 50% longer and got slower all the time. Kill it. Kill it now. Nuke it from orbit.
I appreciate that the Compiling Wiki docs were updated but FWIW IMO this was a pretty major change to land on the "stable" branch without at least including a note in the commit message that the addition was via a submodule so things wouldn't automatically continue to work.
Or maybe I just have exceptionally bad luck to hit this change, https://github.com/LMMS/lmms/commit/b621c7eb34cc0a1ec6db5e4dd2dad60f2e721f2a and https://github.com/LMMS/lmms/commit/0dbdafc1f82ea0f6cb14a07bad4caab8e49643fc all within a short time affecting both stable
& master
. :/
MemoryManager
causes a performance regression and should be replaced by a faster algorithm or, if we fail to find/implement one, justmalloc
andfree
, which still provides better performance than the current implementation.Here are a couple of existing lock-free allocators I found:
Streamflow (GitHub project).Edit: Implementation is POSIX-only. In case anyone's interested in the technical details, there's a paper describing the algorithm, including benchmarks: http://people.cs.vt.edu/~scschnei/papers/ismm06.pdfRelated issues: #2029, #3312, #3767