The CircularBuffer was significantly updated: it now uses the newly developed AtomicBitset to check for available slots.
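To illustrate the idea of tracking slot availability with atomic bits, here is a minimal sketch. It is not the AtomicBitset added in this PR: the class name SketchAtomicBitset, its member functions, and the assumption that a set bit marks a slot as ready are all hypothetical and chosen only for illustration.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical sketch, NOT the gnuradio4 AtomicBitset: a fixed-size bitset
// whose individual bits can be set, cleared, and tested from several threads.
template<std::size_t N>
class SketchAtomicBitset {
    static_assert(N > 0, "bitset must hold at least one bit");
    static constexpr std::size_t kBitsPerWord = 64;
    static constexpr std::size_t kWords       = (N + kBitsPerWord - 1) / kBitsPerWord;

    std::array<std::atomic<std::uint64_t>, kWords> _words{}; // all bits start cleared

    static constexpr std::uint64_t mask(std::size_t i) noexcept { return std::uint64_t{1} << (i % kBitsPerWord); }

public:
    // Atomically set bit i (assumed meaning: slot i is ready).
    void set(std::size_t i) noexcept { _words[i / kBitsPerWord].fetch_or(mask(i), std::memory_order_release); }

    // Atomically clear bit i (assumed meaning: slot i is free again).
    void reset(std::size_t i) noexcept { _words[i / kBitsPerWord].fetch_and(~mask(i), std::memory_order_release); }

    bool test(std::size_t i) const noexcept { return (_words[i / kBitsPerWord].load(std::memory_order_acquire) & mask(i)) != 0; }

    // Length of the contiguous run of set bits in [begin, end), starting at begin.
    std::size_t countSetFrom(std::size_t begin, std::size_t end) const noexcept {
        std::size_t count = 0;
        for (std::size_t i = begin; i < end && test(i); ++i) {
            ++count;
        }
        return count;
    }
};

// Usage: mark two neighbouring slots as ready and query the run length.
inline void sketchBitsetUsage() {
    SketchAtomicBitset<128> ready;
    ready.set(10);
    ready.set(11);
    assert(ready.countSetFrom(10, 128) == 2);
}
```

Scanning bit by bit keeps the sketch short; a bulk variant would operate on whole 64-bit words at a time.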
Other changes:
Remove CircularBuffer::Writer::publish() and update unit tests accordingly.
Both reserve() and tryReserve() are available. Some functions still use the blocking reserve().
Fix the issue with the unintentionally disabled HAS_POSIX_MAP_INTERFACE.
Update qa_buffer to always test four variants: Multi-Posix, Multi-Portable, Single-Posix, Single-Portable.
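The difference between a blocking reserve() and a non-blocking tryReserve() can be sketched with a small slot allocator. Everything below is an assumption made for illustration: the class SketchSlotAllocator, its release() function, and the return types do not mirror the CircularBuffer::Writer interface.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <optional>
#include <thread>

// Hypothetical slot allocator, for illustration only.
class SketchSlotAllocator {
    std::size_t              _capacity;
    std::atomic<std::size_t> _claimed{0};  // total slots handed out to producers
    std::atomic<std::size_t> _released{0}; // total slots given back by the consumer

public:
    explicit SketchSlotAllocator(std::size_t capacity) : _capacity(capacity) {}

    // Non-blocking: returns the first claimed position, or std::nullopt
    // when fewer than n slots are currently free.
    std::optional<std::size_t> tryReserve(std::size_t n) noexcept {
        std::size_t claimed = _claimed.load(std::memory_order_relaxed);
        for (;;) {
            const std::size_t used = claimed - _released.load(std::memory_order_acquire);
            if (n > _capacity - used) {
                return std::nullopt;
            }
            // On failure, 'claimed' is refreshed with the current value and we retry.
            if (_claimed.compare_exchange_weak(claimed, claimed + n, std::memory_order_acq_rel, std::memory_order_relaxed)) {
                return claimed;
            }
        }
    }

    // Blocking: retries until n slots are free. Never returns if n > capacity.
    std::size_t reserve(std::size_t n) noexcept {
        for (;;) {
            if (auto position = tryReserve(n)) {
                return *position;
            }
            std::this_thread::yield();
        }
    }

    // Consumer side: hand n slots back to the producers.
    void release(std::size_t n) noexcept { _released.fetch_add(n, std::memory_order_release); }
};

// Usage: tryReserve() reports failure instead of waiting.
inline void sketchAllocatorUsage() {
    SketchSlotAllocator allocator(4);
    assert(allocator.tryReserve(4).has_value());
    assert(!allocator.tryReserve(1).has_value()); // buffer is full
}
```

In this sketch the blocking variant is just a retry loop around the non-blocking one, so a caller that must not stall can use tryReserve() and handle the empty result itself.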
Known issues:
The warning "implicit conversion changes signedness: 'signed_index_type' (aka 'long') to 'std::size_t' (aka 'unsigned long')" will be fixed when signed_index_type is changed to index_type in one of the next PRs.
reserve() vs. tryReserve()
bm_Buffer test does not work with tryReserve()
Protection against std::size_t overflow; this can be implemented together with changing signed_index_type from signed to unsigned
Extend AtomicBitset with bulk operations
Add _readMinCursorCached to improve performance
Review Sequence::compareAndSet()
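For the overflow-protection item above, one common approach with unsigned, monotonically increasing positions is to rely on modular subtraction. The sketch below is only an assumption about how such protection could look; the helper names slotsBetween and isBefore are hypothetical and do not exist in the code base.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>

// Hypothetical helpers for wrap-around-safe position arithmetic.

// Number of slots between a read and a write position. Unsigned subtraction
// is defined modulo 2^N, so the result stays correct after writePos wraps.
constexpr std::size_t slotsBetween(std::size_t writePos, std::size_t readPos) noexcept {
    return writePos - readPos;
}

// True if position a lies strictly before position b, assuming the two are
// less than half the value range apart.
constexpr bool isBefore(std::size_t a, std::size_t b) noexcept {
    const std::size_t forward = b - a;
    return forward != 0 && forward <= std::numeric_limits<std::size_t>::max() / 2;
}

// Usage: the write position has wrapped past zero, the read position has not.
inline void sketchWrapAroundUsage() {
    constexpr std::size_t maxValue = std::numeric_limits<std::size_t>::max();
    const std::size_t     readPos  = maxValue - 1;
    const std::size_t     writePos = readPos + 4; // wraps around to 2
    assert(writePos == 2);
    assert(slotsBetween(writePos, readPos) == 4);
    assert(isBefore(readPos, writePos));
}
```

A direct comparison such as readPos < writePos would give the wrong answer in the usage example, which is why the helpers compare distances instead of raw values.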
Special thanks to @RalphSteinhagen and @wirew0rm for their help and discussions during work on this PR.
Performance results:
All micro-benchmarks passed:
┌──────────────────benchmark:──────────────────┬──────┬──CPU branch misses───┬─ops/s─┬──mean──┬─<CPU-I>─┬────CPU cache misses────┬─stddev─┬─#N─┬─CTX-SW─┬─total time─┬──min───┬─median─┬──max───┐
│ │ SKIP │ │ │ │ │ │ │ │ │ │ │ │ │
│ POSIX: 1 producers -< 1 >-> 1 consumers │ PASS │ 59.8k / 519k = 11.5% │ 25.7M │ 39 ms │ 302m │ 11.9k / 101k = 11.8% │ 5 ms │ 8 │ 105 │ 312 ms │ 36 ms │ 37 ms │ 53 ms │
│ POSIX: 1 producers -< 1 >-> 2 consumers │ PASS │ 75.7k / 666k = 11.4% │ 18.8M │ 53 ms │ 387m │ 5.35k / 100.0k = 5.4% │ 709 us │ 8 │ 112 │ 425 ms │ 52 ms │ 53 ms │ 55 ms │
│ POSIX: 1 producers -< 1 >-> 4 consumers │ PASS │ 71.6k / 617k = 11.6% │ 12.3M │ 82 ms │ 359m │ 18.5k / 122k = 15.2% │ 3 ms │ 8 │ 120 │ 653 ms │ 79 ms │ 81 ms │ 88 ms │
│ POSIX: 2 producers -< 1 >-> 1 consumers │ PASS │ 70.8k / 586k = 12.1% │ 5.9M │ 170 ms │ 341m │ 37.1k / 154k = 24.1% │ 22 ms │ 8 │ 135 │ 1 s │ 136 ms │ 182 ms │ 192 ms │
│ POSIX: 2 producers -< 1 >-> 2 consumers │ PASS │ 74.5k / 641k = 11.6% │ 5.6M │ 178 ms │ 372m │ 15.1k / 132k = 11.5% │ 10 ms │ 8 │ 138 │ 1 s │ 153 ms │ 182 ms │ 185 ms │
│ POSIX: 2 producers -< 1 >-> 4 consumers │ PASS │ 85.4k / 741k = 11.5% │ 5.6M │ 179 ms │ 431m │ 8.81k / 118k = 7.4% │ 1 ms │ 8 │ 136 │ 1 s │ 177 ms │ 178 ms │ 181 ms │
│ POSIX: 4 producers -< 1 >-> 1 consumers │ PASS │ 88.4k / 744k = 11.9% │ 5.1M │ 195 ms │ 433m │ 35.7k / 162k = 22.1% │ 3 ms │ 8 │ 149 │ 2 s │ 189 ms │ 196 ms │ 199 ms │
│ POSIX: 4 producers -< 1 >-> 2 consumers │ PASS │ 97.6k / 852k = 11.5% │ 5.3M │ 190 ms │ 496m │ 8.51k / 142k = 6.0% │ 10 ms │ 8 │ 161 │ 2 s │ 178 ms │ 194 ms │ 202 ms │
│ POSIX: 4 producers -< 1 >-> 4 consumers │ PASS │ 142k / 1.24M = 11.4% │ 3.5M │ 289 ms │ 725m │ 12.1k / 222k = 5.5% │ 6 ms │ 8 │ 267 │ 2 s │ 275 ms │ 291 ms │ 293 ms │
├──────────────────────────────────────────────┼──────┼──────────────────────┼───────┼────────┼─────────┼────────────────────────┼────────┼────┼────────┼────────────┼────────┼────────┼────────┤
│ POSIX: 1 producers -<1024>-> 1 consumers │ PASS │ 33.2k / 298k = 11.1% │ 3.3G │ 299 us │ 173m │ 1.59k / 10.8k = 14.8% │ 582 us │ 8 │ 9 │ 2 ms │ 24 us │ 25 us │ 2 ms │
│ POSIX: 1 producers -<1024>-> 2 consumers │ PASS │ 32.4k / 297k = 10.9% │ 10.1G │ 99 us │ 173m │ 1.43k / 7.32k = 19.5% │ 150 us │ 8 │ 3 │ 789 us │ 40 us │ 43 us │ 496 us │
│ POSIX: 1 producers -<1024>-> 4 consumers │ PASS │ 52.6k / 454k = 11.6% │ 3.9G │ 253 us │ 265m │ 1.56k / 15.5k = 10.1% │ 212 us │ 8 │ 12 │ 2 ms │ 155 us │ 156 us │ 801 us │
│ POSIX: 2 producers -<1024>-> 1 consumers │ PASS │ 65.0k / 561k = 11.6% │ 219M │ 5 ms │ 328m │ 6.44k / 62.4k = 10.3% │ 48 us │ 8 │ 64 │ 36 ms │ 4 ms │ 5 ms │ 5 ms │
│ POSIX: 2 producers -<1024>-> 2 consumers │ PASS │ 59.2k / 511k = 11.6% │ 217M │ 5 ms │ 298m │ 6.63k / 62.0k = 10.7% │ 19 us │ 8 │ 64 │ 37 ms │ 5 ms │ 5 ms │ 5 ms │
│ POSIX: 2 producers -<1024>-> 4 consumers │ PASS │ 71.1k / 614k = 11.6% │ 224M │ 4 ms │ 359m │ 4.78k / 54.5k = 8.8% │ 261 us │ 8 │ 64 │ 36 ms │ 4 ms │ 5 ms │ 5 ms │
│ POSIX: 4 producers -<1024>-> 1 consumers │ PASS │ 67.8k / 586k = 11.6% │ 315M │ 3 ms │ 342m │ 5.20k / 55.5k = 9.4% │ 541 us │ 8 │ 57 │ 25 ms │ 3 ms │ 3 ms │ 5 ms │
│ POSIX: 4 producers -<1024>-> 2 consumers │ PASS │ 72.8k / 629k = 11.6% │ 313M │ 3 ms │ 367m │ 4.20k / 51.4k = 8.2% │ 508 us │ 8 │ 57 │ 26 ms │ 3 ms │ 3 ms │ 5 ms │
│ POSIX: 4 producers -<1024>-> 4 consumers │ PASS │ 67.1k / 583k = 11.5% │ 329M │ 3 ms │ 340m │ 4.27k / 55.7k = 7.7% │ 21 us │ 8 │ 56 │ 24 ms │ 3 ms │ 3 ms │ 3 ms │
├──────────────────────────────────────────────┼──────┼──────────────────────┼───────┼────────┼─────────┼────────────────────────┼────────┼────┼────────┼────────────┼────────┼────────┼────────┤
│ portable: 1 producers -< 1 >-> 1 consumers │ PASS │ 69.7k / 605k = 11.5% │ 28.0M │ 36 ms │ 353m │ 12.3k / 103k = 11.9% │ 777 us │ 8 │ 104 │ 286 ms │ 35 ms │ 36 ms │ 37 ms │
│ portable: 1 producers -< 1 >-> 2 consumers │ PASS │ 66.9k / 530k = 12.6% │ 17.0M │ 59 ms │ 308m │ 47.1k / 146k = 32.4% │ 6 ms │ 8 │ 112 │ 472 ms │ 53 ms │ 57 ms │ 67 ms │
│ portable: 1 producers -< 1 >-> 4 consumers │ PASS │ 69.7k / 606k = 11.5% │ 8.2M │ 122 ms │ 352m │ 7.67k / 114k = 6.7% │ 4 ms │ 8 │ 128 │ 973 ms │ 112 ms │ 124 ms │ 126 ms │
│ portable: 2 producers -< 1 >-> 1 consumers │ PASS │ 77.6k / 533k = 14.5% │ 6.5M │ 155 ms │ 310m │ 106k / 223k = 47.5% │ 13 ms │ 8 │ 129 │ 1 s │ 142 ms │ 151 ms │ 187 ms │
│ portable: 2 producers -< 1 >-> 2 consumers │ PASS │ 71.1k / 625k = 11.4% │ 5.5M │ 183 ms │ 364m │ 7.00k / 122k = 5.7% │ 3 ms │ 8 │ 136 │ 1 s │ 177 ms │ 185 ms │ 187 ms │
│ portable: 2 producers -< 1 >-> 4 consumers │ PASS │ 86.5k / 759k = 11.4% │ 5.5M │ 182 ms │ 442m │ 4.40k / 117k = 3.7% │ 4 ms │ 8 │ 142 │ 1 s │ 178 ms │ 182 ms │ 188 ms │
│ portable: 4 producers -< 1 >-> 1 consumers │ PASS │ 86.2k / 755k = 11.4% │ 5.1M │ 198 ms │ 439m │ 19.2k / 138k = 13.9% │ 6 ms │ 8 │ 155 │ 2 s │ 186 ms │ 201 ms │ 206 ms │
│ portable: 4 producers -< 1 >-> 2 consumers │ PASS │ 93.6k / 815k = 11.5% │ 4.9M │ 205 ms │ 474m │ 8.79k / 140k = 6.3% │ 3 ms │ 8 │ 163 │ 2 s │ 202 ms │ 204 ms │ 210 ms │
│ portable: 4 producers -< 1 >-> 4 consumers │ PASS │ 137k / 1.21M = 11.4% │ 3.6M │ 275 ms │ 703m │ 13.2k / 224k = 5.9% │ 23 ms │ 8 │ 264 │ 2 s │ 217 ms │ 284 ms │ 292 ms │
├──────────────────────────────────────────────┼──────┼──────────────────────┼───────┼────────┼─────────┼────────────────────────┼────────┼────┼────────┼────────────┼────────┼────────┼────────┤
│ portable: 1 producers -<1024>-> 1 consumers │ PASS │ 58.8k / 503k = 11.7% │ 2.3G │ 430 us │ 294m │ 1.63k / 17.3k = 9.4% │ 581 us │ 8 │ 16 │ 3 ms │ 154 us │ 288 us │ 2 ms │
│ portable: 1 producers -<1024>-> 2 consumers │ PASS │ 59.4k / 510k = 11.6% │ 3.4G │ 294 us │ 298m │ 1.55k / 15.0k = 10.3% │ 221 us │ 8 │ 14 │ 2 ms │ 153 us │ 155 us │ 799 us │
│ portable: 1 producers -<1024>-> 4 consumers │ PASS │ 57.3k / 491k = 11.7% │ 3.2G │ 315 us │ 287m │ 1.91k / 20.4k = 9.3% │ 68 us │ 8 │ 17 │ 3 ms │ 288 us │ 289 us │ 495 us │
│ portable: 2 producers -<1024>-> 1 consumers │ PASS │ 64.7k / 564k = 11.5% │ 146M │ 7 ms │ 329m │ 2.36k / 65.8k = 3.6% │ 33 us │ 8 │ 72 │ 55 ms │ 7 ms │ 7 ms │ 7 ms │
│ portable: 2 producers -<1024>-> 2 consumers │ PASS │ 62.5k / 546k = 11.4% │ 145M │ 7 ms │ 318m │ 1.97k / 62.2k = 3.2% │ 40 us │ 8 │ 72 │ 55 ms │ 7 ms │ 7 ms │ 7 ms │
│ portable: 2 producers -<1024>-> 4 consumers │ PASS │ 72.5k / 630k = 11.5% │ 146M │ 7 ms │ 367m │ 1.67k / 64.8k = 2.6% │ 22 us │ 8 │ 72 │ 55 ms │ 7 ms │ 7 ms │ 7 ms │
│ portable: 4 producers -<1024>-> 1 consumers │ PASS │ 76.1k / 660k = 11.5% │ 224M │ 4 ms │ 385m │ 1.88k / 57.8k = 3.3% │ 237 us │ 8 │ 65 │ 36 ms │ 4 ms │ 5 ms │ 5 ms │
│ portable: 4 producers -<1024>-> 2 consumers │ PASS │ 74.2k / 645k = 11.5% │ 225M │ 4 ms │ 377m │ 2.75k / 54.2k = 5.1% │ 1 ms │ 8 │ 64 │ 36 ms │ 3 ms │ 5 ms │ 7 ms │
│ portable: 4 producers -<1024>-> 4 consumers │ PASS │ 75.9k / 659k = 11.5% │ 218M │ 5 ms │ 385m │ 3.10k / 55.5k = 5.6% │ 66 us │ 8 │ 65 │ 37 ms │ 5 ms │ 5 ms │ 5 ms │
└──────────────────────────────────────────────┴──────┴──────────────────────┴───────┴────────┴─────────┴────────────────────────┴────────┴────┴────────┴────────────┴────────┴────────┴────────┘
The bm_Buffer benchmark test indicated significant performance degradation for the multi-producer case following this commit: https://github.com/fair-acc/gnuradio4/commit/5bd3eed14076427b1d50997ae6c2a77863876121.