Open pierricgimmig opened 8 months ago
No, but it's implemented using the C++11 memory model and should work on all CPU architectures.
On Sat, Nov 11, 2023 at 2:55 PM Pierric Gimmig @.***> wrote:
Has the queue been tested on ARM? Thanks!
The queue works fine on Apple Silicon M2 (ARM64). However, when I benchmark it and compare it to an Intel CPU, I get surprising results:
When there is no contention at all, the ARM version runs twice as fast as the Intel CPU. But with high contention, it's the reverse, and even worse: the ARM is almost 10 times slower. I have no explanation for this.
I even tried replacing compare_exchange_strong with compare_exchange_weak (which I think is sufficient, given that there is a loop), but that does not help.
Using memory_order_acq_rel instead of the default memory_order_seq_cst for the CAS does not improve performance either (by the way, I wonder if using memory_order_seq_cst as the default is normal).
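For anyone who wants to try the same two variations, here is a minimal sketch of a CAS retry loop that uses compare_exchange_weak with explicit memory orders. It is not the queue's actual code: `claim_ticket`, `counter`, and `limit` are hypothetical names standing in for a bounded head/tail index.

```cpp
#include <atomic>
#include <cstddef>

// Hypothetical stand-in for claiming a slot index; NOT MPMCQueue's implementation.
// Returns true and stores the claimed index in `ticket` if counter < limit.
inline bool claim_ticket(std::atomic<std::size_t>& counter,
                         std::size_t limit,
                         std::size_t& ticket) {
    std::size_t cur = counter.load(std::memory_order_acquire);
    while (cur < limit) {
        // compare_exchange_weak may fail spuriously, which is acceptable
        // because we retry in a loop. On failure, `cur` is updated to the
        // value currently stored in `counter`.
        if (counter.compare_exchange_weak(cur, cur + 1,
                                          std::memory_order_acq_rel,    // success
                                          std::memory_order_acquire)) { // failure
            ticket = cur;
            return true;
        }
    }
    return false;  // counter already reached the limit
}
```

On x86 the weak and strong forms usually compile to the same locked instruction, so a difference, if any, would be expected mainly on LL/SC-style architectures; whether it shows up depends on the compiler and the target's atomic instructions.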
It would be great if other people could do the same kind of test.
Hi @Philippe91! Would you like to share your test program? Thanks!
would you like to share your test program?
It's not possible due to dependencies on various parts of my framework. However, you would get results similar to those I got on my three computers. What would be interesting, on the other hand, are different benchmarks.
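Since the original test program can't be shared, here is a rough, framework-free sketch of a contention benchmark that others could run on both ARM and x86. It does not use the queue at all: it only hammers a single shared atomic with a CAS loop, which is an assumption about where the contention cost comes from. `run_contended` and `BenchResult` are hypothetical names.

```cpp
#include <atomic>
#include <chrono>
#include <cstddef>
#include <thread>
#include <vector>

struct BenchResult {
    std::size_t total;  // final counter value (threads * iters if correct)
    double seconds;     // wall-clock time from release of the start flag to last join
};

// Hypothetical helper: every thread performs `iters` successful CAS increments
// on one shared counter, so all threads contend on the same cache line.
inline BenchResult run_contended(unsigned threads, std::size_t iters) {
    std::atomic<std::size_t> counter{0};
    std::atomic<bool> go{false};
    std::vector<std::thread> workers;
    workers.reserve(threads);

    for (unsigned t = 0; t < threads; ++t) {
        workers.emplace_back([&] {
            // Wait for the start flag so all threads begin together.
            while (!go.load(std::memory_order_acquire)) {
                std::this_thread::yield();
            }
            for (std::size_t i = 0; i < iters; ++i) {
                std::size_t cur = counter.load(std::memory_order_relaxed);
                // Retry until our increment wins; `cur` is refreshed on failure.
                while (!counter.compare_exchange_weak(cur, cur + 1,
                                                      std::memory_order_acq_rel,
                                                      std::memory_order_relaxed)) {
                }
            }
        });
    }

    auto start = std::chrono::steady_clock::now();
    go.store(true, std::memory_order_release);
    for (auto& w : workers) {
        w.join();
    }
    auto stop = std::chrono::steady_clock::now();

    return {counter.load(), std::chrono::duration<double>(stop - start).count()};
}
```

Comparing `run_contended(1, n)` against `run_contended(k, n)` for a few thread counts `k` on each machine would give a no-contention versus high-contention ratio. Build with optimizations and thread support (for example `-O2 -pthread`), and treat the numbers as indicative only, since a bare counter is not the same workload as a queue.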