long latency when push times are much bigger than queue size.

colderleo commented 2 years ago

Hi

I tried a test pushing 1000000 times into the queue, and when the queue size is small like 64 or 1024, the latency is quite long. and when the queue size is big like 65535, the latency becomes low.

queue size 1024, push 10000 times, per use 0.3 us. queue size 1024, push 1000000 times, per use 13 us

    using namespace chrono;

    vector<thread> threads;
    rigtorp::MPMCQueue<int *> q(1024);
    atomic<int> push_times{0};
    atomic<int> pop_times{0};

    {
        steady_clock::time_point t1 = steady_clock::now();
        for(int i=0; i<Consumer_num; ++i) {
            threads.emplace_back(
                [&] {
                    while(pop_times.load()<test_size) {
                        int* temp;
                        q.pop(temp);
                        pop_times.fetch_add(1);
                    }
                }
            );
        }
        while(push_times<test_size) {
            int* temp;
            q.push(temp);
            push_times.fetch_add(1);
        }
        steady_clock::time_point t2 = steady_clock::now();
        duration<double, micro> time_span = (duration<double, micro>)(t2 - t1);

        this_thread::sleep_for(chrono::microseconds(500));
        cout<<"rigtorp::MPMCQueue: push " << test_size << " data, per cost us:" << (time_span).count()/test_size <<endl;
        if(push_times.load() != pop_times.load()) {
            cout<<"warning: push_times=" << push_times.load() << ", pop_times=" << pop_times.load() <<endl;
        }
    }
    exit(0);

rigtorp commented 2 years ago

If I understand correctly, you have one producer and multiple consumers.

Queue size of 16 is very small. There is a trade off between memory usage and (latency and throughput). Your benchmark includes a lot of overhead from launching threads etc, please take a look at how my benchmarks are constructed to avoid that.

On Fri, Oct 29, 2021 at 10:06 AM colderleo @.***> wrote:

Hi

I tried a 1000000 times push test, when the queue size is small like 16, the latency is quite long. and when the queue size is big like 65535, the latency becomes low.

queue size 1024, push 10000 times, per use 0.3 us. queue size 1024, push 1000000 times, per use 13 us
using namespace chrono;

vector<thread> threads;
rigtorp::MPMCQueue<int *> q(1024);
atomic<int> push_times{0};
atomic<int> pop_times{0};

{
    steady_clock::time_point t1 = steady_clock::now();
    for(int i=0; i<Consumer_num; ++i) {
        threads.emplace_back(
            [&] {
                while(pop_times.load()<test_size) {
                    int* temp;
                    q.pop(temp);
                    pop_times.fetch_add(1);
                }
            }
        );
    }
    while(push_times<test_size) {
        int* temp;
        q.push(temp);
        push_times.fetch_add(1);
    }
    steady_clock::time_point t2 = steady_clock::now();
    duration<double, micro> time_span = (duration<double, micro>)(t2 - t1);

    this_thread::sleep_for(chrono::microseconds(500));
    cout<<"rigtorp::MPMCQueue: push " << test_size << " data, per cost us:" << (time_span).count()/test_size <<endl;
    if(push_times.load() != pop_times.load()) {
        cout<<"warning: push_times=" << push_times.load() << ", pop_times=" << pop_times.load() <<endl;
    }
}
exit(0);
— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/rigtorp/MPMCQueue/issues/32, or unsubscribe https://github.com/notifications/unsubscribe-auth/AABLO2ZXHUZAPPXAL22LKADUJJIX7ANCNFSM5G6WVAYA .

colderleo commented 2 years ago

Thank you!

Sorry I have 4 consumers and 1 producer.

Are your benchmarks in MPMCQueueExample.cpp?

I can't understand why push times effect the latency. As I have only 4 consumers and 1 producer, and the queue size is 1024, which is much bigger than consumer number. In this situation, 10000 push times is good, but 1000000 will take very long latency .

Besides, as the program may run for very long time, I can't predict how many push times it will need. So how shold I define the queue size?

I read the source code and it is very fabulous! Avoiding many threads racing on one atomic variable, produce can be parallel and no need to wait others, which I haven't seen on other queue. So I think it should have better performance.

rigtorp commented 2 years ago

No I should publish them at some point.

Do you have more threads than cores? This queue more or less requires that you have a thread per core architecture with each thread pinned to a core. Maybe in a future version I will support futex for efficiently waiting when queue is full or empty.

On Mon, Nov 1, 2021 at 2:45 AM colderleo @.***> wrote:

Thank you!

Sorry I have 4 consumers and 1 producer.

Are your benchmarks in MPMCQueueExample.cpp?

I can't understand why push times effect the latency. As I have only 4 consumers and 1 producer, and the queue size is 1024, which is much bigger than consumer number. In this situation, 10000 push times is good, but 1000000 will take very long latency .

Besides, as the program may run for very long time, I can't predict how many push times it will need. So how shold I define the queue size?

I read the source code and it is very fabulous! avoiding many threads racing on one atomic variable, produce can be parallel and no need to wait others, which I haven't seen on other queue.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/rigtorp/MPMCQueue/issues/32#issuecomment-955855497, or unsubscribe https://github.com/notifications/unsubscribe-auth/AABLO23JD56U6LEP4UTW3ZDUJXWLBANCNFSM5G6WVAYA .

colderleo commented 2 years ago

Yes, I have 4 cores and 5 threads.

I'll try bind cpu later, thank you!

rigtorp / MPMCQueue

long latency when push times are much bigger than queue size. #32