lpereira / lwan

Experimental, scalable, high performance HTTP server
https://lwan.ws
GNU General Public License v2.0
5.94k stars 548 forks source link

Thread scheduler is broken on linux #352

Closed pontscho closed 1 year ago

pontscho commented 1 year ago

Hi,

in that case the threads variable isn't zero, connections will point to bad thread descriptors. The problematic code snippet:

        n_threads = (uint32_t)lwan_nextpow2((size_t)((l->thread.count - 1) * 2));
        schedtbl = alloca(n_threads * sizeof(uint32_t));

        adj_affinity = topology_to_schedtbl(l, schedtbl, n_threads);

        n_threads--; /* Transform count into mask for AND below */

        for (unsigned int i = 0; i < total_conns; i++)
            l->conns[i].thread = &l->thread.threads[schedtbl[i & n_threads]];

and

for (unsigned int i = 0; i < l->thread.count; i++) {
    ...
}

For example if the l->thread.count contains one, the n_threads will be also one before the loop, but that for loop in the first snippet will reorder connections between thread #0 and #1 but that loop in the second snippet will initialise just thread #0. And after an http call will get this message:

2680608 lwan-thread.c:815 accept_waiting_clients() Could not add file descriptor 9 to epoll set 0. Dropping connection: Invalid argument (error number 22)

(I used to reproduce this problem fresh master and your hello example.)

Same thing happening if I try initialise two workers, that loop in the first snippet will use 4 contexts but second snippet will start just 2 threads.

I think that lwan_nextpow2() call isn't necessary there, without this rounding (n_threads = l->thread.count) everything works well.

pontscho

lpereira commented 1 year ago

I've made some changes that should fix this; could you please take a look?

On Tue, Jan 10, 2023, at 2:15 PM, pontscho wrote:

Hi,

in that case the threads variable isn't zero, connections will point to bad thread descriptors. The problematic code snippet:

` n_threads = (uint32_t)lwan_nextpow2((size_t)((l->thread.count - 1) 2)); schedtbl = alloca(n_threads sizeof(uint32_t));

    adj_affinity = topology_to_schedtbl(l, schedtbl, n_threads);

    n_threads--; /* Transform count into mask for AND below */

    for (unsigned int i = 0; i < total_conns; i++)
        l->conns[i].thread = &l->thread.threads[schedtbl[i & n_threads]];

` and

for (unsigned int i = 0; i < l->thread.count; i++) { ... } For example if the l->thread.count contains one, the n_threads will be also one, but that for loop in the first snippet will reorder connections between thread #0 and #1 https://github.com/lpereira/lwan/pull/1 but that loop in the second snippet will run just thread #0. And after an http call will get this message:

2680608 lwan-thread.c:815 accept_waiting_clients() Could not add file descriptor 9 to epoll set 0. Dropping connection: Invalid argument (error number 22) (I used for reproduce this problem fresh master and your hello example.)

I think that lwan_nextpow2() call isn't necessary there, without this rounding everything works well.

pontscho

— Reply to this email directly, view it on GitHub https://github.com/lpereira/lwan/issues/352, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAADVGPCH2Z4Z4FOVUDHR73WRVVJRANCNFSM6AAAAAATW3ITS4. You are receiving this because you are subscribed to this thread.Message ID: @.***>

pontscho commented 1 year ago

Hi,

it looks okay and works well.

Thanks for quick response!