SerenityOS / serenity

The Serenity Operating System 🐞
https://serenityos.org
BSD 2-Clause "Simplified" License
30.41k stars 3.18k forks source link

Kernel: `Processor::smp_wake_n_idle_processors` use over complicated copy ? #11424

Open mhermier opened 2 years ago

mhermier commented 2 years ago

Hi, I think there is an issue in the following code, detected while looking at bits functions: https://github.com/SerenityOS/serenity/blob/986d63466e50445b3d6df119da50532472f36219/Kernel/Arch/x86/common/Processor.cpp#L779-L784 When extracted to https://godbolt.org/z/rTqxascfK and ran it shows that idle_mask is always copied to found_mask in an over-engineered way. I don't fully understand what was the original intention, but I think this code can be simplified.

tomuta commented 2 years ago

IIRC it's supposed to only move idle_count number of bits from idle_mask to found_mask. In other words, it should only clear idle_count bits in idle_mask and set those bits in found_mask.

mhermier commented 2 years ago

Unless I'm mistaken this should be equivalent to:

u32 found_mask = 0;
swap(idle_mask, found_mask);
tomuta commented 2 years ago

I don't think that's true. This doesn't take idle_count into account. If idle_mask has more bits set than specified by idle_count then you end up with found_mask having too many bits set and idle_mask having no bits set at all.

mhermier commented 2 years ago

If idle_mask has a different number of bits set than idle_count has than it would be a volatile or a CPU error, since u32 idle_count = popcount(idle_mask);

tomuta commented 2 years ago

Ah yes I see what you mean. Actually, I think the loop condition is wrong. We should probably limit idle_count to a maximum of did_wake_count - wake_count. Something like this before the for loop:

if (idle_count > did_wake_count - wake_count)
    idle_count = did_wake_count - wake_count;

Otherwise there's a good chance we wake a whole lot more than we actually need. Maybe we can simplify this logic altogether to make it more straight forward.

mhermier commented 2 years ago

With such logic it would make sense, though I think it would be better to wake up CPU in a pseudo randomized way (to avoid thermal issues). I also mean, there are scenarios were you want to reserve some cores for dedicated tasks, or want to avoid some slow/fast cores on arch that have that. I known it is not really applicable to x86* (yet) but I think that the logic should be abstracted since only the apic part is arch dependent.