dougbinks / enkiTS

A permissively licensed C and C++ Task Scheduler for creating parallel programs. Requires C++11 support.
zlib License

Stuttering on Intel hybrid CPUs #99

Closed Liemarzac closed 11 months ago

Liemarzac commented 1 year ago

Hello,

I would like to get some thoughts on stalls that we see when using enkiTS on the new Intel hybrid CPUs. These are made of two different types of cores: Performance cores (P-Cores) and Efficient cores (E-Cores). Very often, we see game frames stall because some enkiTS threads are preempted for a very long time or are not woken up in time for execution. Unfortunately, our game becomes unplayable when this happens.

Here is a capture taken with the Superluminal profiler which highlights one of the stalls:

[image: hybrid_stall]

The red rectangles are the parts of the game frame where we execute parallel tasks using enkiTS. We can see a big bubble in one of them because some threads were preempted (blue sections), which stalled the thread that scheduled the work until the "stolen" threads were scheduled to execute again. The CPU used for this capture is an i9-13900K with 8 P-Cores (16 logical processors) and 16 E-Cores.

We have tried multiple changes to avoid the stalls. We tried to:

dougbinks commented 1 year ago

I haven't used Superluminal so could you let me know what the red, green and blue-ish colours mean (I can guess but would rather know for sure). Could you also label the threads which are on P cores and E cores?

Liemarzac commented 1 year ago

Sure thing.

Red = In synchronisation (waiting for an event to happen to continue execution)
Green = Execution
Blue = Preempted

In this capture, each row is a thread.

This capture was taken before we tried setting soft affinities to P-Cores for the enkiTS threads, so at that point they could run on any type of core. However, even when they use soft affinities for P-Cores, I am not able to see which core a thread is currently executing on. It might be possible to feed custom data to Superluminal to display this information, so I will have a look.

dougbinks commented 1 year ago

On Windows, GetCurrentProcessorNumber can be used to get the current processor number. This might change during a thread's execution.
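As a rough illustration of the GetCurrentProcessorNumber suggestion, the sketch below samples the logical processor at the start and end of a piece of work so that both values can be attached to a profiler marker. `CurrentLogicalProcessor`, `ProcessorSample` and `RunAndSampleProcessor` are hypothetical helper names, not part of enkiTS or Superluminal, and the `sched_getcpu` branch is only there so the sketch also builds on Linux. The number alone does not say whether a processor is a P-Core or an E-Core; that mapping has to be obtained separately.

```cpp
#include <utility>

#if defined(_WIN32)
#include <windows.h>
#elif defined(__linux__)
#include <sched.h>
#endif

// Hypothetical helper: returns the logical processor the calling thread is
// running on at this instant, or -1 where no query is available.
// On Windows the value is relative to the thread's processor group.
inline int CurrentLogicalProcessor()
{
#if defined(_WIN32)
    return static_cast<int>( GetCurrentProcessorNumber() );
#elif defined(__linux__)
    return sched_getcpu();
#else
    return -1;
#endif
}

// Processor seen at the start and at the end of a piece of work.
struct ProcessorSample
{
    int atStart;
    int atEnd;

    // Only compares the two endpoints: a thread that moved away and came
    // back in between is not detected.
    bool Migrated() const { return atStart != atEnd; }
};

// Runs work() on the calling thread and records where it started and ended.
template <typename Work>
ProcessorSample RunAndSampleProcessor( Work&& work )
{
    ProcessorSample sample;
    sample.atStart = CurrentLogicalProcessor();
    std::forward<Work>( work )();
    sample.atEnd = CurrentLogicalProcessor();
    return sample;
}
```

Wrapping the body of a task in `RunAndSampleProcessor` and logging the result next to the task's marker would show whether the stalled tasks start or finish on different processors.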

Have you tried disabling E-Cores in the BIOS and running the same code which stalls? From your description I am not sure that the issue is caused by the hybrid processor; instead I think it is being exaggerated by it.

> The top thread is not meaningful.

Do you mean it is not an enkiTS thread? What is the purple state it is in most of the time?

Can you expand the threads to show the tasks they are running? If you cannot show this information in public then you can email me at doug@enkisoftware.com.

Instrumenting the ProfilerCallbacks to show the waitFor* callbacks along with your own task start/stop will be useful in helping diagnose what is happening.
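A minimal sketch of that instrumentation, assuming the callback signature `void (*)( uint32_t threadnum_ )` that enkiTS uses for the members of `ProfilerCallbacks` (worth checking against the TaskScheduler.h in use). The recorder (`WaitEvent`, `Record`, `RecordedEventCount` and the `On...` functions) is hypothetical; in practice each callback would forward to the profiler's own begin/end marker API instead of writing to a buffer. The registration lines are left as comments so the block compiles without enkiTS.

```cpp
#include <atomic>
#include <chrono>
#include <cstddef>
#include <cstdint>

// One start or stop marker emitted by a scheduler thread.
struct WaitEvent
{
    uint32_t    threadNum;
    const char* label;
    bool        isStart;
    long long   timeNs;
};

const std::size_t        kMaxEvents = 1u << 16;
WaitEvent                g_events[ kMaxEvents ];
std::atomic<std::size_t> g_numClaimed( 0 );

// Claims a slot without locking; events past the end of the buffer are dropped.
void Record( uint32_t threadNum, const char* label, bool isStart )
{
    const std::size_t slot = g_numClaimed.fetch_add( 1, std::memory_order_relaxed );
    if( slot >= kMaxEvents ) { return; }
    const auto now = std::chrono::steady_clock::now().time_since_epoch();
    const long long ns = static_cast<long long>(
        std::chrono::duration_cast<std::chrono::nanoseconds>( now ).count() );
    g_events[ slot ] = { threadNum, label, isStart, ns };
}

// Number of stored events. Read the buffer only once the threads are idle.
std::size_t RecordedEventCount()
{
    const std::size_t claimed = g_numClaimed.load( std::memory_order_acquire );
    return claimed < kMaxEvents ? claimed : kMaxEvents;
}

// Callbacks matching the assumed enkiTS ProfilerCallbackFunc signature.
void OnWaitForNewTaskSuspendStart( uint32_t t )      { Record( t, "waitForNewTaskSuspend", true ); }
void OnWaitForNewTaskSuspendStop( uint32_t t )       { Record( t, "waitForNewTaskSuspend", false ); }
void OnWaitForTaskCompleteStart( uint32_t t )        { Record( t, "waitForTaskComplete", true ); }
void OnWaitForTaskCompleteStop( uint32_t t )         { Record( t, "waitForTaskComplete", false ); }
void OnWaitForTaskCompleteSuspendStart( uint32_t t ) { Record( t, "waitForTaskCompleteSuspend", true ); }
void OnWaitForTaskCompleteSuspendStop( uint32_t t )  { Record( t, "waitForTaskCompleteSuspend", false ); }

// Registration with enkiTS, written from memory of its header:
//
//   enki::TaskSchedulerConfig config;
//   config.profilerCallbacks.waitForNewTaskSuspendStart      = OnWaitForNewTaskSuspendStart;
//   config.profilerCallbacks.waitForNewTaskSuspendStop       = OnWaitForNewTaskSuspendStop;
//   config.profilerCallbacks.waitForTaskCompleteStart        = OnWaitForTaskCompleteStart;
//   config.profilerCallbacks.waitForTaskCompleteStop         = OnWaitForTaskCompleteStop;
//   config.profilerCallbacks.waitForTaskCompleteSuspendStart = OnWaitForTaskCompleteSuspendStart;
//   config.profilerCallbacks.waitForTaskCompleteSuspendStop  = OnWaitForTaskCompleteSuspendStop;
//   g_TS.Initialize( config );
```

With these markers in the capture, a long red patch can be split into time spent actively waiting for task completion and time spent suspended, which helps tell a late wake-up apart from a preempted worker.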

The stall appears to be about 8x longer than the other red patches. This is much longer than I would expect to see from having the slow path switch from a P-core to an E-core. At one point 5 threads are being pre-empted whilst 7 threads are inactive. This is also very odd, and I would take a look at overall CPU utilization and see if there is something else running.

You might find VTune is helpful in exploring what is going on as this has Hybrid CPU Analysis. VTune can be downloaded for free: https://www.intel.com/content/www/us/en/developer/tools/oneapi/vtune-profiler.html

Liemarzac commented 1 year ago

> Have you tried disabling E-Cores in the BIOS and running the same code which stalls? From your description I am not sure that the issue is caused by the hybrid processor; instead I think it is being exaggerated by it.

Yes, we have seen the problem disappear when E-Cores are disabled in the BIOS.

> Do you mean it is not an enkiTS thread? What is the purple state it is in most of the time?

It is a thread I should have cropped from my screenshot because it is not meaningful for the problem we see. The purple indicates it is sleeping.

> Can you expand the threads to show the tasks they are running? If you cannot show this information in public then you can email me at doug@enkisoftware.com.

I will provide more information on this privately, but I can say that it is executing physics code in parallel at the time it stalls.

> Instrumenting the ProfilerCallbacks to show the waitFor* callbacks along with your own task start/stop will be useful in helping diagnose what is happening.

I will add markers to give more visibility on this.

> The stall appears to be about 8x longer than the other red patches. This is much longer than I would expect to see from having the slow path switch from a P-core to an E-core. At one point 5 threads are being pre-empted whilst 7 threads are inactive. This is also very odd, and I would take a look at overall CPU utilization and see if there is something else running.

Some players report that they were using VR at the time; others report the same problem without any specific app running in the background.

> You might find VTune is helpful in exploring what is going on as this has Hybrid CPU Analysis

It's a good shout.

dougbinks commented 1 year ago

It might be worth trying to replicate the problem with a simple example such as the enkiTSMicroprofileExample.cpp.

My understanding of the profile image is that only about 3-7 of the CPU's roughly 32 logical processors are available for computation. All of them should be available, unless the work being given to enkiTS can only be split up into ~10 or so subtasks (ranges). This might indicate that the real problem is that some other process is taking up the majority of CPU time.

You could also try setting the process priority to an increased level such as HIGH_PRIORITY_CLASS prior to initializing enkiTS, or running a pinned task on every enkiTS thread (including the scheduling thread) to call SetThreadPriority with a high priority.
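The two experiments could be sketched roughly as follows. `RaiseProcessPriority` and `RaiseCurrentThreadPriority` are hypothetical helper names, `THREAD_PRIORITY_HIGHEST` is just one possible choice of "high priority", and on non-Windows builds both helpers do nothing and return false. The pinned-task hook-up is shown as a comment, written from memory of the enkiTS header, so the block compiles without enkiTS.

```cpp
#if defined(_WIN32)
#include <windows.h>
#endif

// Raises the whole process to HIGH_PRIORITY_CLASS. Intended to be called
// before the task scheduler is initialized. Returns true on success.
inline bool RaiseProcessPriority()
{
#if defined(_WIN32)
    return SetPriorityClass( GetCurrentProcess(), HIGH_PRIORITY_CLASS ) != 0;
#else
    return false; // Windows-only sketch: nothing to do elsewhere
#endif
}

// Raises the priority of the calling thread only. Returns true on success.
inline bool RaiseCurrentThreadPriority()
{
#if defined(_WIN32)
    return SetThreadPriority( GetCurrentThread(), THREAD_PRIORITY_HIGHEST ) != 0;
#else
    return false; // Windows-only sketch: nothing to do elsewhere
#endif
}

// Running the per-thread call on every enkiTS thread via pinned tasks
// (assumed API, check against the TaskScheduler.h in use):
//
//   struct RaisePriorityTask : enki::IPinnedTask
//   {
//       void Execute() override { RaiseCurrentThreadPriority(); }
//   };
//
//   for( uint32_t t = 0; t < g_TS.GetNumTaskThreads(); ++t )
//   {
//       RaisePriorityTask task;
//       task.threadNum = t;          // includes the scheduling thread
//       g_TS.AddPinnedTask( &task );
//       g_TS.WaitforTask( &task );   // wait so the task outlives its execution
//   }
```

Either change is only a diagnostic step: if the stalls disappear at a higher priority, that points towards other work competing for the same processors rather than a problem inside the scheduler.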

dougbinks commented 12 months ago

Have you made any progress in investigating this?

I don't currently have a hybrid CPU to perform any testing, and I don't have enough information to make an educated guess as to what's going on here.

dougbinks commented 11 months ago

I'm going to close this issue as I've not heard back.

If you have further information and are still experiencing the issue please do re-open.