LibreELEC / LibreELEC.tv

Just enough OS for KODI
http://libreelec.tv
2.26k stars 1.12k forks source link

[BUG] High Idle CPU Utilisation on Tanix TX6 #6157

Open heitbaum opened 2 years ago

heitbaum commented 2 years ago

Problem

# top -H is showing that the Allwinner H6 chip in the Tanix TX6 is running high utilisation (even though there is no active processing.)

image

doing an strace -p on the kodi.bin process shows the following.

 (37 times) - clock_nanosleep(CLOCK_REALTIME, 0, {tv_sec=0, tv_nsec=10000000}, NULL) = 0
   ppoll([{fd=11, events=POLLIN}], 1, {tv_sec=0, tv_nsec=0}, NULL, 0) = 0 (Timeout)
   futex(0x1bb3908, FUTEX_WAIT_BITSET_PRIVATE, 0, {tv_sec=76306, tv_nsec=595425620}, FUTEX_BITSET_MATCH_ANY) = -1 ETIMEDOUT (Connection timed out)
   futex(0xfb2f880, FUTEX_WAKE_PRIVATE, 1) = 0
   pselect6(14, [13], NULL, NULL, {tv_sec=0, tv_nsec=0}, NULL) = 0 (Timeout)
 (37 times) - clock_nanosleep(CLOCK_REALTIME, 0, {tv_sec=0, tv_nsec=10000000}, NULL) = 0
   ppoll([{fd=11, events=POLLIN}], 1, {tv_sec=0, tv_nsec=0}, NULL, 0) = 0 (Timeout)
   futex(0x1bb3908, FUTEX_WAIT_BITSET_PRIVATE, 0, {tv_sec=76306, tv_nsec=595425620}, FUTEX_BITSET_MATCH_ANY) = -1 ETIMEDOUT (Connection timed out)
   futex(0xfb2f880, FUTEX_WAKE_PRIVATE, 1) = 0
   pselect6(14, [13], NULL, NULL, {tv_sec=0, tv_nsec=0}, NULL) = 0 (Timeout)
...

This processing should result in IDLE and a CPU of Zero.

Testing to date

I have tested the following:

None of these updates / changes has made any appreciable difference / fix,

Discovery

In gdb - thread 1 - is looping on the below (so assuming that this is the "CPU load we are seeing".

tx6:~ # gdb -p 9288
GNU gdb (GDB) 11.1
...
Attaching to process 9288
...
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/libthread_db.so.1".
0x0000ffffa9aa8830 in clock_nanosleep () from /usr/lib/libc.so.6
(gdb) bt
#0  0x0000ffffa9aa8830 in clock_nanosleep () from /usr/lib/libc.so.6
#1  0x0000ffffa9aad158 in nanosleep () from /usr/lib/libc.so.6
#2  0x0000ffffa9ad2a50 in usleep () from /usr/lib/libc.so.6
#3  0x0000000000bb3b0c in CGraphicContext::Flip(bool, bool) ()
#4  0x0000000000f4a79c in ?? ()
#5  0x0000000000fcf214 in CXBApplicationEx::Run(CAppParamParser const&) ()
#6  0x0000000000719f2c in main ()

Reading

lrusak commented 2 years ago

Maybe it's not waiting for vsync, you can try extending the sleep (make it 40) here: https://github.com/xbmc/xbmc/blob/master/xbmc/windowing/gbm/WinSystemGbmGLESContext.cpp#L148

jernejsk commented 2 years ago

Did you try perf top? It should show which function spends most time executing. However, I'm not sure if this command requires debug symbols or not. I believe it would be easier to debug this issue with debug symbols enabled, regardless of method. Sadly, size of Kodi's debug symbols exploded in the past and it's not feasible to load them in debugger directly on target.

heitbaum commented 2 years ago

Maybe it's not waiting for vsync, you can try extending the sleep (make it 40) here: https://github.com/xbmc/xbmc/blob/master/xbmc/windowing/gbm/WinSystemGbmGLESContext.cpp#L148

@lrusak - looks like you found the line of code to manipulate it - first go. 👍

6AAF7CE3-3286-44C3-BE4A-76D2B5671299

jernejsk commented 2 years ago

@lrusak where is syscall which waits on vsync located? I'll try to debug DRM driver to see what could be improved.

heitbaum commented 2 years ago

Did you try perf top? It should show which function spends most time executing. However, I'm not sure if this command requires debug symbols or not. I believe it would be easier to debug this issue with debug symbols enabled, regardless of method. Sadly, size of Kodi's debug symbols exploded in the past and it's not feasible to load them in debugger directly on target.

@jernejsk - I did now - as the below - it settled on this as the top:

67910AB4-D11E-464E-9664-D82415B3AFE0

jernejsk commented 2 years ago

@heitbaum Can you remove https://github.com/xbmc/xbmc/blob/master/xbmc/windowing/gbm/WinSystemGbm.cpp#L275-L278 and https://github.com/xbmc/xbmc/blob/master/xbmc/windowing/gbm/WinSystemGbm.h#L74 (with no other modification) and see if this changes something?

heitbaum commented 2 years ago

@heitbaum Can you remove https://github.com/xbmc/xbmc/blob/master/xbmc/windowing/gbm/WinSystemGbm.cpp#L275-L278 and https://github.com/xbmc/xbmc/blob/master/xbmc/windowing/gbm/WinSystemGbm.h#L74 (with no other modification) and see if this changes something?

Compiling up now 👍

As a FYI - I booted back into an arm32 userland image (from the 15th Jan) - pretty much the same - 5.16.1-rc1. This is without the https://github.com/xbmc/xbmc/blob/master/xbmc/windowing/gbm/WinSystemGbmGLESContext.cpp#L148 and CPU running @ 20% image

jernejsk commented 2 years ago

Nothing changed much in kernel and Kodi (for rendering) for some time now. I guess this would be long time present bug, just nobody really dig into it.

heitbaum commented 2 years ago

Here is the output with your patch: CPU between 6.6-7.8%. Think you are right that it has been there for a while. I only noticed that it was on my desk for a few days and was looking into the temp/freq/governor and subsequently load issues.

--- a/xbmc/windowing/gbm/WinSystemGbmGLESContext.cpp    2022-01-23 05:52:41.460933783 +0000
+++ b/xbmc/windowing/gbm/WinSystemGbmGLESContext.cpp    2022-01-23 05:52:09.374274477 +0000
@@ -145,7 +145,7 @@
   }
   else
   {
-    KODI::TIME::Sleep(10);
+    KODI::TIME::Sleep(40);
   }
 }

--- a/xbmc/windowing/gbm/WinSystemGbm.cpp       2022-01-13 22:29:43.000000000 +0000
+++ b/xbmc/windowing/gbm/WinSystemGbm.cpp       2022-01-23 08:34:41.571847518 +0000
@@ -267,11 +267,6 @@
     resource->OnLostDisplay();
 }

-std::unique_ptr<CVideoSync> CWinSystemGbm::GetVideoSync(void* clock)
-{
-  return std::make_unique<CVideoSyncGbm>(clock);
-}
-
 std::vector<std::string> CWinSystemGbm::GetConnectedOutputs()
 {
   return m_DRM->GetConnectedConnectorNames();
--- a/xbmc/windowing/gbm/WinSystemGbm.h 2022-01-13 22:29:43.000000000 +0000
+++ b/xbmc/windowing/gbm/WinSystemGbm.h 2022-01-23 08:35:53.161579968 +0000
@@ -71,8 +71,6 @@
 protected:
   void OnLostDevice();

-  std::unique_ptr<CVideoSync> GetVideoSync(void* clock) override;
-
   std::shared_ptr<CDRMUtils> m_DRM;
   std::unique_ptr<CGBMUtils> m_GBM;
   std::shared_ptr<CVideoLayerBridge> m_videoLayerBridge;

0FAC1594-34B4-45AD-8F24-220E53728585

jernejsk commented 2 years ago

As I said, you should not do any other change. KODI::TIME::Sleep(40); defeats the purpose. I'm interesting if my change also lowers the load.

heitbaum commented 2 years ago

Hi @jernejsk - doesn’t seem to reduce without the kodi::TIME::Sleep(40);

5853E1B8-65B5-4CBE-B708-12C196C6ACED

jernejsk commented 2 years ago

what is CPU load? perf top shows only most called/time consuming functions, not actual load.

heitbaum commented 2 years ago

Sitting at 21-25% CPU

E5AF3277-F216-49BC-9BCE-98CD15FEFBA0

jernejsk commented 2 years ago

Thanks for testing! Hopefully @lrusak knows what to check/fix.

lrusak commented 2 years ago

The atomic commit itself should be blocking as we don't use the asynchronous flag.

There is also some info in this old PR: https://github.com/xbmc/xbmc/pull/14266