SuperTux / supertux

SuperTux source code
https://supertux.org
GNU General Public License v3.0
2.51k stars 481 forks source link

SuperTux eats 100% of CPU while in menu #1977

Open szborows opened 2 years ago

szborows commented 2 years ago

master / Linux agu 5.10.0-9-amd64 #1 SMP Debian 5.10.70-1 (2021-09-30) x86_64 GNU/Linux

Expected behavior

no 100% CPU utilization (one thread)

Actual behavior

100%

Steps to reproduce actual behavior

Open game, play, and hit ESC to pause the game while the level is loaded.

Additional debugging information

if I will have more time I will provide some probe-type profiler info

szborows commented 2 years ago

I'm going to treat GitHub comments as chat from now on. This is because I have limited time, it's very late and I'm unsure when I will come back to this. I also hope that if I fail somebody else can start from the point I ended up provided information I communicate here.

I looked into CMakeFile and found use of CMAKE_CXX_FLAGS_PROFILE.

I tried cmake -DCMAKE_BUILD_TYPE=profile .. but no gmon.out file was generated after running the game. I know I'm doing something wrong, since profiling flags are in CMake file...

Semphriss commented 2 years ago

Since you said you have limited time and in case you're still here: -DCMAKE_BUILD_TYPE=Profile with uppercase P according to here.

I'm going to investigate on this.

Semphriss commented 2 years ago

What CPU model do you have?

szborows commented 2 years ago

@Semphriss it's Intel(R) Core(TM) i5-6200U CPU @ 2.30GHz

I created a paste with contents of /proc/cpuinfo here: https://pastebin.com/QcvL4CSu

szborows commented 2 years ago

Since you said you have limited time and in case you're still here: -DCMAKE_BUILD_TYPE=Profile with uppercase P according to here.

The absolute state of C++ and its ecosystem :smile_cat:

FYI: I purged build dir and re-ran with uppercase P, but still no files were generated that match following GLOBs: *prof*, *gmon*.

EDIT: game seems to be a little bit more laggy, so I think it actually changed something. I will look into this. I also tried --developer switch but it did nothing (as I anticipated - it was blind shot).

szborows commented 2 years ago

I used make VERBOSE=1 to see what CMake actually does. The -pg flag is there. I'm gonna check if -g3 or something like that is required too.

And also I'm gonna go see GCC docs what exactly -pg does.

/usr/bin/c++ -DBOOST_ALL_NO_LIB -DBOOST_ATOMIC_DYN_LINK -DBOOST_CHRONO_DYN_LINK -DBOOST_DATE_TIME_DYN_LINK -DBOOST_FILESYSTEM_DYN_LINK -DBOOST_LOCALE_DYN_LINK -DBOOST_SYSTEM_DYN_LINK -DBOOST_THREAD_DYN_LINK -DGLM_ENABLE_EXPERIMENTAL -I/home/I_hid_my_ass/repo/supertux/build -I/home/I_hid_my_ass/repo/supertux/src -isystem /home/I_hid_my_ass/repo/supertux/build/physfs/include -isystem /usr/include/AL -isystem /home/I_hid_my_ass/repo/supertux/build/squirrel/ex/include -isystem /home/I_hid_my_ass/repo/supertux/build/tinygettext/include -isystem /home/I_hid_my_ass/repo/supertux/build/SDL_ttf/include/SDL2 -isystem /home/I_hid_my_ass/repo/supertux/external/findlocale -isystem /home/I_hid_my_ass/repo/supertux/external/obstack -isystem /home/I_hid_my_ass/repo/supertux/external/sexp-cpp/include -isystem /home/I_hid_my_ass/repo/supertux/external/SDL_SavePNG -isystem /home/I_hid_my_ass/repo/supertux/external/partio_zip -isystem /usr/include/SDL2 -isystem /usr/lib -Wall -Wextra -Wno-unused-parameter -funit-at-a-time -fno-strict-aliasing   -pg -std=c++14 -o CMakeFiles/supertux2_lib.dir/src/audio/stream_sound_source.cpp.o -c /home/I_hid_my_ass/repo/supertux/src/audio/stream_sound_source.cpp
szborows commented 2 years ago

FYI

hid@myass:~/repo/supertux/build$ nm supertux2 | grep -i gmon
                 w __gmon_start__
szborows commented 2 years ago

I found out that for 'hello, world' cmake following works, but still one has to pass extra params, so instead of cmake .. I had to type cmake -DCMAKE_CXX_FLAGS=-pg -DCMAKE_EXE_LINKER_FLAGS=-pg -DCMAKE_SHARED_LINKER_FLAGS=-pg ..

gmon.out is generated. I'm compiling supertux with such invocation right now. Will let know what happens ;)

==> a.cpp <==
int main() {}

==> CMakeLists.txt <==
cmake_minimum_required (VERSION 2.8.11)
project (HELLO)
set(CMAKE_CXX_FLAGS_PROFILE "-pg" CACHE STRING "Profile flags")
set(CMAKE_C_FLAGS_PROFILE "-pg" CACHE STRING "Profile flags")
set(CMAKE_LD_FLAGS_PROFILE "-pg" CACHE STRING "Profile flags")
add_executable (helloDemo a.cpp)

EDIT: yes I know, these lines in CMakeLists.txt are superfluous

szborows commented 2 years ago

I have gmon.out for SuperTux! Yay! :) Let's see what's inside in case you didn't solve it yet :)

BTW I noticed that not only while being in menu it eats up 100% CPU. Bad guy or "Nolok"

szborows commented 2 years ago

I cannot upload gmon.out file here unfortunately. I tried to upload results of gprof supertux2 gmon.out > ImpfenMachtFrei.txt | xclip but PasteBin didn't accept it (because of its size).

I will either find some solution to upload these files or analyze it on myself.

In general thanks for being involved!

Semphriss commented 2 years ago

I'm not really familiar with performance analysis :) Let me know if you find possible improvements, they would be very welcome.

Semphriss commented 2 years ago

@szborows Any news about this?

szborows commented 2 years ago

The only thing I managed to do is to pinpoint one of the biggest consumers of time using kcachegrind.

I am still willing to work on this. I need to better understand what the rendering engine does in that place, though.

szborows commented 2 years ago

@Semphriss I think I'm at the point where I need some help or at least guidance. My findings so far are:

First one is that it actually doesn't matter whether you are in menu screen or not - CPU consumption stays at similar high level.

GLPainter::draw_texture accounts for 35% of execution, and vector::insert is used inside it, which accounts for 19.5%. This looked a little bit odd, so I tried to optimize it by replacing insertion mechanism with memcpy. It was a blind shot.

screenshot

The changes look like following diff (it's just an excerpt).

-      auto uvs_lst = {
+      const float uvs_lst[] = {
         uv_left, uv_top,
         uv_right, uv_top,
         uv_right, uv_bottom,
@@ -123,7 +124,9 @@ GLPainter::draw_texture(const TextureRequest& request)
         uv_left, uv_top,
         uv_right, uv_bottom,
       };
-      m_uvs.insert(m_uvs.end(), std::begin(uvs_lst), std::end(uvs_lst));
+
+      memcpy(m_vertices.data() + (i * vertices_lst_len), vertices_lst, vertices_lst_len * sizeof(float));
+      memcpy(m_uvs.data() + (i * vertices_lst_len), uvs_lst, vertices_lst_len * sizeof(float));

However, by using pidstat I found out that it makes things even a little bit worse. I think this is not dead end. Farting into memory with small portions (12 * sizeof(float)) may contribute to poor performance. Apart from rotated textures this could be circumvented if memory could be passed right away to GL functions. This would obviously require some major rearrangements.

Since such a primitive workaround didn't work, I resorted to at least limiting CPU load while in menu. This is how I started.

diff --git a/src/supertux/game_session.cpp b/src/supertux/game_session.cpp
index 0e0bacdc9..3c5a0e485 100644
--- a/src/supertux/game_session.cpp
+++ b/src/supertux/game_session.cpp
@@ -197,9 +198,15 @@ GameSession::on_escape_press(bool force_quick_respawn)
   }
 }

+bool
+GameSession::is_game_paused() {
+    return m_game_pause;
+}
+
 void
 GameSession::toggle_pause()
 {
   // pause
   if (!m_game_pause && !MenuManager::instance().is_active())
   {
diff --git a/src/supertux/game_session.hpp b/src/supertux/game_session.hpp
index a576fa90a..1dd4f8d82 100644
--- a/src/supertux/game_session.hpp
+++ b/src/supertux/game_session.hpp
@@ -80,6 +80,7 @@ public:
   bool reset_button;
   bool reset_checkpoint_button;

+  bool is_game_paused();
   void toggle_pause();
   void abort_level();
   bool is_active() const;
diff --git a/src/supertux/screen_manager.cpp b/src/supertux/screen_manager.cpp
index 5cd1a0855..dcb8b1160 100644
--- a/src/supertux/screen_manager.cpp
+++ b/src/supertux/screen_manager.cpp
@@ -262,12 +264,16 @@ ScreenManager::draw_player_pos(DrawingContext& context)
 ScreenManager::draw(Compositor& compositor, FPS_Stats& fps_statistics)
 {
   assert(!m_screen_stack.empty());

+  GameSession * session = GameSession::current();
+
   // draw the actual screen
-  m_screen_stack.back()->draw(compositor);
+  if (session && !session->is_game_paused()) {
+    m_screen_stack.back()->draw(compositor);
+  }

   // draw effects and hud
   auto& context = compositor.make_context(true);

It helped w.r.t. CPU load (on my other machine it went down from ~56% to ~15%), but obviously it messes up other stuff in the game:

Hence, this is not an option, I looked for different places in the code where something can be done, but now I understand less than yesterday :smile_cat:

I believe the best option is to make the background screen still. But I have no clue what would be best option to freeze it.

  1. We can calculate big surface out of what was displayed before going into menu and have only one request with one item. This doesn't sound like professional thing to me to be honest. It would also cause troubles when window resize happens.
  2. We can stop processing requests (m_requests) when game is paused and let the painter somehow draw what was in GL buffer last time. However, in the video/ folder there are no references to GameSession and introducing it looks like breaking current architecture and is probably bad idea.
  3. We can limit these requests only to the ones that intersect with bounding box of the menu. That won't give us results we look for, and also it would be shady to future current/contributors.

Please share your thoughts, if you have any.

MrGaming-18113 commented 2 years ago

Im agree jijijija

tobbi commented 3 months ago

Does this happen in the latest nightly as well? https://download.supertux.org/