termux / termux-x11

Termux X11 add-on application.
https://termux.dev
GNU General Public License v3.0

[Bug]: RSS memory of surfaceflinger keeps growing. #522

Closed junrenshi closed 8 months ago

junrenshi commented 11 months ago

Problem description

The RSS memory of surfaceflinger grows (by more than 10 MB) each time one switches from termux-x11 to another app and back, or turns the screen off and back on. After a few days of normal usage, surfaceflinger can consume a few GB of RSS memory, which renders the whole system unstable.

This seems to happen only when termux-x11 is running.

I checked the termux-x11 log. Some errors are notable and may be related to the issue:

12-20 10:13:44.804 29946 30029 E BufferQueueProducer: query: BufferQueue has been abandoned
12-20 10:13:44.804 29946 30029 E BufferQueueProducer: query: BufferQueue has been abandoned
12-20 10:13:44.805 29946 30029 E BufferQueueProducer: SurfaceView[com.termux.x11/com.termux.x11.MainActivity]#25(BLAST Consumer)25 connect: BufferQueue has been abandoned
12-20 10:13:44.846 29946 29992 E BufferQueueProducer: SurfaceView[com.termux.x11/com.termux.x11.MainActivity]#26(BLAST Consumer)26 query: BufferQueue has been abandoned
12-20 10:13:44.847 29946 31356 E BufferQueueProducer: SurfaceView[com.termux.x11/com.termux.x11.MainActivity]#26(BLAST Consumer)26 setAsyncMode: BufferQueue has been abandoned
12-20 10:13:44.848 29946 31356 E BufferQueueProducer: SurfaceView[com.termux.x11/com.termux.x11.MainActivity]#26(BLAST Consumer)26 dequeueBuffer: BufferQueue has been abandoned
12-20 10:13:44.849 29946 31356 E BufferQueueProducer: SurfaceView[com.termux.x11/com.termux.x11.MainActivity]#26(BLAST Consumer)26 query: BufferQueue has been abandoned

x11.log
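To compare how many of these errors each app switch produces, the log can be tallied by the operation that failed. The following is only a sketch: the three sample lines are copied from the excerpt above, while the regular expression and the function name `count_abandoned` are illustrative assumptions, not part of termux-x11.

```python
import re
from collections import Counter

# Three lines copied from the log excerpt in this report.
LOG = """\
12-20 10:13:44.804 29946 30029 E BufferQueueProducer: query: BufferQueue has been abandoned
12-20 10:13:44.805 29946 30029 E BufferQueueProducer: SurfaceView[com.termux.x11/com.termux.x11.MainActivity]#25(BLAST Consumer)25 connect: BufferQueue has been abandoned
12-20 10:13:44.848 29946 31356 E BufferQueueProducer: SurfaceView[com.termux.x11/com.termux.x11.MainActivity]#26(BLAST Consumer)26 dequeueBuffer: BufferQueue has been abandoned
"""

# Capture the operation name that immediately precedes the error text.
ABANDONED = re.compile(
    r"E BufferQueueProducer: .*?(\w+): BufferQueue has been abandoned"
)


def count_abandoned(log_text):
    """Count 'BufferQueue has been abandoned' errors per operation name."""
    counts = Counter()
    for line in log_text.splitlines():
        match = ABANDONED.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    for operation, n in count_abandoned(LOG).items():
        print(operation, n)
```

Running the same function on the log captured around one switch gives a per-switch error count that can be compared between builds.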

I am running Android 12 on a Boox Tab10C Pro. The APK is Nightly Release 20231006, downloaded from GitHub, and the termux package is termux-x11-nightly 1.03.00-3.

What steps will reproduce the bug?

  1. Start termux-x11
  2. In a terminal, check memory usage using "dumpsys meminfo"
  3. Switch to another app, or turn off the screen
  4. Switch back to termux-x11, or turn on screen
  5. Check memory usage again.
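The memory check in steps 2 and 5 can be automated by extracting the surfaceflinger figure from saved "dumpsys meminfo" output. This is a minimal sketch under two assumptions: that the output has a "Total RSS by process:" section with lines of the form "<size>K: <name> (pid <n>)", and that the sample text below (with made-up numbers) resembles it. Adjust the pattern if your device prints something different.

```python
import re

# Assumed line layout inside the per-process sections of `dumpsys meminfo`.
PROCESS_LINE = re.compile(r"^\s*([\d,]+)K:\s+(\S+)\s+\(pid\s+(\d+)")

# Illustrative sample only; the numbers are invented.
SAMPLE = """\
Total RSS by process:
    311,204K: system (pid 1460)
    192,512K: surfaceflinger (pid 612)

Total PSS by process:
    140,003K: system (pid 1460)
     35,870K: surfaceflinger (pid 612)
"""


def rss_kb(meminfo_text, process="surfaceflinger"):
    """Return the RSS in KB for `process`, or None if it is not listed."""
    in_rss_section = False
    for line in meminfo_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("Total RSS by process"):
            in_rss_section = True
            continue
        if in_rss_section and stripped.startswith("Total "):
            break  # the next section has started
        match = PROCESS_LINE.match(line)
        if in_rss_section and match and process in match.group(2):
            return int(match.group(1).replace(",", ""))
    return None


if __name__ == "__main__":
    print(rss_kb(SAMPLE))
```

Recording this value before step 3 and after step 4 gives the growth for one switch cycle.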

What is the expected behavior?

The memory consumption of surfaceflinger does not keep growing, and one can continue running termux-x11 for a long period of time.

twaik commented 10 months ago

Can you please check what exactly triggers clearing surfaceflinger RSS memory? I mean force-stopping com.termux.x11 application from Android UI or killing termux-x11 process.

junrenshi commented 10 months ago

Killing com.termux.x11 does not clear the surfaceflinger RSS memory. Actually, after com.termux.x11 restarts, the memory grows.

The memory is cleared when I stop WM and exit the X server.

twaik commented 10 months ago

What clears surfaceflinger RSS memory? WM stopping or killing X-server (which is started by termux-x11 command)?

junrenshi commented 10 months ago

Killing X-server clears the memory.

The memory is cleared when I stop the X session, i.e., when Termux-X11 App shows a black screen showing "Not connected".

After killing com.termux.x11, com.termux.x11 will restart. In this case, I can still see the X session, and the memory is not cleared.

I can also kill app_process. In this case, the X server will stop, and the memory is cleared.

twaik commented 10 months ago

Again. X session and X server are two different things. X session can not work without X server, but X server can work without X sessions. Killing X session will not kill X server but killing X server will make all X sessions die. Does killing X session (including or excluding WM) affect surfaceflinger? Sorry if it looks like I am mad or rude, it is not so.

twaik commented 10 months ago

com.termux.x11 (which is the Android application) is a pretty dumb client. It only sends the native surface and inputs to the X server, so killing it will not kill the X server or the X session.

junrenshi commented 10 months ago

Thanks for clarifying the terminology. To clear the memory, one needs to kill the X server.

twaik commented 10 months ago

Calling Surface.release() should fix this but as we see here it does not help. Can you please write me in approximately 8 hours?

junrenshi commented 10 months ago

Please look into this issue when you have time. Currently I have to use VNC+XRDP+RDClient(8). Termux-X11 is obviously superior in both speed and keyboard support. If the memory issue can be resolved, it will be an ideal X-Window solution for termux.

twaik commented 10 months ago

Try this build. app-universal-debug.zip

junrenshi commented 10 months ago

Thanks for spending time to investigate and fix the issue. I tried the build and found that the memory still keeps growing.

I did a test. I started termux-x11 and switched between it and RDClient back and forth. Each time I recorded the RSS memory of surfaceflinger by running "dumpsys meminfo | grep surfaceflinger". The following are the RSS memory usages recorded (in MB):

192 (T) -> 182 (R) -> 183 (T) -> 219 (R) -> 210 (T) -> 228 (R) -> 246 (T) -> 264 (R) -> 264 (T) -> 283 (R) -> 311 (T) -> 337 (R) -> 338 (T) -> 374 (R) -> 374 (T)

where (T) and (R) indicate that the memory was checked under Termux-X11 and RDClient, respectively. We can see that the memory still keeps growing. The increases often (but not always) happen when switching from Termux-X11 to RDClient, but sometimes switching from RDClient to Termux-X11 also induces large increases.
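The direction-dependence described here can be checked by summing the change per switching direction. A small sketch using the series recorded above; the function name `growth_by_direction` is only illustrative.

```python
# RSS readings (MB) from the test above, labelled with the app in the
# foreground when the reading was taken: T = Termux-X11, R = RDClient.
SERIES = [
    (192, "T"), (182, "R"), (183, "T"), (219, "R"), (210, "T"),
    (228, "R"), (246, "T"), (264, "R"), (264, "T"), (283, "R"),
    (311, "T"), (337, "R"), (338, "T"), (374, "R"), (374, "T"),
]


def growth_by_direction(series):
    """Sum the RSS change (MB) for each switching direction."""
    totals = {}
    for (prev_mb, prev_app), (cur_mb, cur_app) in zip(series, series[1:]):
        direction = f"{prev_app}->{cur_app}"
        totals[direction] = totals.get(direction, 0) + (cur_mb - prev_mb)
    return totals


if __name__ == "__main__":
    for direction, delta in growth_by_direction(SERIES).items():
        print(direction, delta, "MB")
```

For this series, 143 MB of the 182 MB total growth falls on switches away from Termux-X11 and 39 MB on switches back to it.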

I also find that there are still BufferQueue errors, but the number of errors emitted per switch is significantly smaller than before: it used to be tens, and now it is 3 most of the time, although sometimes it still emits tens of errors. The memory trajectory also seems better. Before, the memory always increased in both switching directions; now it sometimes decreases. This seems to indicate that most of the memory leaks have been plugged, but the remaining ones still cause problems.

The debug log is here: x11.log

Thank you again for your time and effort.

twaik commented 10 months ago

Can you please measure RSS while switching between Termux:X11 and, let's say, Home/Launcher? RDClient can have its own overhead too, so I cannot tell whether the memory is consumed by termux-x11 or by RDClient. The launcher should always be in memory, so it should not cause RSS growth. And there is one more test you can do: just enter Termux:X11 (with termux-x11 running, but without a graphical environment, so you see only the X-shaped cursor), measure RSS, enable screen rotation, and measure RSS again after each screen rotation. 15 measurements should be enough.

junrenshi commented 10 months ago

Yes, I did some additional tests:

(1) Switch between termux-x11 (running i3 and two urxvt windows) and Home: 109 -> 155 -> 155 -> 201 -> 228 -> 274 -> 283 -> 310 -> 337 -> 365 -> 391 -> 438

(2) Switch between an empty termux-x11 and termux (where I check the memory): 146 -> 155 -> 187 -> 223 -> 260 -> 296 -> 337 -> 369 -> 405

(3) Running termux-x11 with i3 and two urxvt windows, turn on/off the screen: 101 -> 128 -> 128 -> 146 -> 164 -> 182 -> 200 -> 236
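To compare the three scenarios on one scale, the average growth per measurement step can be computed from the numbers above. This is just arithmetic on the reported readings; the dictionary labels and the name `mean_step_mb` are illustrative.

```python
# RSS readings (MB) from tests (1)-(3) above.
TESTS = {
    "(1) i3 + urxvt <-> Home": [109, 155, 155, 201, 228, 274, 283, 310, 337, 365, 391, 438],
    "(2) empty termux-x11 <-> termux": [146, 155, 187, 223, 260, 296, 337, 369, 405],
    "(3) i3 + urxvt, screen off/on": [101, 128, 128, 146, 164, 182, 200, 236],
}


def mean_step_mb(series):
    """Average RSS change (MB) between consecutive readings."""
    return (series[-1] - series[0]) / (len(series) - 1)


if __name__ == "__main__":
    for name, series in TESTS.items():
        print(f"{name}: {mean_step_mb(series):.1f} MB per step")
```

The averages come out at roughly 30, 32, and 19 MB per step, so the growth is of similar size whether or not a window manager and clients are running.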

twaik commented 10 months ago

Can you please try checking the memory via ssh from a different device? Switching to termux may have a memory impact too.

twaik commented 10 months ago

Turning the screen on/off may not trigger the creation of new native surfaces; rotating the screen is a much better test.

junrenshi commented 10 months ago

Sure. Now I start an empty Termux:X11, then rotate it and the Home screen. The memory is checked using wireless adb. The memory usages are:
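Checking over adb from another machine, as done here, avoids switching apps on the device just to take a reading. A rough sketch of that approach follows, assuming `adb` is on the PATH and the device is already connected; the serial `192.168.1.20:5555` is a hypothetical placeholder, and the `run` parameter exists only so the command can be exercised without a device.

```python
import subprocess


def read_meminfo(serial=None, run=subprocess.run):
    """Return the text of `dumpsys meminfo` fetched through adb."""
    cmd = ["adb"]
    if serial:
        # For wireless adb the serial is the device's host:port.
        cmd += ["-s", serial]
    cmd += ["shell", "dumpsys", "meminfo"]
    result = run(cmd, capture_output=True, text=True, check=True)
    return result.stdout


def surfaceflinger_lines(meminfo_text):
    """Python equivalent of `grep surfaceflinger` on the output."""
    return [
        line.strip()
        for line in meminfo_text.splitlines()
        if "surfaceflinger" in line
    ]


if __name__ == "__main__":
    # Hypothetical address; replace with your device's host:port.
    for line in surfaceflinger_lines(read_meminfo("192.168.1.20:5555")):
        print(line)
```

Calling this once after each rotation or switch yields a series like the one below without bringing another app to the foreground.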

223 -> 219 -> 246 -> 264 -> 319 -> 337 -> 392 -> 392 -> 428 -> 427 -> 464 -> 490

twaik commented 10 months ago

Rotate only Termux:X11, without going to launcher.

junrenshi commented 10 months ago

I see. Now I rotate the screen (portrait <-> landscape). The memory does not grow in this case.

twaik commented 10 months ago

And now only switch between Termux:X11 and launcher with Home button on screen...

junrenshi commented 10 months ago

When using the Home button to switch, the memory grows:

120 -> 147 -> 173 -> 179 -> 214 -> 229

JanuszChmiel commented 9 months ago

How can I run dumpsys meminfo | grep surfaceflinger when I am running Ubuntu under PRoot? The command cannot be executed. Is it possible to start the needed memory monitoring service in user mode? If I have understood correctly, you are using some virtual machine run with QEMU. Maybe you are using a new Google Pixel phone with Kqemu and a special new CPU. But I will at least test memory allocation and system stability in the following way: I will leave the MATE desktop environment running for a whole night. If my phone becomes unstable during the night, the screen reader will speak something and wake me so I can examine what happened, provided the Android kernel does not perform an automatic reboot. For now, the built-in Huawei MUI service that monitors free memory does not report to me that I have 400 MB of 4 GB free. So I will be patient and keep testing too.

JanuszChmiel commented 9 months ago

I have tried running Termux-x11 with the AArch64 edition of Ubuntu under PRoot. I ran the MATE desktop environment with the Orca screen reader and turned my screen off for a whole night. mate-session and the other MATE background tasks kept running in the background. There was no chaos during the night and no problems. Orca, eSpeak and Speech Dispatcher worked perfectly. In the morning I could even start Tuner, an Internet radio app written in Vala, and listen to my favourite 80s music. So I think Termux-x11 is a reliable piece of software. Mr. Twaik, very well done. Every Android version is specific. Phone manufacturers can modify some parts of Android. Even the Android kernel can differ from the original fetched from the Android source code repository, because every manufacturer has a team of Android kernel C developers. So I think Huawei may include some advanced memory management functions, so that the memory allocations of surfaceflinger are not so dangerous for system stability. I will test your Termux-x11 on Android 11 if Termux has repositories with precompiled packages for Android 11.

junrenshi commented 9 months ago

Thank you for your test. However,

(1) You have to have a rooted device to run the command "dumpsys meminfo | grep surfaceflinger".

(2) The memory increases when you switch from Termux:X11 to another app or the home screen and then switch back, or when you turn the screen off and on while running Termux:X11. After a number of switching cycles, the memory consumption of surfaceflinger becomes significantly larger than what it started with. The memory does not increase when just using Termux:X11 (it is actually very stable and usable) or leaving it idle. Given your test method, it is not surprising that you did not observe the problem.

BTW, the latest release does not solve the issue.

twaik commented 8 months ago

@junrenshi Please, check https://github.com/termux/termux-x11/actions/runs/7949739741 behaviour.

twaik commented 8 months ago

Ok, you can check it later. I do not see any difference on my devices, but the way I implemented it is a bit better than how it was implemented before.

junrenshi commented 8 months ago

Great! The build fixes the issue. It now runs perfectly. The memory consumed by surfaceflinger is now roughly constant.

Thanks for the great work!