Open jwinarske opened 1 year ago
@X-Terminator Thanks for the update.
For our cases we run x64 containers, x64 desktop (Fedora/Ubuntu), and aarch64 targets (Yocto dunfell/kirkstone). We have seen no sign of a problem similar to the one you report.
There are many variables in a problem like this. To narrow it down, the usual first step is to come up with a minimal repro: the smallest possible scenario that replicates the problem.
Note that I am prepping the OSS release for ivi-homescreen -> flutter-auto. I should have it completed by end of day Wednesday (US).
So, between the two, my recommendation is to update to the latest release when it comes out, then attempt to repro your stability problem.
Also, the upstream authority for ivi-homescreen/flutter-auto is https://github.com/toyota-connected/ivi-homescreen. So if the issue is related to flutter-auto only, it's best to raise an issue there.
@X-Terminator Another data point: flutter-auto is ivi-homescreen with some changes specific to AGL, which won't get picked up by Toyota. Unless you need the AGL-Compositor, you would just use the ivi-homescreen recipe.
@X-Terminator I have a repro on a TI SK-TDA4VM (j721e) (Arago tisdk-image-base, libweston-10_10.0.2, libgcc1_11.3.0). It took about three days to hit. Which hardware/BSP, Wayland compositor, and weston/gcc versions hit the issue on your side?
(gdb) info threads
Id Target Id Frame
* 1 Thread 0xffffb9a42020 (LWP 2413) "homescreen" 0x0000ffffb957c96c in ?? () from /lib/libc.so.6
2 Thread 0xffffb92ff0a0 (LWP 2414) "homescreen" 0x0000ffffb957c96c in ?? () from /lib/libc.so.6
3 Thread 0xffffb57580a0 (LWP 2415) "homescreen" 0x0000ffffb957c96c in ?? () from /lib/libc.so.6
4 Thread 0xffffae9ee0a0 (LWP 2416) "io.flutter.ui" 0x0000ffffb95e7dc4 in epoll_pwait () from /lib/libc.so.6
5 Thread 0xffffae1de0a0 (LWP 2417) "homescreen" 0x0000ffffb95de1a0 in poll () from /lib/libc.so.6
6 Thread 0xffffad9ce0a0 (LWP 2418) "io.flutter.io" 0x0000ffffb95e7dc4 in epoll_pwait () from /lib/libc.so.6
7 Thread 0xffff9ffff0a0 (LWP 2419) "io.worker.1" 0x0000ffffb957c96c in ?? () from /lib/libc.so.6
8 Thread 0xffff9f7ef0a0 (LWP 2420) "io.worker.2" 0x0000ffffb957c96c in ?? () from /lib/libc.so.6
9 Thread 0xffffad1be0a0 (LWP 2421) "dart:io EventHa" 0x0000ffffb95e7dc4 in epoll_pwait () from /lib/libc.so.6
(gdb) bt 10
#0 0x0000ffffb957c96c in ?? () from /lib/libc.so.6
#1 0x0000ffffb957f698 in pthread_cond_wait () from /lib/libc.so.6
#2 0x0000ffffb9986c84 in wl_display_read_events () from /usr/lib/libwayland-client.so.0
#3 0x0000aaaac8aefea8 in ?? ()
#4 0x0000ffffb952b230 in ?? () from /lib/libc.so.6
#5 0x0000ffffb952b30c in __libc_start_main () from /lib/libc.so.6
#6 0x0000aaaac8af3670 in ?? ()
@mv0 Does this ring a bell?
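For anyone comparing dumps from several devices, the `info threads` listing above can be summarized by the function each thread is stopped in. The following is only a rough sketch: the regular expression is an assumption based on the line layout pasted in this thread, and `group_threads_by_frame` is a hypothetical helper name, not part of gdb or flutter-auto.

```python
import re
from collections import Counter

# A few lines copied from the `info threads` output above.
SAMPLE = '''\
* 1 Thread 0xffffb9a42020 (LWP 2413) "homescreen" 0x0000ffffb957c96c in ?? () from /lib/libc.so.6
4 Thread 0xffffae9ee0a0 (LWP 2416) "io.flutter.ui" 0x0000ffffb95e7dc4 in epoll_pwait () from /lib/libc.so.6
5 Thread 0xffffae1de0a0 (LWP 2417) "homescreen" 0x0000ffffb95de1a0 in poll () from /lib/libc.so.6
6 Thread 0xffffad9ce0a0 (LWP 2418) "io.flutter.io" 0x0000ffffb95e7dc4 in epoll_pwait () from /lib/libc.so.6
'''

# Assumed layout: [*] <id> Thread <addr> (LWP <n>) "<name>" <pc> in <func> () ...
THREAD_RE = re.compile(
    r'^\*?\s*(?P<id>\d+)\s+Thread\s+\S+\s+\(LWP\s+(?P<lwp>\d+)\)\s+'
    r'"(?P<name>[^"]*)"\s+\S+\s+in\s+(?P<func>\S+)\s+\(',
    re.MULTILINE,
)


def group_threads_by_frame(text):
    """Count threads per top-frame function name ('??' means no symbol)."""
    return dict(Counter(m["func"] for m in THREAD_RE.finditer(text)))


if __name__ == "__main__":
    for func, count in sorted(group_threads_by_frame(SAMPLE).items()):
        print(f"{func}: {count} thread(s)")
```

Threads reported as `??` have no symbol for the top frame, so installing debug symbols for libc would make this grouping more informative.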
In my case it looks like a TI/Imagination GPU driver crash, as the GPU is reported as powered off. The screen shows a frozen image; the display output block is still being clocked.
------[ RGX Info ]------
Device Node (Info): 0000000040e4e319 (0000000037090564)
DevmemHistoryRecordStats - None
RGX BVNC: 22.104.208.318 (rogue)
RGX Device State: Active
RGX Power State: OFF
FW info: 23.1 @ 6404501 (release) build options: 0x80000810
TRP: HW support - No
WGP: HW support - No
RGX FW State: OK (HWRState 0x00000001: HWR OK;)
RGX FW Power State: RGXFWIF_POW_OFF (APM enabled: 2227406 ok, 10314 denied, 13 non-idle, 4373450 retry, 0 other, 6611196 total. Latency: 100 ms)
RGX DVFS: 0 frequency changes. Current frequency: 749.971 MHz (sampled at 219991527322094 ns). FW frequency: 100.000 MHz.
RGX FW OS 0 - State: active; Freelists: Ok; Priority: 0; Isolation group: 0; MTS off;
Number of HWR: GP(0/0+0), 2D(0/0+0), TA(3/3+0), 3D(0/0+0), CDM(0/0+0), FALSE(0,0,0,0,0)
DM 0 (GP)
DM 1 (HWRflags 0x00000000: working;)
DM 2 (HWRflags 0x00000000: working;)
Recovery 1: PID = 2413 / homescreen, frame = 68857, HWRTData = 0xC002A280, EventStatus = 0x00004400, Guilty Lockup
CRTimer = 0x00000003729A, OSTimer = 54608.178283060, CyclesElapsed = 48265984
PreResetTimeInCycles = 38912, HWResetTimeInCycles = 20480, FreelistReconTimeInCycles = 5344256, TotalRecoveryTimeInCycles = 5403648
Recovery 2: PID = 2413 / homescreen, frame = 98258, HWRTData = 0xC002A180, EventStatus = 0x00000600, Innocent Lockup
CRTimer = 0x00000000BFBF, OSTimer = 55344.178872365, CyclesElapsed = -9140480
PreResetTimeInCycles = 47872, HWResetTimeInCycles = 18944, FreelistReconTimeInCycles = 169472, TotalRecoveryTimeInCycles = 236288
BIF0 - FAULT:
* MMU status (0x0000000000001041): PC = 1, Page Size = 0 (Page Catalog).
* Request (0x00008b0000000000): TA (PPP Context State), Writing to 0x0000000000.
PC index (0) out of bounds (0)
Recovery 3: PID = 2413 / homescreen, frame = 4595599, HWRTData = 0xC002E640, EventStatus = 0x00004400, Guilty Lockup
CRTimer = 0x00000E3ACB47, OSTimer = 170198.506674416, CyclesElapsed = 47176960
PreResetTimeInCycles = 39424, HWResetTimeInCycles = 19456, FreelistReconTimeInCycles = 425216, TotalRecoveryTimeInCycles = 484096
DM 3 (HWRflags 0x00000000: working;)
DM 4 (HWRflags 0x00000000: working;)
RGX Kernel CCB WO:0x74 RO:0x74
RGX Firmware CCB WO:0x1C RO:0x1C
RGX Kernel CCB commands executed = 28507380
RGX SLR: Forced UFO updates requested = 0
RGX Errors: WGP:0, TRP:0
Thread0: FW IRQ count = 39815913
Last sampled IRQ count in LISR = 39815913
FW System config flags = 0x00020000 (Ctx switch options: Medium CSW profile;)
FW OS config flags = 0x0000000F (Ctx switch: TDM; GEOM; 3D; CDM;)
(!) RGX power is down. No registers dumped
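To compare dumps like the one above across devices or over time, the power state and the recovery entries can be pulled out with a small script. This is a minimal sketch, assuming the dump has been saved as text with the same line layout as pasted here; `summarize_rgx_dump` is a hypothetical helper, and the regular expressions are inferred from this one dump rather than from any driver documentation.

```python
import re

# Lines copied from the RGX debug dump above.
SAMPLE = """\
RGX Power State: OFF
Recovery 1: PID = 2413 / homescreen, frame = 68857, HWRTData = 0xC002A280, EventStatus = 0x00004400, Guilty Lockup
Recovery 2: PID = 2413 / homescreen, frame = 98258, HWRTData = 0xC002A180, EventStatus = 0x00000600, Innocent Lockup
Recovery 3: PID = 2413 / homescreen, frame = 4595599, HWRTData = 0xC002E640, EventStatus = 0x00004400, Guilty Lockup
"""

# Assumed line formats, inferred from the dump in this thread.
POWER_RE = re.compile(r"RGX Power State: (?P<state>\w+)")
RECOVERY_RE = re.compile(
    r"Recovery (?P<n>\d+): PID = (?P<pid>\d+) / (?P<name>\S+), "
    r"frame = (?P<frame>\d+), .*?, (?P<verdict>Guilty|Innocent) Lockup"
)


def summarize_rgx_dump(text):
    """Return the reported power state and the list of recovery entries."""
    power = POWER_RE.search(text)
    recoveries = [
        {
            "n": int(m["n"]),
            "pid": int(m["pid"]),
            "process": m["name"],
            "frame": int(m["frame"]),
            "verdict": m["verdict"],
        }
        for m in RECOVERY_RE.finditer(text)
    ]
    return {
        "power_state": power["state"] if power else None,
        "recoveries": recoveries,
    }


if __name__ == "__main__":
    summary = summarize_rgx_dump(SAMPLE)
    print("power state:", summary["power_state"])
    for r in summary["recoveries"]:
        print(f"recovery {r['n']}: {r['process']} (pid {r['pid']}), "
              f"frame {r['frame']}, {r['verdict']}")
```

In this dump all three recovery entries are attributed to PID 2413 (homescreen), which is the same LWP as thread 1 in the gdb listing.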
We were able to revert to Flutter version 3.3.7 and get flutter-auto working with our application. We have been testing the stability of our application running with flutter-auto over the last several weeks. Unfortunately, we are seeing that the application sometimes stops updating the screen (the animation on the home screen stops, and there are no screen updates on user input). The application is still running and still communicates with our back-end, so it seems that only the screen updating is broken. There are no crash reports or debug messages printed when this happens, so we are unsure whether the issue is caused by flutter-auto or by our application. Restarting the application resolves the issue for a while, but after a few days the same thing happens again (the screen freezes/stops updating). In parallel, we also have some devices running our application with the flutter-pi embedder, and on these devices we have never seen this problem. Some of these devices have been running for > 2 weeks without problems.
Are there any known stability issues with flutter-auto that could explain this behavior? We typically only see this happening after the application has been running for a few days.
Originally posted by @X-Terminator in https://github.com/meta-flutter/meta-flutter/issues/295#issuecomment-1697179264