sonyxperiadev / bug_tracker

Empty repository that is used as a bugtracker for Open Devices project
52 stars 13 forks source link

[Yoshino][Lilac] Phone subsystem errors due to sensor trouble #580

Closed stefanhh0 closed 4 years ago

stefanhh0 commented 4 years ago

Platform: Yoshino Device: Lilac Kernel version: 4.14.176-g19852ba16481 Android version: android-10.0.0_r36 Software binaries version: SW_binaries_for_Xperia_Android_10.0.7.1_r1_v6_yoshino.img Version baseband: 1307-7511_47.2.A.11.228 Build: aosp_g8441-userdebug 10 QQ2A.200501.001.B3 eng.stefan.20200515.182739

Previously working on Kernel version: 4.9.220-668039-g4f05f27b3aa9 Kernel compiler: gcc Android version: android-10.0.0_r36 Software binaries version: SW_binaries_for_Xperia_Android_9.0_2.3.2_v9_yoshino.img Version baseband: 1307-7511_47.2.A.11.228 Build: aosp_g8441-userdebug 10 QQ2A.200405.005 eng.stefan.20200501.025616

Description Booting up the phone leads sometimes phone subsystem issues. It is a similar behavior I have seen several times before and was in the past often due to sensor trouble.

grep -w "E" logcat.log |grep -i sensor

05-13 01:58:22.729   610  1362 E         : IGNORED register_direct_channel(mem=0x0, handle=-1) call to non-API-compliant sensor
05-13 01:58:23.392   994   994 E DisplayPowerController: failed to set up display white-balance: java.lang.IllegalStateException: cannot find light sensor
05-13 01:58:23.611   594   969 E libsensor1: wait_for_service: Service 0 is not available (5000 ms)
05-13 01:58:23.612   594   969 E mm-camera: <STATS_AIS ><ERROR> 65: dsps_proc_init: Failed to open sensor1 port
05-13 01:58:23.612   594   969 E mm-camera: <STATS_AIS ><ERROR> 1140: gyro_port_init_sensors: DSPS fail to init
05-13 01:58:23.617   594  1414 I mm-camera: <MCT   >< INFO> 5039: mct_pipeline_stop_session_thread: Stop module name: sensor - E
05-13 01:58:23.629   594  1423 I mm-camera: <MCT   >< INFO> 5227: mct_pipeline_start_session_thread: E sensor
05-13 01:58:23.682   594  1425 E mm-camera: <SENSOR><ERROR> 247: module_sensor_offload_init_config: func_tbl for submodule 4 is NULL
05-13 01:58:23.682   594  1425 E mm-camera: <SENSOR><ERROR> 249: module_sensor_offload_init_config: failed to get led off frame skip time
05-13 01:58:28.312   758   758 E Sensors : sns_main.c(447):Timeout waiting for SMGR service. Exit sensors daemon!
05-13 01:58:28.313   758   758 E Sensors : sns_main.c(561):Exiting! Err 0
05-13 01:58:28.343  2117  2120 E Sensors : sns_reg_la.c(121):read error, err:5, attempted:52, read:0
05-13 01:58:28.343  2117  2120 E Sensors : sns_reg.c(103):sns_reg_read failed; err 5
05-13 01:58:28.387  2117  2117 E Sensors : sns_debug_main.c(283):Debug Config File missing in EFS!
05-13 01:58:28.498   594  1411 E mm-camera: <STATS_AIS ><ERROR> 1844: dsps_sensor1_callback: Invalid Message Type
05-13 01:58:28.502   594  2127 I mm-camera: <MCT   >< INFO> 5039: mct_pipeline_stop_session_thread: Stop module name: sensor - E
05-13 01:58:28.509   594  2124 E libsensor1: resp_msg_cb: QMI transport error -1 from svc 0 on 0xf2e23580
05-13 01:58:29.074  1627  1627 E AndroidRuntime: java.lang.RuntimeException: Unable to start receiver com.sony.transmitpower.sensor.FeatureIntentReceiver: java.lang.IllegalStateException: Sensor is null.
05-13 01:58:29.074  1627  1627 E AndroidRuntime: Caused by: java.lang.IllegalStateException: Sensor is null.
05-13 01:58:29.074  1627  1627 E AndroidRuntime:        at com.sony.transmitpower.sensor.SensorBase.init(SensorBase.java:48)
05-13 01:58:29.074  1627  1627 E AndroidRuntime:        at com.sony.transmitpower.sensor.Accelerometer.init(Accelerometer.java:68)
05-13 01:58:29.074  1627  1627 E AndroidRuntime:        at com.sony.transmitpower.sensor.SensorFeature.init(SensorFeature.java:43)
05-13 01:58:29.074  1627  1627 E AndroidRuntime:        at com.sony.transmitpower.sensor.FeatureIntentReceiver.onReceive(FeatureIntentReceiver.java:35) 
[...]

Symptoms Sometimes not always when booting the phone things like following happen:

Other possibly related symptoms I see with k4.14:

How to reproduce Boot the phone, it seems to be helpful to not immediately connect the USB cable, but wait with that until the SIM resp. device PIN has been entered. It may take two or three boot attempts to actually reproduce the issue.

Additional context dmesg.log logcat.log

Found following files in /sys/fs/pstore:

-r--r-----  1 system log  176612 1970-02-10 01:10 console-ramoops-0
-r--r--r--  1 root   root  96991 2020-05-13 01:59 dmesg-ramoops-0
-r--r--r--  1 root   root  96901 2020-05-13 01:59 dmesg-ramoops-1
-r--r-----  1 system log  116488 1970-02-10 01:10 pmsg-ramoops-0

console-ramoops-0.log dmesg-ramoops-0.log dmesg-ramoops-1.log pmsg-ramoops-0.log

The files look similar to those I have already posted here some hours ago: https://github.com/sonyxperiadev/bug_tracker/issues/399#issuecomment-627102932

MarijnS95 commented 4 years ago

Err, isn't this exactly the same as https://github.com/sonyxperiadev/bug_tracker/issues/572?

All I can see is this, can you get a log and confirm whether it is the same issue with the missing file?

05-13 01:58:22.614   610   610 I chatty  : uid=1000(system) sensors@1.0-ser expire 103 lines
05-13 01:58:22.681   610   610 W qti_sensors_hal: sensors_get_sensors_list: handle 9 is not registered!
stefanhh0 commented 4 years ago

From the symptoms it feels the same as #572, but I can see the file on the phone:

lilac:/ $ ls -l /vendor/etc/sensors/sensors_settings
-rw-r--r-- 1 root root 2 2020-05-10 17:50 /vendor/etc/sensors/sensors_settings
lilac:/ $ cat /vendor/etc/sensors/sensors_settings
1
MarijnS95 commented 4 years ago

Looks like the sensor service restarts after some time:

05-13 01:58:28.312   758   758 E Sensors : sns_main.c(447):Timeout waiting for SMGR service. Exit sensors daemon!
05-13 01:58:28.312   758   758 W Sensors : sns_main.c(355):sns_main_exit called unlock daemon to exit
05-13 01:58:28.313   758   758 E Sensors : sns_main.c(561):Exiting! Err 0
05-13 01:58:28.336  2117  2117 I sensors.qcom: type=1400 audit(0.0:34): avc: denied { getattr } for path="/mnt/vendor/persist/sensors/sns.reg" dev="sda33" ino=40 scontext=u:r:sensors:s0 tcontext=u:object_r:unlabeled:s0 tclass=file permissive=1
05-13 01:58:28.336  2117  2117 I sensors.qcom: type=1400 audit(0.0:35): avc: denied { getattr } for path="/mnt/vendor/persist/sensors" dev="sda33" ino=12 scontext=u:r:sensors:s0 tcontext=u:object_r:unlabeled:s0 tclass=dir permissive=1
05-13 01:58:28.336  2117  2117 I sensors.qcom: type=1400 audit(0.0:36): avc: denied { read write } for name="sns.reg" dev="sda33" ino=40 scontext=u:r:sensors:s0 tcontext=u:object_r:unlabeled:s0 tclass=file permissive=1
05-13 01:58:28.337  2117  2117 I sensors.qcom: type=1400 audit(0.0:37): avc: denied { open } for path="/mnt/vendor/persist/sensors/sns.reg" dev="sda33" ino=40 scontext=u:r:sensors:s0 tcontext=u:object_r:unlabeled:s0 tclass=file permissive=1
05-13 01:58:28.341  2117  2117 I sensors.qcom: type=1400 audit(0.0:38): avc: denied { read } for name="sensors" dev="sda33" ino=12 scontext=u:r:sensors:s0 tcontext=u:object_r:unlabeled:s0 tclass=dir permissive=1
05-13 01:58:28.341  2117  2117 I sensors.qcom: type=1400 audit(0.0:39): avc: denied { open } for path="/mnt/vendor/persist/sensors" dev="sda33" ino=12 scontext=u:r:sensors:s0 tcontext=u:object_r:unlabeled:s0 tclass=dir permissive=1
05-13 01:58:28.341  2117  2117 I sensors.qcom: type=1400 audit(0.0:40): avc: denied { setattr } for name="sns.reg" dev="sda33" ino=40 scontext=u:r:sensors:s0 tcontext=u:object_r:unlabeled:s0 tclass=file permissive=1
05-13 01:58:28.343  2117  2120 E Sensors : sns_reg_la.c(121):read error, err:5, attempted:52, read:0
05-13 01:58:28.343  2117  2120 E Sensors : sns_reg.c(103):sns_reg_read failed; err 5
05-13 01:58:28.387  2117  2117 E Sensors : sns_debug_main.c(283):Debug Config File missing in EFS!
05-13 01:58:28.408  2117  2120 E QMI_FW  : xport_handle_event: QCSI Received Resume_Tx from 00000009:00000030 on FD- 4
05-13 01:58:28.410  2117  2120 E QMI_FW  : xport_handle_event: QCSI Received Resume_Tx from 00000009:00000031 on FD- 4
05-13 01:58:28.414  2117  2120 E QMI_FW  : xport_handle_event: QCSI Received Resume_Tx from 00000009:00000030 on FD- 4
05-13 01:58:28.417  2117  2120 E QMI_FW  : xport_handle_event: QCSI Received Resume_Tx from 00000009:00000031 on FD- 4
05-13 01:58:28.498   594  1411 E mm-camera: <STATS_AIS ><ERROR> 1844: dsps_sensor1_callback: Invalid Message Type
05-13 01:58:28.498   610  1360 W qti_sensors_hal: SMGRSensor_sensor1_cb: SENSOR1_MSG_TYPE_RETRY_OPEN

Just to be sure, can you re-capture the logs from boot so that I can validate this is also happening the first time around?

stefanhh0 commented 4 years ago

The logs that contain just the boot sequence until sim pin screen shows up:

dmesg.log logcat.log

MarijnS95 commented 4 years ago

Hmm it's still expired:

05-13 11:29:25.629   634   634 I chatty  : uid=1000(system) sensors@1.0-ser expire 94 lines

Are you running adb logcat before the device becomes available on USB? It works on all platforms I have here, and according to dmesg it initializes early enough, right before sensors is started. (Where, btw, I can see the same avc denials, so it's starting the same way, and likely also crashing failing in the same way. Has nothing to do with the permissive denials though :grin:)

stefanhh0 commented 4 years ago

With connected USB cable the behaviour is not that easy reproducible, thus I start adb logcat > logcat.log 2>&1, then I reboot the phone and after it is up and shows the SIM PIN screen i plugin the USB cable to retrieve the logs

MarijnS95 commented 4 years ago

The behaviour is likely identical to #572: Google rebooting the device to recovery because they think wiping it from there solves problems. I'll PR my local change to disable this, it causes nothing but unrelated trouble.

Please try again with logcat running and the cable attached right from the get-go; sensors will die regardless of that.

stefanhh0 commented 4 years ago

Here are the logs when USB cable was connected during boot and adb logcat was dumping the info during boot on my host.

dmesg.log logcat.log

I haven't seen any PR yet. So if I should pull in something / rebuild just let me know :-) Thanks, man!

stefanhh0 commented 4 years ago

I have deleted the file: /persist/sensors/sns.reg afterwards phone seems to start stable, here the logs:

dmesg.log logcat.log

stefanhh0 commented 4 years ago

After reboot, sns.reg has been regenarated but with a different content:

stefan@mars:~/android/pluto/lilac-bugs/phone_subsys_crasg/sensors$ diff sns.reg.old sns.reg.new 
Binärdateien sns.reg.old und sns.reg.new sind verschieden.
stefanhh0 commented 4 years ago

After another reboot with disconnected USB it is still stable. But with yet another reboot (USB disconnected) the issue re-occurs.

stefanhh0 commented 4 years ago

Made a fresh build: Kernel version: 4.14.176-g19852ba16481 Android version: android-10.0.0_r36 Software binaries version: SW_binaries_for_Xperia_Android_10.0.7.1_r1_v6_yoshino.img Version baseband: 1307-7511_47.2.A.11.228 Build: aosp_g8441-userdebug 10 QQ2A.200501.001.B3 eng.stefan.20200515.182739

Followed your instructions @MarijnS95:

lilac:/ # setprop persist.vendor.sensors.debug.ssc_qmi_debug true
lilac:/ # setprop persist.vendor.debug.sensors.trace true
lilac:/ # setprop persist.vendor.debug.sensors.hal v
lilac:/ # setprop persist.vendor.debug.sensors.daemon v
lilac:/ # getprop |grep persist.vendor
[persist.vendor.bluetooth.a4wp]: [true]
[persist.vendor.debug.sensors.daemon]: [v]
[persist.vendor.debug.sensors.hal]: [v]
[persist.vendor.debug.sensors.trace]: [true]
[persist.vendor.net.doxlat]: [true]
[persist.vendor.radio.adb_log_on]: [0]
[persist.vendor.radio.add_power_save]: [1]
[persist.vendor.radio.apm_sim_not_pwdn]: [1]
[persist.vendor.radio.eons.enabled]: [false]
[persist.vendor.radio.oem_socket]: [false]
[persist.vendor.radio.prop_migration_complete0]: [1]
[persist.vendor.radio.report_codec]: [1]
[persist.vendor.radio.ril_payload_on]: [0]
[persist.vendor.radio.sglte_target]: [0]
[persist.vendor.radio.vdp_on_ims_cap]: [1]
[persist.vendor.sensors.debug.ssc_qmi_debug]: [true]
[persist.vendor.timeadjust]: [1585871404]
lilac:/ # export pfolder=/persist/sensors
lilac:/ # rm -rf $pfolder
lilac:/ # mkdir -p $pfolder/registry/registry
lilac:/ # restorecon -RF $pfolder
SELinux: Loaded file_contexts
lilac:/ # chown system:system -R $pfolder
lilac:/ # reboot

The logs from the reboot: dmesg.log logcat.log

stefanhh0 commented 4 years ago

Since the phone boots have been still difficult after the restorecon(phone reboots even without party crasher), I gave it a try and recreated the sensors_settings via: lilac:/persist/sensors # echo 1 > sensors_settings

I have also unset the props and tried rebooting with and without USB connected around 6 times the phonesub system no longer crashed due to sensor error trouble.

However then on the "final" reboot I wanted to collect logs and that reboot had the issue again... Another question, what is the empty /persist/sensors/registry/registry for?

So here are the logs with sensor troubles: dmesg.log logcat.log

MarijnS95 commented 4 years ago

@stefanhh0 Those steps are to recreate sensor data. They are basically what would happen on a freshly flashed phone, when hardly anything exists: on stock scripts go to work to (needlessly) copy configuration files from /vendor/etc/sensors. The registry/registry folders are used to store that as well as calibration data, but not on every platform. Finally, a socket, version file, and list of available sensors is created.

Looks like Yoshino SSC blobs don't bite to those props; no extra debugging info is emitted, and instead it looks like it craps over sns.reg again:

1095:05-17 06:59:35.075   813   825 E Sensors : sns_reg_la.c(121):read error, err:5, attempted:52, read:0
1098:05-17 06:59:35.075   813   825 E Sensors : sns_reg.c(103):sns_reg_read failed; err 5
[...]
7310:05-17 06:59:45.317   813   813 E Sensors : sns_main.c(447):Timeout waiting for SMGR service. Exit sensors daemon!
7311:05-17 06:59:45.317   813   813 W Sensors : sns_main.c(355):sns_main_exit called unlock daemon to exit
7312:05-17 06:59:45.317   813   813 E Sensors : sns_main.c(561):Exiting! Err 0

Do you have pstore for those reboots? With RescueParty disabled it must be something else - something not immediately obvious from the logs - perhaps not initiated by anything android/dmesg can see.

stefanhh0 commented 4 years ago

Okay, then I assume yoshino saves no calibration data? Something you can't figure out because it is in the closed source blobs?

Too bad that we don't get extra debug, maybe @kholk has an idea how to pull out more infos from yoshino?

Not too difficult to get the files. However when it crashes, it always activates the vibrator twice, not sure if that means a double crash, no one ever confirmed it is double crashing as far as I remember, maybe not easy to say what it means? (see as well https://github.com/sonyxperiadev/bug_tracker/issues/399#issuecomment-627102932)

The files: console-ramoops-0.log dmesg-ramoops-0.log dmesg-ramoops-1.log pmsg-ramoops-0.log

Content wise these files look pretty similar to those I have already posted here, especially the gsi_stop_channel [...] ret_from_fork has been in the files several times. Well maybe not quite, have also seen a new one: dev_disable_lro [...] el0_svc_naked crash, is that helpful for you? Would be nice!

In https://github.com/sonyxperiadev/bug_tracker/files/4613409/dmesg-ramoops-0.log I can find another call trace: __local_bh_enable_ip [...] ret_from_fork that is not in the files posted in this issue.

Here a list of unique traces I have found in the files:

https://github.com/sonyxperiadev/bug_tracker/issues/580#issuecomment-630562703 (this comment) https://github.com/sonyxperiadev/bug_tracker/files/4647947/dmesg-ramoops-0.log

<6>[  162.952494] Call trace:
<6>[  162.952496]  dev_disable_lro+0x12c/0x15c
<6>[  162.952500]  devinet_sysctl_forward+0x14c/0x1ac
<6>[  162.952503]  proc_sys_call_handler+0x128/0x1b8
<6>[  162.952504]  proc_sys_write+0x10/0x18
<6>[  162.952507]  __vfs_write+0x44/0x134
<6>[  162.952508]  vfs_write+0xc8/0x184
<6>[  162.952510]  SyS_write+0x6c/0xcc
<6>[  162.952513]  el0_svc_naked+0x34/0x38
<4>[  162.952514] ---[ end trace 6ca7d8fe0f4d12e7 ]---

https://github.com/sonyxperiadev/bug_tracker/issues/580#issuecomment-630562703 (this comment) https://github.com/sonyxperiadev/bug_tracker/files/4647947/dmesg-ramoops-0.log

<6>[  204.427586] Call trace:
<6>[  204.427641]  gsi_stop_channel+0x478/0x498
<6>[  204.427697]  __ipa3_stop_gsi_channel+0xe8/0x328
<6>[  204.427798]  ipa3_stop_gsi_channel+0x1a8/0x354
<6>[  204.427854]  ipa3_teardown_sys_pipe+0x1e0/0x10f0
<6>[  204.427911]  ipa3_wwan_remove+0x98/0x540
<6>[  204.428020]  platform_drv_remove+0x24/0x4c
<6>[  204.428076]  device_release_driver_internal+0x130/0x1d4
<6>[  204.428189]  driver_detach+0x68/0xa4
<6>[  204.428248]  bus_remove_driver+0x74/0xa4
<6>[  204.428309]  driver_unregister+0x2c/0x4c
<6>[  204.428469]  platform_driver_unregister+0x10/0x18
<6>[  204.428648]  ipa3_ssr_notifier_cb+0x114/0x50c
<6>[  204.428817]  srcu_notifier_call_chain+0x5c/0xa0
<6>[  204.429113]  subsys_notif_queue_notification+0x28/0x30
<6>[  204.429279]  notify_each_subsys_device+0x268/0x3b0
<6>[  204.429575]  subsystem_restart_wq_func+0x214/0x704
<6>[  204.429743]  process_one_work+0x1fc/0x460
<6>[  204.429908]  worker_thread+0x2e8/0x44c
<6>[  204.430204]  kthread+0x118/0x128
<6>[  204.430367]  ret_from_fork+0x10/0x18
<0>[  204.430534] Code: 910b2442 52815d83 aa1303e4 97ca1a74 (d4210000) 
<4>[  204.430826] ---[ end trace 6ca7d8fe0f4d12eb ]---

https://github.com/sonyxperiadev/bug_tracker/issues/580#issue-617054076 (initial comment in this issue) https://github.com/sonyxperiadev/bug_tracker/files/4619162/dmesg-ramoops-0.log

<6>[  102.753512] Call trace:
<6>[  102.753515]  dev_watchdog+0x298/0x2b0
<6>[  102.753521]  call_timer_fn+0xc0/0x1b8
<6>[  102.753524]  run_timer_softirq+0x248/0x870
<6>[  102.753528]  __do_softirq+0x1ec/0x3a0
<6>[  102.753531]  irq_exit+0xf4/0x100
<6>[  102.753534]  handle_IPI+0x18c/0x2a8
<6>[  102.753535]  gic_handle_irq+0x128/0x1b8
<6>[  102.753539]  el1_irq+0xf0/0x194
<6>[  102.753541]  0xffffffc0f8567d20
<6>[  102.753547]  lpm_cpuidle_enter+0x4fc/0x538
<6>[  102.753549]  cpuidle_enter_state+0x1c0/0x338
<6>[  102.753551]  cpuidle_enter+0x18/0x20
<6>[  102.753555]  do_idle+0x1a0/0x288
<6>[  102.753557]  cpu_startup_entry+0x20/0x24
<6>[  102.753559]  secondary_start_kernel+0x114/0x120
<4>[  102.753561] ---[ end trace 56e63d8ba6777d29 ]---

https://github.com/sonyxperiadev/bug_tracker/issues/399#issuecomment-627102932 https://github.com/sonyxperiadev/bug_tracker/files/4613409/dmesg-ramoops-0.log

<6>[  219.744127] Call trace:
<6>[  219.744130]  __local_bh_enable_ip+0x98/0x108
<6>[  219.744133]  ipa3_send+0x68c/0xa40
<6>[  219.744135]  ipa3_tx_dp+0x80c/0x9d0
<6>[  219.744137]  ipa3_wwan_xmit+0x2d4/0x4a4
<6>[  219.744140]  dev_hard_start_xmit+0x144/0x294
<6>[  219.744143]  sch_direct_xmit+0x160/0x1a8
<6>[  219.744145]  __qdisc_run+0x3b4/0x4c0
<6>[  219.744148]  net_tx_action+0x218/0x244
<6>[  219.744150]  __do_softirq+0x1ec/0x3a0
<6>[  219.744151]  __local_bh_enable_ip+0xfc/0x108
<6>[  219.744153]  _raw_spin_unlock_bh+0x20/0x28
<6>[  219.744155]  ipa3_wake_tx_queue+0xd4/0xe0
<6>[  219.744161]  process_one_work+0x1fc/0x460
<6>[  219.744163]  worker_thread+0x2e8/0x44c
<6>[  219.744165]  kthread+0x118/0x128
<6>[  219.744168]  ret_from_fork+0x10/0x18
<4>[  219.744169] ---[ end trace 14829c9ffc0e9104 ]---

https://github.com/sonyxperiadev/bug_tracker/issues/399#issuecomment-627102932 https://github.com/sonyxperiadev/bug_tracker/files/4613409/dmesg-ramoops-0.log

<6>[  246.588872] Call trace:
<6>[  246.588944]  gsi_stop_channel+0x478/0x498
<6>[  246.589017]  __ipa3_stop_gsi_channel+0xe8/0x328
<6>[  246.589144]  ipa3_stop_gsi_channel+0x1a8/0x354
<6>[  246.589215]  ipa3_teardown_sys_pipe+0x1e0/0x10f0
<6>[  246.589288]  ipa3_wwan_remove+0x98/0x540
<6>[  246.589416]  platform_drv_remove+0x24/0x4c
<6>[  246.589486]  device_release_driver_internal+0x130/0x1d4
<6>[  246.589612]  driver_detach+0x68/0xa4
<6>[  246.589682]  bus_remove_driver+0x74/0xa4
<6>[  246.589752]  driver_unregister+0x2c/0x4c
<6>[  246.589878]  platform_driver_unregister+0x10/0x18
<6>[  246.589949]  ipa3_ssr_notifier_cb+0x114/0x50c
<6>[  246.590023]  srcu_notifier_call_chain+0x5c/0xa0
<6>[  246.590150]  subsys_notif_queue_notification+0x28/0x30
<6>[  246.590218]  notify_each_subsys_device+0x268/0x3b0
<6>[  246.590342]  subsystem_put+0xac/0x288
<6>[  246.590412]  subsys_device_close+0x78/0x9c
<6>[  246.590484]  __fput+0xbc/0x1b8
<6>[  246.590610]  ____fput+0xc/0x14
<6>[  246.590680]  task_work_run+0x34/0xa8
<6>[  246.590751]  do_notify_resume+0x78/0x1b88
<6>[  246.590876]  work_pending+0x8/0x10
<0>[  246.590949] Code: 910b2442 52815d83 aa1303e4 97ca1a74 (d4210000) 
<4>[  246.591018] ---[ end trace 14829c9ffc0e9106 ]---
stefanhh0 commented 4 years ago

Some more unique call traces: https://github.com/sonyxperiadev/bug_tracker/issues/402#issuecomment-630585732 https://github.com/sonyxperiadev/bug_tracker/files/4648341/dmesg-ramoops-0.log

<6>[  107.107814] Call trace:
<6>[  107.107818]  cpufreq_interactive_update+0x6f0/0x724
<6>[  107.107821]  cpufreq_interactive_idle_notifier+0x70/0x88
<6>[  107.107826]  atomic_notifier_call_chain+0x4c/0x84
<6>[  107.107831]  idle_notifier_call_chain+0x1c/0x24
<6>[  107.107834]  arch_cpu_idle_exit+0x10/0x18
<6>[  107.107838]  do_idle+0x238/0x288
<6>[  107.107841]  cpu_startup_entry+0x20/0x24
<6>[  107.107845]  rest_init+0xd0/0xdc
<6>[  107.107850]  start_kernel+0x3f4/0x464
<4>[  107.107852] ---[ end trace 28f6ceda08526004 ]---

https://github.com/sonyxperiadev/bug_tracker/issues/402#issuecomment-630585732
https://github.com/sonyxperiadev/bug_tracker/files/4648341/dmesg-ramoops-0.log
<6>[  151.712419] Call trace:
<6>[  151.712422]  dev_watchdog+0x298/0x2b0
<6>[  151.712426]  call_timer_fn+0xc0/0x1b8
<6>[  151.712428]  run_timer_softirq+0x248/0x870
<6>[  151.712430]  __do_softirq+0x1ec/0x3a0
<6>[  151.712433]  irq_exit+0xf4/0x100
<6>[  151.712435]  __handle_domain_irq+0x84/0xbc
<6>[  151.712437]  gic_handle_irq+0x168/0x1b8
<6>[  151.712438]  el1_irq+0xf0/0x194
<6>[  151.712439]  0xffffffc091f87d20
<6>[  151.712443]  SyS_epoll_pwait+0xc/0x16c
<4>[  151.712444] ---[ end trace 28f6ceda08526005 ]---
kholk commented 4 years ago

We're getting modem panic. Perfect. Totally sick.

Okay, I may have an idea or two... I will try to reproduce this issue ASAP.

MarijnS95 commented 4 years ago

Okay, then I assume yoshino saves no calibration data? Something you can't figure out because it is in the closed source blobs?

It depends on the platform as far as I'm aware, and there's no harm whether it does or doesn't store intermittent data in /persist.

Too bad that we don't get extra debug, maybe @kholk has an idea how to pull out more infos from yoshino?

strings can usually tell you all the properties an application listens for, but I assumed they were all on a similar libssc.so base (where IIRC most of the stuff originates from). I'm not close to Yoshino blobs right now so can't check.

Can you provide some statistics on the sns.reg file? I would like to know size, if file can tell us something interesting, perhaps a hexdump of the top. On a working Ganges, which uses the same stack iirc (or at least much more similar than Tama and beyond) it's random gobbledygook (eg. custom format). The thing might have just corrupted it while writing, evident after a reboot.

Not too difficult to get the files. However when it crashes, it always activates the vibrator twice, not sure if that means a double crash, no one ever confirmed it is double crashing as far as I remember, maybe not easy to say what it means?

Not that I remember, afaik it always double-buzzes for me. A double crash would be where the display at least goes past the bootloader, into the kernel (seeing the "your device is unlocked" warning) before restarting again.

Content wise these files look pretty similar to those I have already posted here, especially the gsi_stop_channel [...] ret_from_fork has been in the files several times. Well maybe not quite, have also seen a new one: dev_disable_lro [...] el0_svc_naked crash, is that helpful for you? Would be nice!

WARNING traces may or may not be interesting (bad drivers mucking about), the BUGs are always interesting as they panic the device, such as the last trace seen in a dmesg from pstore:

<2>[  204.424135] kernel BUG at /home/stefan/android/source/kernel/sony/msm-4.14/kernel/drivers/platform/msm/gsi/gsi.c:2798!

Which is the gsi_stop_channel one. In this case, context before might even be more important:

<3>[  203.405614] Fatal error on the modem.

Which is exactly what @kholk aready uncovered :grin: Unfortunately that's his area of expertise (as is 99.9999% of the kernel) so I can't really help out here. Good to know that sensors aren't the only cause, but it would be stupid if a crash causes sensors to corrupt its file repeatedly; that doesn't generally happen to any other files.

That said certain warnings or other log messages may be the paper trail to a bug in case we can't immediately work out what it is. That's why seeing the entire log, or at least the trace with a bunch of lines before it is vitally important :) As for standalone WARNs, we may have a thing or two to fix.

stefanhh0 commented 4 years ago

Thanks a lot for your in-depth explanation it helps me understanding it all a lot better :-)

Here the requested infos, regarding sns.reg: sns.reg.old refers to the file that I have pulled from the device before the deletion. sns.reg.new refers to the file after the first boot directly after deletion. I have used the very same files as in https://github.com/sonyxperiadev/bug_tracker/issues/580#issuecomment-629354037

stefan@mars:~/android/pluto/lilac-bugs/phone_subsys_crash/sensors$ file sns.reg.old 
sns.reg.old: data
stefan@mars:~/android/pluto/lilac-bugs/phone_subsys_crash/sensors$ file sns.reg.new 
sns.reg.new: data
hexdump diff

``` stefan@mars:~/android/pluto/lilac-bugs/phone_subsys_crash/sensors$ hexdump sns.reg.old> sns.reg.old.hex stefan@mars:~/android/pluto/lilac-bugs/phone_subsys_crash/sensors$ hexdump sns.reg.new> sns.reg.new.hex stefan@mars:~/android/pluto/lilac-bugs/phone_subsys_crash/sensors$ diff sns.reg.old.hex sns.reg.new.hex 1c1,3 < 0000000 0000 0000 0000 0000 0000 0000 0000 0000 --- > 0000000 0000 0000 0000 0000 0000 0000 0000 0001 > 0000010 0000 0001 0000 0001 0000 0000 0000 0000 > 0000020 0000 0000 0000 0000 0000 0000 0000 0000 10a13,15 > 0000300 0000 0019 10db 0000 0000 0001 0000 0000 > 0000310 0000 0000 0000 0000 0000 0000 0000 0000 > * 15c20,21 < 0000510 0000 0100 0000 0000 0000 0000 0000 0000 --- > 0000500 0000 0002 0000 0000 0000 000a 0020 0000 > 0000510 0001 0100 0000 0000 0000 0000 0000 0000 18c24,25 < 0000610 0000 0000 0000 0000 0a00 0000 0000 0000 --- > 0000600 0000 0000 0000 0000 0000 0000 0000 0001 > 0000610 0000 0001 0000 0001 0a00 0000 0000 0000 32a40,46 > 0000700 0000 0000 0000 0000 0000 0000 0000 0001 > 0000710 0000 0001 0000 0001 0000 0000 0000 0000 > 0000720 0000 0000 0000 0000 0000 0000 0000 0000 > * > 0000800 0000 0000 0000 0000 0000 0000 0000 0001 > 0000810 0000 0001 0000 0001 0000 0000 0000 0000 > 0000820 0000 0000 0000 0000 0000 0000 0000 0000 36a51,53 > 0000b00 0201 0003 0000 0000 0000 0000 0000 0000 > 0000b10 0000 0000 0000 0000 0000 0000 0000 0000 > * 42a60,61 > 0000e00 0bb8 0000 05dc 0000 05dc 0000 0bb8 0000 > 0000e10 01f4 0000 0000 3f80 126f 3a83 0000 4110 46c65,66 < 0000f10 0000 0000 0000 000f 0000 0000 0000 0000 --- > 0000f00 0fa0 0000 0258 0000 0258 0000 0fa0 0000 > 0000f10 d70a 3e23 0000 000f 0000 0000 0000 0000 49c69 < 0001000 0000 0000 0a08 0000 000f 0000 0000 0000 --- > 0001000 012c 0000 0a08 0000 000f 0000 0000 0000 52c72 < 0001100 0000 0000 0000 0000 1f21 3c74 0000 0000 --- > 0001100 da40 387f 0000 3e80 1f21 3c74 0000 0000 54a75,83 > 0001200 0001 0001 ffff ffff 0000 0000 0000 0000 > 0001210 0000 0000 0000 0000 0000 0000 0000 0000 > * > 0001230 0000 0000 0000 0000 0000 0000 ffff ffff > 0001240 0000 0000 0000 0000 0000 0000 0000 0000 > * > 0001270 0000 0000 ffff ffff 0000 0000 0000 0000 > 0001280 0000 0000 0000 0000 0000 0000 0000 0000 > * 273c302,320 < 0006210 0000 0000 0000 0000 0000 0000 0000 0000 --- > 0006210 1f00 4d45 2d00 0000 0100 3333 0001 3852 > 0006220 0002 199a 0000 4ccd 0000 4ccd 0000 9b23 > 0006230 fffc 712d 0000 04ca 0000 ba99 0002 f8ef > 0006240 fffe b4ea 0000 23b0 0002 0000 0007 6666 > 0006250 0001 000b 0000 0000 0009 0000 000b 0000 > 0006260 0007 0000 0009 0000 0005 8000 0007 0000 > 0006270 0004 0000 0006 0000 0003 0000 0005 0000 > 0006280 0002 0000 0004 6666 0001 4ccd 0003 0000 > 0006290 0001 cccd 0002 4000 0000 3333 0002 199a > 00062a0 0000 5eb8 0001 0000 0000 2666 0001 0000 > 00062b0 0000 0000 0000 0000 0000 0000 0000 0000 > * > 0006300 3fe1 0000 0013 0000 4000 0000 147b 0000 > 0006310 3333 0000 0002 0000 0003 0000 01a0 0000 > 0006320 016b 0000 0003 0000 0001 0000 0005 0000 > 0006330 0005 0000 0001 0000 0003 0000 0341 0000 > 0006340 0000 0000 0000 0000 0000 0000 0000 0000 > 0006350 00b5 0000 0000 0000 0000 0000 0000 0000 > 0006360 0000 0000 0000 0000 0000 0000 0000 0000 275,276c322 < 0006300 3fe1 0000 0013 0000 < 0006308 --- > 000637c ```

If you like to check something additionally, I have just attached the files here: sns.reg.new.bin.txt sns.reg.new.hex.txt sns.reg.old.bin.txt sns.reg.old.hex.txt

MarijnS95 commented 4 years ago

@stefanhh0 Unfortunately I can't do much with the files without knowing the entire format. The sizes are identical, meaning it reads something (or has broken code in general) that makes it seek to a wrong position in the file, failing a read as a consequence. Why that happens, I have no idea. It is unlikely that the file got corrupted due to the reboots, so it might be the software itself writing something wrong or reading (interpreting) something wrong.

Would you mind putting the original sns.reg file back in place, then:

setprop ctl.restart vendor.sensors

If that doesn't change anything, try this:

stop
sleep 1
setprop ctl.restart vendor.sensors vendor.sensors-hal-1-0
sleep 1
start

Let me know if that gets it going again.

kholk commented 4 years ago

I don't think that this is a gsi driver issue, nor a sensors issue.. What you see should be just the tip of the iceberg, as from what it appears in one of the logs that you posted, the modem itself gets rebooted because of a process (on the modem RTOS, not on the hlos) locking up.

That sensors issue that you see is probably an effect of a complete dirty panic tear down of all of the main remote processors (subsystems if you like).

MarijnS95 commented 4 years ago

@kholk Yeah on the one hand I hope the spontaneous reboots cause sensors to corrupt its file, but we reset our phones daily while devving and I have never ran into this prior. There's likely more going on inside the sensor blob, too. Seeking and writing all over the place has to be done with a bit of care.

Let's postpone debugging it though until the modem RTOS behaves again.

stefanhh0 commented 4 years ago

@MarijnS95 so I'll postpone what you have suggested in https://github.com/sonyxperiadev/bug_tracker/issues/580#issuecomment-631329356, correct? Or do you want me still do it now?

What do you mean with original sns.reg file exactly, the one I saved as sns.reg.old that may have been corrupted or just the one that is recreated/restored after deletion(sns.reg.new)?

Just downloaded the current file and diffed it:

stefan@mars:~/android/pluto/lilac-bugs/phone_subsys_crash/sensors$ diff sns.reg.current sns.reg.old
stefan@mars:~/android/pluto/lilac-bugs/phone_subsys_crash/sensors$ 
# No difference!

So you want me to restore the one that is recreated/resored when I delete the file? that would be sns.reg.new

It would be interesting that the file sns.reg.new is exactly corrupted in the same way as it has happened with sns.reg.old before? Which then again seems to be the case since sns.reg.current should be the very same file as sns.reg.new since I never restored sns.reg.old on my phone?!

Or can it be that it gets updated at some point which then wouldn't be a corruption? Maybe wrongly updated on second reboot after it has been initially created? Just guessing without any knowledge I just have the feeling I am missing here some pieces of the puzzle...

MarijnS95 commented 4 years ago

@stefanhh0 Ah right, there are currently only broken files around. We can wait until device restarts are over, or manually stop/start the service and reboot (which may be hard considering that there is a window of 60-80s before modem pulls the entire device down).

If you remove the file, reboot the device, see working sensors, then manually reboot again, do sensors still work? It might just corrupt the file regardless. You can also test this with the second block of commands from https://github.com/sonyxperiadev/bug_tracker/issues/580#issuecomment-631329356 instead of a reboot.

eMPee584 commented 4 years ago

just some quick notes regarding tools for grokking/diffing binary blobs:

bonus tools

regarding the two posted versions of sns.reg file (https://github.com/sonyxperiadev/bug_tracker/issues/580#issuecomment-631221338): binwalk does not find included files, the files mostly match with differences throughout and esp. some appended data at the end. Found out nothing about format or content though, sorry. 😅

MarijnS95 commented 4 years ago

@eMPee584 Thanks, I guess. This registry file is just a storage spot for "sensor related data" and should contain no executable code whatsoever. The only way to drill down properly is using more sophisticated tools to get an unambiguous representation of the format - and what exactly causes it to seek to and read from an invalid location. Chances are this is an issue inside one of the executables/libraries themselves which needs to be found using those tools regardless. Being able to pin-point the issue makes it much easier to forward and get resolved.

We'll do this in due time, after confirming a panic and/or hypervisor crash does not cause magic corruption (after all the binary reads and writes all across the file - updating segments in the wrong order is asking for trouble).

stefanhh0 commented 4 years ago

Kernel version: 4.14.176-g81f30b888640¹ Android version: android-10.0.0_r36 Software binaries version: SW_binaries_for_Xperia_Android_10.0.7.1_r1_v7a_yoshino.img Version baseband: 1307-7511_47.2.A.11.228 Build: aosp_g8441-userdebug 10 QQ2A.200501.001.B3 eng.stefan.20200524.123100

Description @kholk tried with your branch, I can see that the modem is still having trouble and phone also rebooted, see attached files.

<2>[ 254.487250] kernel BUG at /home/stefan/android/source/kernel/sony/msm-4.14/kernel/drivers/iommu/arm-smmu.c:1598!

Further, building the kernel showed warnings:

build_log_lilac warnings

``` /home/stefan/android/source/kernel/sony/msm-4.14/kernel/arch/arm64/boot/dts/qcom/msm8998-audio.dtsi:163.3-166.26: Warning (reg_format): /soc/qcom,msm-audio-apr/qcom,q6core-audio/sound-9335:reg: property has invalid length (32 bytes) (#address-cells == 2, #size-cells == 1) /home/stefan/android/source/kernel/sony/msm-4.14/kernel/arch/arm64/boot/dts/qcom/msm8998-audio.dtsi:280.3-283.26: Warning (reg_format): /soc/qcom,msm-audio-apr/qcom,q6core-audio/sound-tavil:reg: property has invalid length (32 bytes) (#address-cells == 2, #size-cells == 1) arch/arm64/boot/dts/qcom/msm8998-v2-yoshino-lilac-sde_generic.dtb: Warning (i2c_bus_reg): Failed prerequisite 'reg_format' arch/arm64/boot/dts/qcom/msm8998-v2-yoshino-lilac-sde_generic.dtb: Warning (spi_bus_reg): Failed prerequisite 'reg_format' /home/stefan/android/source/kernel/sony/msm-4.14/kernel/arch/arm64/boot/dts/qcom/msm8998-audio.dtsi:154.23-269.4: Warning (avoid_default_addr_size): /soc/qcom,msm-audio-apr/qcom,q6core-audio/sound-9335: Relying on default #address-cells value also defined at /home/stefan/android/source/kernel/sony/msm-4.14/kernel/arch/arm64/boot/dts/qcom/msm8998-mtp.dtsi:534.11-536.3 also defined at /home/stefan/android/source/kernel/sony/msm-4.14/kernel/arch/arm64/boot/dts/qcom/msm8998-yoshino-common.dtsi:594.11-615.3 /home/stefan/android/source/kernel/sony/msm-4.14/kernel/arch/arm64/boot/dts/qcom/msm8998-audio.dtsi:154.23-269.4: Warning (avoid_default_addr_size): /soc/qcom,msm-audio-apr/qcom,q6core-audio/sound-9335: Relying on default #size-cells value also defined at /home/stefan/android/source/kernel/sony/msm-4.14/kernel/arch/arm64/boot/dts/qcom/msm8998-mtp.dtsi:534.11-536.3 also defined at /home/stefan/android/source/kernel/sony/msm-4.14/kernel/arch/arm64/boot/dts/qcom/msm8998-yoshino-common.dtsi:594.11-615.3 /home/stefan/android/source/kernel/sony/msm-4.14/kernel/arch/arm64/boot/dts/qcom/msm8998-audio.dtsi:271.24-388.4: Warning (avoid_default_addr_size): /soc/qcom,msm-audio-apr/qcom,q6core-audio/sound-tavil: Relying on default #address-cells value /home/stefan/android/source/kernel/sony/msm-4.14/kernel/arch/arm64/boot/dts/qcom/msm8998-audio.dtsi:271.24-388.4: Warning (avoid_default_addr_size): /soc/qcom,msm-audio-apr/qcom,q6core-audio/sound-tavil: Relying on default #size-cells value arch/arm64/boot/dts/qcom/msm8998-v2-yoshino-lilac-sde_generic.dtb: Warning (avoid_unnecessary_addr_size): Failed prerequisite 'avoid_default_addr_size' /home/stefan/android/source/kernel/sony/msm-4.14/kernel/arch/arm64/boot/dts/qcom/msm8998-audio.dtsi:163.3-166.26: Warning (reg_format): /soc/qcom,msm-audio-apr/qcom,q6core-audio/sound-9335:reg: property has invalid length (32 bytes) (#address-cells == 2, #size-cells == 1) /home/stefan/android/source/kernel/sony/msm-4.14/kernel/arch/arm64/boot/dts/qcom/msm8998-audio.dtsi:280.3-283.26: Warning (reg_format): /soc/qcom,msm-audio-apr/qcom,q6core-audio/sound-tavil:reg: property has invalid length (32 bytes) (#address-cells == 2, #size-cells == 1) arch/arm64/boot/dts/qcom/msm8998-v2.1-yoshino-lilac-sde_generic.dtb: Warning (i2c_bus_reg): Failed prerequisite 'reg_format' arch/arm64/boot/dts/qcom/msm8998-v2.1-yoshino-lilac-sde_generic.dtb: Warning (spi_bus_reg): Failed prerequisite 'reg_format' /home/stefan/android/source/kernel/sony/msm-4.14/kernel/arch/arm64/boot/dts/qcom/msm8998-audio.dtsi:154.23-269.4: Warning (avoid_default_addr_size): /soc/qcom,msm-audio-apr/qcom,q6core-audio/sound-9335: Relying on default #address-cells value also defined at /home/stefan/android/source/kernel/sony/msm-4.14/kernel/arch/arm64/boot/dts/qcom/msm8998-mtp.dtsi:534.11-536.3 also defined at /home/stefan/android/source/kernel/sony/msm-4.14/kernel/arch/arm64/boot/dts/qcom/msm8998-yoshino-common.dtsi:594.11-615.3 /home/stefan/android/source/kernel/sony/msm-4.14/kernel/arch/arm64/boot/dts/qcom/msm8998-audio.dtsi:154.23-269.4: Warning (avoid_default_addr_size): /soc/qcom,msm-audio-apr/qcom,q6core-audio/sound-9335: Relying on default #size-cells value also defined at /home/stefan/android/source/kernel/sony/msm-4.14/kernel/arch/arm64/boot/dts/qcom/msm8998-mtp.dtsi:534.11-536.3 also defined at /home/stefan/android/source/kernel/sony/msm-4.14/kernel/arch/arm64/boot/dts/qcom/msm8998-yoshino-common.dtsi:594.11-615.3 /home/stefan/android/source/kernel/sony/msm-4.14/kernel/arch/arm64/boot/dts/qcom/msm8998-audio.dtsi:271.24-388.4: Warning (avoid_default_addr_size): /soc/qcom,msm-audio-apr/qcom,q6core-audio/sound-tavil: Relying on default #address-cells value /home/stefan/android/source/kernel/sony/msm-4.14/kernel/arch/arm64/boot/dts/qcom/msm8998-audio.dtsi:271.24-388.4: Warning (avoid_default_addr_size): /soc/qcom,msm-audio-apr/qcom,q6core-audio/sound-tavil: Relying on default #size-cells value arch/arm64/boot/dts/qcom/msm8998-v2.1-yoshino-lilac-sde_generic.dtb: Warning (avoid_unnecessary_addr_size): Failed prerequisite 'avoid_default_addr_size' ```

Additional context dmesg.log logcat.log

/sys/fs/pstore contents after reboot: console-ramoops-0.log dmesg-ramoops-0.log dmesg-ramoops-1.log pmsg-ramoops-0.log


¹) 81f30b888640 is not a commit that can be found on github it has been created via rebasing: https://github.com/kholk/kernel/commit/6e011ce687959bf14fe8ebb9d05727c6148f0955 onto: https://github.com/sonyxperiadev/kernel/commit/260d812922ce88b1f105db8046053d17142467e8

stefanhh0 commented 4 years ago

Deleted: /persist/sensors/sns.reg then rebooted modem came up stable as far as I could see, but shortly after another crash again:

console-ramoops-0.log dmesg-ramoops-0.log dmesg-ramoops-1.log pmsg-ramoops-0.log

stefanhh0 commented 4 years ago

Kernel version: 4.14.176-g5bd3bd453185¹ Android version: android-10.0.0_r36 Software binaries version: SW_binaries_for_Xperia_Android_10.0.7.1_r1_v7a_yoshino.img Version baseband: 1307-7511_47.2.A.11.228 Build: aosp_g8441-userdebug 10 QQ2A.200501.001.B3 eng.stefan.20200524.123100

Description With following manual changes in the kernel dts:

stefan@mars:~/android/source/kernel/sony/msm-4.14/kernel$ git diff
diff --git a/arch/arm64/boot/dts/qcom/dsi-panel-somc-lilac-sde.dtsi b/arch/arm64/boot/dts/qcom/dsi-panel-somc-lilac-sde.dtsi
index 6f6b87101e8a..ee90ad614416 100644
--- a/arch/arm64/boot/dts/qcom/dsi-panel-somc-lilac-sde.dtsi
+++ b/arch/arm64/boot/dts/qcom/dsi-panel-somc-lilac-sde.dtsi
@@ -71,3 +71,20 @@
        /* Possible panels */
        somc,dsi-panel-list = <&sde_dsi_5 &sde_dsi_8>;
 };
+&sde_dsi0 {
+       /delete-property/ qcom,cont-splash-enabled;
+};
+&mdss_mdp {
+       /delete-property/ compatible;
+       /delete-property/ reg;
+       /delete-property/ reg-names;
+};
+&mdss_dsi {
+       /delete-property/ compatible;
+};
+&mdss_dsi0 {
+       /delete-property/ label;
+};
+&mdss_dsi1 {
+       /delete-property/ label;
+};
\ No newline at end of file

The issue #578 seems to be fixed. However it still crashes always when entering the SIM PIN, in the logs sensors trouble still showing up (Sensor is null).

Also following crash can still be found:

<2>[  872.965689] kernel BUG at /home/stefan/android/source/kernel/sony/msm-4.14/kernel/drivers/iommu/arm-smmu.c:1598!
[...]
<6>[  872.968692] Call trace:
<6>[  872.968778]  arm_smmu_context_fault+0x588/0x60c
<6>[  872.968828]  irq_thread_fn+0x2c/0x70
<6>[  872.968875]  irq_thread+0x13c/0x230
<6>[  872.968962]  kthread+0x118/0x128
<6>[  872.969010]  ret_from_fork+0x10/0x18
<0>[  872.969060] Code: f9400320 d000a8c1 911d4021 940640d9 (d4210000) 

The compiler warnings seen before have been fixed.

Additional context Logs from boot: dmesg.log logcat.log

console-ramoops-0.log dmesg-ramoops-0.log dmesg-ramoops-1.log pmsg-ramoops-0.log


¹) 5bd3bd453185 is not a commit that can be found on github it has been created via rebasing: https://github.com/kholk/kernel/commit/49320473843a5f9de7bff5344bb8612aef6ff4c2 onto: https://github.com/sonyxperiadev/kernel/commit/260d812922ce88b1f105db8046053d17142467e8

stefanhh0 commented 4 years ago

Kernel version: 4.14.176-gef7fd1a4bb9f¹ Android version: android-10.0.0_r36 Software binaries version: SW_binaries_for_Xperia_Android_10.0.7.1_r1_v7a_yoshino.img Version baseband: 1307-7511_47.2.A.11.228 Build: aosp_g8441-userdebug 10 QQ2A.200501.001.B3 eng.stefan.20200524.123100

Description After cherry-picking: https://github.com/sonyxperiadev/kernel/pull/2268 several problems have been mitigated and solved but the basic sensors trouble and modem being shaky has not been solved fully.

See as well attached logs, also a crash occurred.

<2>[  125.184411] kernel BUG at /home/stefan/android/source/kernel/sony/msm-4.14/kernel/lib/genalloc.c:392!
<6>[  125.188111] Call trace:
<6>[  125.188170]  gen_pool_free+0x1d4/0x1dc
<6>[  125.188282]  __dma_free+0x74/0xbc
<6>[  125.188345]  ipa3_teardown_sys_pipe+0x948/0x10f0
<6>[  125.188406]  ipa3_wwan_remove+0x98/0x540
<6>[  125.188517]  platform_drv_remove+0x24/0x4c
<6>[  125.188579]  device_release_driver_internal+0x130/0x1d4
<6>[  125.188689]  driver_detach+0x68/0xa4
<6>[  125.188748]  bus_remove_driver+0x74/0xa4
<6>[  125.188807]  driver_unregister+0x2c/0x4c
<6>[  125.188921]  platform_driver_unregister+0x10/0x18
<6>[  125.188980]  ipa3_ssr_notifier_cb+0x114/0x50c
<6>[  125.189044]  srcu_notifier_call_chain+0x5c/0xa0
<6>[  125.189154]  subsys_notif_queue_notification+0x28/0x30
<6>[  125.189213]  notify_each_subsys_device+0x268/0x3b0
<6>[  125.189326]  subsystem_restart_wq_func+0x214/0x704
<6>[  125.189390]  process_one_work+0x1fc/0x460
<6>[  125.189451]  worker_thread+0x2e8/0x44c
<6>[  125.189560]  kthread+0x118/0x128
<6>[  125.189621]  ret_from_fork+0x10/0x18
<0>[  125.189682] Code: d4210000 14000000 d4210000 14000000 (d4210000) 

Additional context dmesg.log logcat.log

/sys/fs/pstore: console-ramoops-0.log dmesg-ramoops-0.log dmesg-ramoops-1.log pmsg-ramoops-0.log


¹) ef7fd1a4bb9f is not a commit that can be found on github it has been created via cherry-picking 86 commits from: https://github.com/sonyxperiadev/kernel/pull/2268 onto: https://github.com/sonyxperiadev/kernel/commit/260d812922ce88b1f105db8046053d17142467e8

stefanhh0 commented 4 years ago

Kernel version: 4.14.184-24205-g2584abc5fa70¹ Android version: android-10.0.0_r39 Software binaries version: SW_binaries_for_Xperia_Android_10.0.7.1_r1_v8a_yoshino.img Version baseband: 1307-7511_47.2.A.11.228 Build: aosp_g8441-userdebug 10 QQ3A.200605.002.A1 eng.stefan.20200614.061225²

Description Another crash occurred, shaky modem, then after some time the phone crashed. Since I found something I haven't seen before I am reporting it:

Maybe important: I have forgotten to call the command, restorecon -RF /persist after having flashed (meanwhile done it).

console-ramoops-0.log:

[  321.959246] Fatal error on the modem.
[  321.959333] modem subsystem failure reason: dog_hb.c:266:DOG_HB detects starvation of task 0xbced5ee0, triage with its own.
[  322.720067] ------------[ cut here ]------------
[  322.720181] WARNING: CPU: 7 PID: 1977 at /home/stefan/android/source/kernel/sony/msm-4.14/kernel/net/sched/sch_generic.c:320 dev_watchdog+0x298/0x2b0
[  322.720457] ---[ end trace d3d09e80c69966a8 ]---
[  322.976316] gsi soc:qcom,msm_gsi: gsi_stop_channel:2796 chan=8 unexpected state=0
[  322.976820] gsi soc:qcom,msm_gsi: gsi_stop_channel:2802 MSM8998 workaround for gsi firmware: avoiding kernel panic for already not allocated CH.
[  322.979182] gsi soc:qcom,msm_gsi: gsi_reset_channel:2916 bad state 0
[  322.979305] gsi soc:qcom,msm_gsi: gsi_reset_channel:2921 MSM8998 workaround for gsi firmware: avoiding kernel panic for already not allocated CH.
[  322.979435] gsi soc:qcom,msm_gsi: gsi_dealloc_channel:3011 bad state 0
[  322.979503] gsi soc:qcom,msm_gsi: gsi_dealloc_channel:3016 MSM8998 workaround for gsi firmware: avoiding kernel panic for already not allocated CH.
[  322.979742] gsi soc:qcom,msm_gsi: gsi_dealloc_evt_ring:1729 1 channels still using this event ring
[  322.979867] ------------[ cut here ]------------
[  322.979990] WARNING: CPU: 2 PID: 540 at /home/stefan/android/source/kernel/sony/msm-4.14/kernel/drivers/platform/msm/ipa/ipa_v3/ipa_dp.c:1446 ipa3_teardown_sys_pipe+0x10e4/0x10f0
[  322.980541] ---[ end trace d3d09e80c69966a9 ]---
[  322.980687] ipa-wan ipa3_wwan_remove:2654 Failed to teardown IPA->APPS pipe
[  323.984298] gsi soc:qcom,msm_gsi: gsi_stop_channel:2796 chan=5 unexpected state=0
[  323.984402] gsi soc:qcom,msm_gsi: gsi_stop_channel:2802 MSM8998 workaround for gsi firmware: avoiding kernel panic for already not allocated CH.
[  323.986086] gsi soc:qcom,msm_gsi: gsi_reset_channel:2916 bad state 0
[  323.986182] gsi soc:qcom,msm_gsi: gsi_reset_channel:2921 MSM8998 workaround for gsi firmware: avoiding kernel panic for already not allocated CH.
[  323.986281] gsi soc:qcom,msm_gsi: gsi_dealloc_channel:3011 bad state 0
[  323.986332] gsi soc:qcom,msm_gsi: gsi_dealloc_channel:3016 MSM8998 workaround for gsi firmware: avoiding kernel panic for already not allocated CH.
[  323.986619] ipa-wan __ipa_wwan_close:1049 [rmnet_ipa0]: ipa3_deregister_intf failed -22
[  323.988924] [RMNET:HI] rmnet_config_notify_cb(): Kernel is trying to unregister rmnet_ipa0
[  323.990339] [RMNET:HI] rmnet_config_notify_cb(): Kernel is trying to unregister rmnet_data0
[  324.002499] [RMNET:HI] rmnet_config_notify_cb(): Kernel is trying to unregister rmnet_ipa0
[  324.002947] ipa-wan ipa3_del_dflt_wan_rt_tables:437 Deleting Route hdl:(0x7) with ip type: 0
[  324.005051] ipa-wan ipa3_del_dflt_wan_rt_tables:446 Deleting Route hdl:(0x9) with ip type: 1
[...]
[  324.082275] [RMNET:HI] rmnet_config_notify_cb(): Kernel is trying to unregister rmnet_data7
[  324.093593] [RMNET:HI] rmnet_config_notify_cb(): Kernel is trying to unregister rmnet_data7
[  324.110523] pil-q6v5-mss 4080000.qcom,mss: Port 0000000000000000 halt timeout
[  324.210718] pil-q6v5-mss 4080000.qcom,mss: Port 0000000000000000 halt timeout
[  324.212198] ------------[ cut here ]------------
[  324.212309] WARNING: CPU: 3 PID: 540 at /home/stefan/android/source/kernel/sony/msm-4.14/kernel/drivers/clk/qcom/clk-branch.c:116 clk_branch_toggle+0x13c/0x184
[  324.212748] ---[ end trace d3d09e80c69966aa ]---
[  324.246314] Ramdump(ramdump_wcss_msa0): No consumers. Aborting..
[  324.251270] ------------[ cut here ]------------
[  324.251347] WARNING: CPU: 3 PID: 540 at /home/stefan/android/source/kernel/sony/msm-4.14/kernel/drivers/soc/qcom/smem.c:652 qcom_smem_get+0x170/0x18c
[  324.251880] ---[ end trace d3d09e80c69966ab ]---
[  325.670442] qrtr: Modem QMI Readiness RX cmd:0x2 node[0x0]
[  325.672612] rmt_storage:INFO:rmt_storage_open_cb: Processing: Open Request for /boot/modem_fs1!
[  325.673665] rmt_storage:INFO:rmt_storage_open_cb: Processing: Open Request for /boot/modem_fs2!
[  325.674770] rmt_storage:INFO:rmt_storage_open_cb: Processing: Open Request for /boot/modem_fsg!
[  325.676388] rmt_storage:INFO:rmt_storage_open_cb: Processing: Open Request for /boot/modem_fsc!
[  325.680048] ipa-wan ipa3_wwan_probe:2610 rmnet_ipa completed initialization
[  325.681990] rmt_storage:INFO:rmt_storage_alloc_buff_cb: elapsed time(ms) is 47619 >= 5000
[  325.682607] rmt_storage:INFO:rmt_storage_alloc_buff_cb: Received req_size: 2097152! avail: 2097152
[  325.685058] rmt_storage:INFO:rmt_storage_rw_iovec_cb: Read iovec request received for /boot/modem_fs1
[  325.686312] rmt_storage:INFO:rmt_storage_client_thread: Calling Read [offset=512, size=512]for /boot/modem_fs1!
[  325.686394] rmt_storage:INFO:rmt_storage_client_thread: Done Read (bytes = 512) for /boot/modem_fs1!
[  325.688398] rmt_storage:INFO:rmt_storage_rw_iovec_cb: Read iovec request received for /boot/modem_fs2
[  325.689894] rmt_storage:INFO:rmt_storage_client_thread: Calling Read [offset=512, size=512]for /boot/modem_fs2!
[  325.690033] rmt_storage:INFO:rmt_storage_client_thread: Done Read (bytes = 512) for /boot/modem_fs2!
[  325.692845] rmt_storage:INFO:rmt_storage_rw_iovec_cb: Read iovec request received for /boot/modem_fs2
[  325.694383] rmt_storage:INFO:rmt_storage_client_thread: Calling Read [offset=1024, size=2096128]for /boot/modem_fs2!
[  325.695282] rmt_storage:INFO:rmt_storage_client_thread: Done Read (bytes = 2096128) for /boot/modem_fs2!
[  325.826381] diag: In diag_send_peripheral_buffering_mode, buffering flag not set for 0
[  325.828030] diag: In diag_send_peripheral_buffering_mode, buffering flag not set for 0
[  325.913095] ipa-wan ipa3_handle_indication_req:117 not send indication
[  325.964245] Fatal error on the modem.
[  325.964383] modem subsystem failure reason: ExIPC: Exception recieved tid=144 inst=0 badva=8cf34fb0.
[  325.981115] kernel BUG at /home/stefan/android/source/kernel/sony/msm-4.14/kernel/lib/genalloc.c:392!
[  325.981167] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
[  325.981251] Modules linked in:
[  325.981299] Process kworker/u17:7 (pid: 540, stack limit = 0xffffffc0ec840000)
[  325.981387] CPU: 1 PID: 540 Comm: kworker/u17:7 Tainted: G S      W       4.14.184-24205-g2584abc5fa70-dirty #1
[  325.981470] Hardware name: SoMC Lilac-ROW(MSM8998 v2.1) (DT)
[...]
[  325.984051] Call trace:
[  325.984137]  gen_pool_free+0x1d4/0x1dc
[  325.984184]  __dma_free+0x74/0xbc
[  325.984234]  ipa3_teardown_sys_pipe+0x948/0x10f0
[  325.984322]  ipa3_wwan_remove+0x98/0x540
[  325.984373]  platform_drv_remove+0x24/0x4c
[  325.984421]  device_release_driver_internal+0x130/0x1d4
[  325.984506]  driver_detach+0x68/0xa4
[  325.984554]  bus_remove_driver+0x74/0xa4
[  325.984602]  driver_unregister+0x2c/0x4c
[  325.984689]  platform_driver_unregister+0x10/0x18
[  325.984736]  ipa3_ssr_notifier_cb+0x114/0x50c
[  325.984824]  srcu_notifier_call_chain+0x5c/0xa0
[  325.984872]  subsys_notif_queue_notification+0x28/0x30
[  325.984920]  notify_each_subsys_device+0x268/0x3b0
[  325.985005]  subsystem_restart_wq_func+0x214/0x704
[  325.985054]  process_one_work+0x1fc/0x460
[  325.985140]  worker_thread+0x2e8/0x44c
[  325.985187]  kthread+0x118/0x128
[  325.985235]  ret_from_fork+0x10/0x18
[...]
[  326.028258] CPU: 3 PID: 0 Comm: swapper/3 Tainted: G S    D W       4.14.184-24205-g2584abc5fa70-dirty #1
[  326.028342] Hardware name: SoMC Lilac-ROW(MSM8998 v2.1) (DT)
[  326.028387] Call trace:
[  326.028437]  dump_backtrace+0x0/0x1b4
[  326.028521]  show_stack+0x14/0x1c
[  326.028569]  dump_stack+0xd4/0x10c
[  326.028617]  ipi_cpu_stop+0x74/0x9c
[  326.028702]  ipi_cpu_stop+0x0/0x9c
[  326.028749]  gic_handle_irq+0x128/0x1b8
[  326.028796]  el1_irq+0xf0/0x194
[  326.028843]  0xffffffc0f854fd20
[  326.028928]  lpm_cpuidle_enter+0x4fc/0x538
[  326.028975]  cpuidle_enter_state+0x1c0/0x338
[  326.029022]  cpuidle_enter+0x18/0x20
[  326.029108]  do_idle+0x1a0/0x288
[  326.029155]  cpu_startup_entry+0x20/0x24
[  326.029202]  secondary_start_kernel+0x114/0x120
[  327.045344] SMP: failed to stop secondary CPUs 0-7
[  327.045545] ipahal ipahal_print_all_regs:3184 invalid IPA HW type (11)
[  327.445665] cpr3_panic_callback: PWR_CPRH_STATUS[0x179cbaa4] = 0x0048ce15
[  327.445713] cpr3_panic_callback: APCLUS0_L2_SAW4_PMIC_STS[0x17912c18] = 0x00040338
[  327.445800] cpr3_panic_callback: PERF_CPRH_STATUS[0x179c7aa4] = 0x0060ed1d
[  327.445887] cpr3_panic_callback: APCLUS1_L2_SAW4_PMIC_STS[0x17812c18] = 0x000403b4
[  327.445939] clk_osm_panic_callback: WDOG_DOMAIN_PSTATE_STATUS_0=0x00000000
[  327.446027] clk_osm_panic_callback: WDOG_PROGRAM_COUNTER_0=0x00000000
[  327.446114] clk_osm_panic_callback: APM_STATUS_0=0xff000303
[  327.446163] clk_osm_panic_callback: WDOG_DOMAIN_PSTATE_STATUS_1=0x00000000
[  327.446251] clk_osm_panic_callback: WDOG_PROGRAM_COUNTER_1=0x00000000
[  327.446300] clk_osm_panic_callback: APM_STATUS_1=0x1f000003
[  327.446394] ipa ipa3_active_clients_panic_notifier:262 
[  327.446394] ---- Active Clients Table ----
[  327.446394] PROXY_CLK_VOTE                           1   SPECIAL
[  327.446394] IPA_CLIENT_APPS_WAN_CONS                 2   ENDPOINT
[  327.446394] ipa3_handle_rx                           1   SIMPLE
[  327.446394] FREEZE_VOTE                              1   SPECIAL
[  327.446394] TAG_PROCESS                              -11 SPECIAL
[  327.446394] 
[  327.446394] Total active clients count: 5
[  327.446394] 
[  327.446608] Kernel Offset: disabled
[  327.446654] CPU features: 0x0008200c
[  327.446699] Memory Limit: none
[  327.466744] Rebooting in 5 seconds..
[  332.466878] SMP: stopping secondary CPUs
[  333.508593] SMP: failed to stop secondary CPUs 0-7
[  333.508678] Going down for restart now

There are also other warnings/callstacks in the attached files:

[   13.661048] Unbalanced enable for IRQ 91
[   13.661112] ------------[ cut here ]------------
[   13.661204] WARNING: CPU: 2 PID: 64 at /home/stefan/android/source/kernel/sony/msm-4.14/kernel/kernel/irq/manage.c:523 enable_irq+0x90/0xe4
[   13.661289] Modules linked in:
[   13.661340] CPU: 2 PID: 64 Comm: kworker/2:1 Tainted: G S              4.14.184-24205-g2584abc5fa70-dirty #1
[...]
[   13.664065] Call trace:
[   13.664112]  enable_irq+0x90/0xe4
[   13.664157]  wled_ovp_work+0x10/0x18
[   13.664206]  process_one_work+0x1fc/0x460
[   13.664254]  worker_thread+0x2e8/0x44c
[   13.664300]  kthread+0x118/0x128
[   13.664347]  ret_from_fork+0x10/0x18
<6>[  282.930354] [ip][0x14f23a6fb][08:36:16.516297] wlan: [7307:E:HDD] __hdd_validate_adapter: 864: bad adapter magic (via __wlan_hdd_ipv4_changed)
<4>[  283.031132] HTB: quantum of class 10001 is big. Consider r2q change.
<4>[  283.056590] HTB: quantum of class 10010 is big. Consider r2q change.
<6>[  283.104288] [netmgrd][0x14f569c00][08:36:16.690230] wlan: [830:E:HDD] __hdd_validate_adapter: 864: bad adapter magic (via __wlan_hdd_ipv6_changed)
<4>[  283.108103] ------------[ cut here ]------------
<4>[  283.108167] WARNING: CPU: 0 PID: 58 at /home/stefan/android/source/kernel/sony/msm-4.14/kernel/kernel/softirq.c:173 __local_bh_enable_ip+0x98/0x108
<6>[  283.108256] Modules linked in:
<6>[  283.108263] CPU: 0 PID: 58 Comm: kworker/0:1 Tainted: G S      W       4.14.184-24205-g2584abc5fa70-dirty #1
<6>[  283.108265] Hardware name: SoMC Lilac-ROW(MSM8998 v2.1) (DT)
[...]
<6>[  283.108473] Call trace:
<6>[  283.108477]  __local_bh_enable_ip+0x98/0x108
<6>[  283.108482]  ipa3_send+0x68c/0xa40
<6>[  283.108486]  ipa3_tx_dp+0x80c/0x9d0
<6>[  283.108490]  ipa3_wwan_xmit+0x2d4/0x4a4
<6>[  283.108495]  dev_hard_start_xmit+0x144/0x294
<6>[  283.108499]  sch_direct_xmit+0x160/0x1a8
<6>[  283.108502]  __qdisc_run+0x3b4/0x4c0
<6>[  283.108505]  net_tx_action+0x218/0x244
<6>[  283.108509]  __do_softirq+0x1ec/0x3a0
<6>[  283.108511]  __local_bh_enable_ip+0xfc/0x108
<6>[  283.108515]  _raw_spin_unlock_bh+0x20/0x28
<6>[  283.108518]  ipa3_wake_tx_queue+0xd4/0xe0
<6>[  283.108522]  process_one_work+0x1fc/0x460
<6>[  283.108525]  worker_thread+0x2e8/0x44c
<6>[  283.108529]  kthread+0x118/0x128
<6>[  283.108532]  ret_from_fork+0x10/0x18
<4>[  283.108535] ---[ end trace d3d09e80c69966a7 ]---
<6>[  283.118308] [soft_irq][0x14f5ab760][08:36:16.704249] wlan: [0:E:HDD] __hdd_validate_adapter: 864: bad adapter magic (via __wlan_hdd_ipv6_changed)
<6>[  283.219167] binder: undelivered transaction 221148, process died.
<6>[  283.219574] binder: undelivered transaction 221047, process died.
<6>[  283.219579] binder: 1070:1408 transaction failed 29189/-22, size 116-0 line 3123
<6>[  283.219580] binder: undelivered transaction 221091, process died.
<6>[  283.219582] binder: undelivered transaction 221093, process died.
<6>[  283.219583] binder: undelivered transaction 221094, process died.
<6>[  283.219587] binder: send failed reply for transaction 221168 to 1070:1414
<6>[  283.219749] binder: 1070:1408 transaction failed 29189/-22, size 116-0 line 3123
<3>[  321.959246] Fatal error on the modem.
<3>[  321.959333] modem subsystem failure reason: dog_hb.c:266:DOG_HB detects starvation of task 0xbced5ee0, triage with its own.
<6>[  321.959485] subsys-restart: subsystem_restart_dev(): Restart sequence requested for modem, restart_level = RELATED.
<6>[  321.964800] [kworker/u17:7][0x17bcf865d][08:36:55.550742] wlan: [540:I:HDD] wlan_hdd_pld_uevent: 1603: pld event 1
<6>[  321.964861] [kworker/u17:7][0x17bcf8b04][08:36:55.550804] wlan: [540:I:WMI] WMI Stop
<6>[  321.964861] 
<6>[  321.965098] ipa-wan ipa3_ssr_notifier_cb:2852 IPA received MPSS BEFORE_SHUTDOWN
<6>[  321.972420] ipa-wan ipa3_wwan_remove:2650 rmnet_ipa started deinitialization
<6>[  322.720036] NETDEV WATCHDOG: rmnet_ipa0 (): transmit queue 0 timed out
<4>[  322.720067] ------------[ cut here ]------------
<4>[  322.720181] WARNING: CPU: 7 PID: 1977 at /home/stefan/android/source/kernel/sony/msm-4.14/kernel/net/sched/sch_generic.c:320 dev_watchdog+0x298/0x2b0
<6>[  322.720278] Modules linked in:
<6>[  322.720284] CPU: 7 PID: 1977 Comm: .gms.persistent Tainted: G S      W       4.14.184-24205-g2584abc5fa70-dirty #1
<6>[  322.720286] Hardware name: SoMC Lilac-ROW(MSM8998 v2.1) (DT)
[...]
6>[  322.720430] Call trace:
<6>[  322.720433]  dev_watchdog+0x298/0x2b0
<6>[  322.720438]  call_timer_fn+0xc0/0x1b8
<6>[  322.720441]  run_timer_softirq+0x248/0x870
<6>[  322.720444]  __do_softirq+0x1ec/0x3a0
<6>[  322.720447]  irq_exit+0xf4/0x100
<6>[  322.720450]  __handle_domain_irq+0x84/0xbc
<6>[  322.720452]  gic_handle_irq+0x168/0x1b8
<6>[  322.720454]  el0_irq_naked+0x4c/0x54
<6>[  322.720456]  0xffffffc094127ec0
<4>[  322.720457] ---[ end trace d3d09e80c69966a8 ]---
<3>[  322.976316] gsi soc:qcom,msm_gsi: gsi_stop_channel:2796 chan=8 unexpected state=0
<3>[  322.976820] gsi soc:qcom,msm_gsi: gsi_stop_channel:2802 MSM8998 workaround for gsi firmware: avoiding kernel panic for already not allocated CH.
<3>[  322.979182] gsi soc:qcom,msm_gsi: gsi_reset_channel:2916 bad state 0
<3>[  322.979305] gsi soc:qcom,msm_gsi: gsi_reset_channel:2921 MSM8998 workaround for gsi firmware: avoiding kernel panic for already not allocated CH.
<3>[  322.979435] gsi soc:qcom,msm_gsi: gsi_dealloc_channel:3011 bad state 0
<3>[  322.979503] gsi soc:qcom,msm_gsi: gsi_dealloc_channel:3016 MSM8998 workaround for gsi firmware: avoiding kernel panic for already not allocated CH.
<3>[  322.979742] gsi soc:qcom,msm_gsi: gsi_dealloc_evt_ring:1729 1 channels still using this event ring
<6>[  322.979810] deall evt -7
<4>[  322.979867] ------------[ cut here ]------------
<4>[  322.979990] WARNING: CPU: 2 PID: 540 at /home/stefan/android/source/kernel/sony/msm-4.14/kernel/drivers/platform/msm/ipa/ipa_v3/ipa_dp.c:1446 ipa3_teardown_sys_pipe+0x10e4/0x10f0
<6>[  322.980105] Modules linked in:
<6>[  322.980115] CPU: 2 PID: 540 Comm: kworker/u17:7 Tainted: G S      W       4.14.184-24205-g2584abc5fa70-dirty #1
<6>[  322.980117] Hardware name: SoMC Lilac-ROW(MSM8998 v2.1) (DT)
<6>[  322.980127] Workqueue: ssr_wq subsystem_restart_wq_func
<6>[  322.980133] task: ffffffc0edfad700 task.stack: ffffffc0ec840000
<6>[  322.980138] pc : ipa3_teardown_sys_pipe+0x10e4/0x10f0
<6>[  322.980142] lr : ipa3_teardown_sys_pipe+0x10e4/0x10f0
<6>[  322.980145] sp : ffffffc0ec8439f0 pstate : 60400145
[...]
<6>[  322.980445] Call trace:
<6>[  322.980450]  ipa3_teardown_sys_pipe+0x10e4/0x10f0
<6>[  322.980457]  ipa3_wwan_remove+0x98/0x540
<6>[  322.980464]  platform_drv_remove+0x24/0x4c
<6>[  322.980470]  device_release_driver_internal+0x130/0x1d4
<6>[  322.980475]  driver_detach+0x68/0xa4
<6>[  322.980480]  bus_remove_driver+0x74/0xa4
<6>[  322.980486]  driver_unregister+0x2c/0x4c
<6>[  322.980490]  platform_driver_unregister+0x10/0x18
<6>[  322.980496]  ipa3_ssr_notifier_cb+0x114/0x50c
<6>[  322.980503]  srcu_notifier_call_chain+0x5c/0xa0
<6>[  322.980508]  subsys_notif_queue_notification+0x28/0x30
<6>[  322.980513]  notify_each_subsys_device+0x268/0x3b0
<6>[  322.980517]  subsystem_restart_wq_func+0x214/0x704
<6>[  322.980523]  process_one_work+0x1fc/0x460
<6>[  322.980528]  worker_thread+0x2e8/0x44c
<6>[  322.980533]  kthread+0x118/0x128
<6>[  322.980538]  ret_from_fork+0x10/0x18
<4>[  322.980541] ---[ end trace d3d09e80c69966a9 ]---
<3>[  322.980687] ipa-wan ipa3_wwan_remove:2654 Failed to teardown IPA->APPS pipe
<3>[  323.984298] gsi soc:qcom,msm_gsi: gsi_stop_channel:2796 chan=5 unexpected state=0
<3>[  323.984402] gsi soc:qcom,msm_gsi: gsi_stop_channel:2802 MSM8998 workaround for gsi firmware: avoiding kernel panic for already not allocated CH.
<3>[  323.986086] gsi soc:qcom,msm_gsi: gsi_reset_channel:2916 bad state 0
<3>[  323.986182] gsi soc:qcom,msm_gsi: gsi_reset_channel:2921 MSM8998 workaround for gsi firmware: avoiding kernel panic for already not allocated CH.
<3>[  323.986281] gsi soc:qcom,msm_gsi: gsi_dealloc_channel:3011 bad state 0
<3>[  323.986332] gsi soc:qcom,msm_gsi: gsi_dealloc_channel:3016 MSM8998 workaround for gsi firmware: avoiding kernel panic for already not allocated CH.
<6>[  323.986537] ipa-wan ipa3_wwan_remove:2665 rmnet_ipa unregister_netdev
<3>[  323.986619] ipa-wan __ipa_wwan_close:1049 [rmnet_ipa0]: ipa3_deregister_intf failed -22
<3>[  323.988924] [RMNET:HI] rmnet_config_notify_cb(): Kernel is trying to unregister rmnet_ipa0
<3>[  323.990339] [RMNET:HI] rmnet_config_notify_cb(): Kernel is trying to unregister rmnet_data0
<3>[  324.002499] [RMNET:HI] rmnet_config_notify_cb(): Kernel is trying to unregister rmnet_ipa0
<3>[  324.002947] ipa-wan ipa3_del_dflt_wan_rt_tables:437 Deleting Route hdl:(0x7) with ip type: 0
<3>[  324.005051] ipa-wan ipa3_del_dflt_wan_rt_tables:446 Deleting Route hdl:(0x9) with ip type: 1
<6>[  324.010042] ipa-wan ipa3_wwan_remove:2689 rmnet_ipa completed deinitialization
<6>[  324.010233] ipa-wan ipa3_ssr_notifier_cb:2869 IPA BEFORE_SHUTDOWN handling is complete
<6>[  324.010247] icnss: PD service down, pd_state: 2, state: 0x324d8f: cause: Root PD crashed
<6>[  324.010419] [kworker/u16:3][0x17e26d3c9][08:36:57.596361] wlan: [250:I:HDD] wlan_hdd_pld_uevent: 1603: pld event 0
<6>[  324.010444] [kworker/u16:3][0x17e26d5be][08:36:57.596387] wlan: [250:W:IPA] wlan_ipa_uc_disable_pipes: 1043: IPA WDI Pipes are already deactivated
<6>[  324.010499] subsys-restart: subsystem_shutdown(): [kworker/u17:7:540]: Shutting down modem
<3>[  324.011400] [RMNET:HI] rmnet_config_notify_cb(): Kernel is trying to unregister rmnet_data0
<3>[  324.012328] [RMNET:HI] rmnet_config_notify_cb(): Kernel is trying to unregister rmnet_data1
<3>[  324.022192] [RMNET:HI] rmnet_config_notify_cb(): Kernel is trying to unregister rmnet_data1
<3>[  324.023132] [RMNET:HI] rmnet_config_notify_cb(): Kernel is trying to unregister rmnet_data2
<3>[  324.035358] [RMNET:HI] rmnet_config_notify_cb(): Kernel is trying to unregister rmnet_data2
<3>[  324.036267] [RMNET:HI] rmnet_config_notify_cb(): Kernel is trying to unregister rmnet_data3
<3>[  324.048084] [RMNET:HI] rmnet_config_notify_cb(): Kernel is trying to unregister rmnet_data3
<3>[  324.048934] [RMNET:HI] rmnet_config_notify_cb(): Kernel is trying to unregister rmnet_data4
<3>[  324.057195] [RMNET:HI] rmnet_config_notify_cb(): Kernel is trying to unregister rmnet_data4
<3>[  324.058540] [RMNET:HI] rmnet_config_notify_cb(): Kernel is trying to unregister rmnet_data5
<3>[  324.068470] [RMNET:HI] rmnet_config_notify_cb(): Kernel is trying to unregister rmnet_data5
<3>[  324.069555] [RMNET:HI] rmnet_config_notify_cb(): Kernel is trying to unregister rmnet_data6
<3>[  324.081564] [RMNET:HI] rmnet_config_notify_cb(): Kernel is trying to unregister rmnet_data6
<3>[  324.082275] [RMNET:HI] rmnet_config_notify_cb(): Kernel is trying to unregister rmnet_data7
<3>[  324.093593] [RMNET:HI] rmnet_config_notify_cb(): Kernel is trying to unregister rmnet_data7
<4>[  324.110523] pil-q6v5-mss 4080000.qcom,mss: Port 0000000000000000 halt timeout
<4>[  324.210718] pil-q6v5-mss 4080000.qcom,mss: Port 0000000000000000 halt timeout
<6>[  324.212160] gcc_mss_mnoc_bimc_axi_clk
<6>[  324.212164]                 CBCR: 0x00000000
<6>[  324.212167] gcc_mss_mnoc_bimc_axi_clk: status stuck at 'on'
<4>[  324.212198] ------------[ cut here ]------------
<4>[  324.212309] WARNING: CPU: 3 PID: 540 at /home/stefan/android/source/kernel/sony/msm-4.14/kernel/drivers/clk/qcom/clk-branch.c:116 clk_branch_toggle+0x13c/0x184
<6>[  324.212420] Modules linked in:
<6>[  324.212429] CPU: 3 PID: 540 Comm: kworker/u17:7 Tainted: G S      W       4.14.184-24205-g2584abc5fa70-dirty #1
<6>[  324.212431] Hardware name: SoMC Lilac-ROW(MSM8998 v2.1) (DT)
<6>[  324.212440] Workqueue: ssr_wq subsystem_restart_wq_func
<6>[  324.212444] task: ffffffc0edfad700 task.stack: ffffffc0ec840000
<6>[  324.212448] pc : clk_branch_toggle+0x13c/0x184
<6>[  324.212451] lr : clk_branch_toggle+0x138/0x184
<6>[  324.212453] sp : ffffffc0ec843a70 pstate : 604001c5
[...]
<6>[  324.212693] Call trace:
<6>[  324.212697]  clk_branch_toggle+0x13c/0x184
<6>[  324.212701]  clk_branch2_disable+0x18/0x20
<6>[  324.212705]  clk_core_disable+0xf4/0x1e0
<6>[  324.212709]  clk_disable+0x94/0x114
<6>[  324.212714]  pil_mss_disable_clks+0x1c/0x9c
<6>[  324.212717]  pil_mss_shutdown+0x278/0x2d8
<6>[  324.212720]  pil_shutdown+0x3c/0x178
<6>[  324.212723]  modem_shutdown+0x88/0x9c
<6>[  324.212727]  subsystem_restart_wq_func+0x278/0x704
<6>[  324.212732]  process_one_work+0x1fc/0x460
<6>[  324.212736]  worker_thread+0x2e8/0x44c
<6>[  324.212740]  kthread+0x118/0x128
<6>[  324.212745]  ret_from_fork+0x10/0x18
<4>[  324.212748] ---[ end trace d3d09e80c69966aa ]---
<6>[  324.232273] icnss: WLAN FW Service Disconnected: 0x324f8b
<6>[  324.232280] icnss_qmi: QMI Service Disconnected: 0x324f8b
<6>[  324.232899] service-notifier: service_notifier_del_server: Connection lost between QMI handle and 180 service
<6>[  324.242770] sysmon-qmi: ssctl_del_server: Connection lost between QMI handle and modem's SSCTL service
<6>[  324.246306] icnss: Collecting msa0 segment dump
<3>[  324.246314] Ramdump(ramdump_wcss_msa0): No consumers. Aborting..
<6>[  324.250185] ipa-wan ipa3_ssr_notifier_cb:2872 IPA Received MPSS AFTER_SHUTDOWN
<6>[  324.251205] ipa-wan ipa3_ssr_notifier_cb:2880 IPA AFTER_SHUTDOWN handling is complete
<4>[  324.251270] ------------[ cut here ]------------
<4>[  324.251347] WARNING: CPU: 3 PID: 540 at /home/stefan/android/source/kernel/sony/msm-4.14/kernel/drivers/soc/qcom/smem.c:652 qcom_smem_get+0x170/0x18c
<6>[  324.251510] Modules linked in:
<6>[  324.251519] CPU: 3 PID: 540 Comm: kworker/u17:7 Tainted: G S      W       4.14.184-24205-g2584abc5fa70-dirty #1
<6>[  324.251521] Hardware name: SoMC Lilac-ROW(MSM8998 v2.1) (DT)
<6>[  324.251531] Workqueue: ssr_wq subsystem_restart_wq_func
<6>[  324.251535] task: ffffffc0edfad700 task.stack: ffffffc0ec840000
<6>[  324.251540] pc : qcom_smem_get+0x170/0x18c
<6>[  324.251545] lr : microdump_modem_notifier_nb+0x70/0x110
<6>[  324.251547] sp : ffffffc0ec843b40 pstate : 80400145
<6>[  324.251830] Call trace:
<6>[  324.251835]  qcom_smem_get+0x170/0x18c
<6>[  324.251840]  microdump_modem_notifier_nb+0x70/0x110
<6>[  324.251846]  srcu_notifier_call_chain+0x5c/0xa0
<6>[  324.251850]  subsys_notif_queue_notification+0x28/0x30
<6>[  324.251854]  notify_each_subsys_device+0x268/0x3b0
<6>[  324.251858]  subsystem_restart_wq_func+0x304/0x704
<6>[  324.251864]  process_one_work+0x1fc/0x460
<6>[  324.251868]  worker_thread+0x2e8/0x44c
<6>[  324.251872]  kthread+0x118/0x128
<6>[  324.251877]  ret_from_fork+0x10/0x18
<4>[  324.251880] ---[ end trace d3d09e80c69966ab ]---
<6>[  324.251943] microdump_modem_notifier_nb: smem 611 not available
<6>[  324.253499] ipa-wan ipa3_ssr_notifier_cb:2883 IPA received MPSS BEFORE_POWERUP
<6>[  324.254131] ipa-wan ipa3_ssr_notifier_cb:2890 IPA BEFORE_POWERUP handling is complete
<6>[  324.254140] subsys-restart: subsystem_powerup(): [kworker/u17:7:540]: Powering up modem
<14>[  324.258316] ueventd: firmware: loading 'modem.mdt' for '/devices/platform/soc/4080000.qcom,mss/firmware/modem.mdt'
<6>[  324.258945] pil-q6v5-mss 4080000.qcom,mss: modem: loading from 0x000000008cc00000 to 0x0000000093c00000
<14>[  324.260081] ueventd: loading /devices/platform/soc/4080000.qcom,mss/firmware/modem.mdt took 1ms
<6>[  324.357857] pil-q6v5-mss 4080000.qcom,mss: MBA boot done
<14>[  324.584278] ueventd: firmware: loading 'modem.b02' for '/devices/platform/soc/4080000.qcom,mss/firmware/modem.b02'
<14>[  324.587873] ueventd: loading /devices/platform/soc/4080000.qcom,mss/firmware/modem.b02 took 3ms
[...]
<14>[  324.802016] ueventd: firmware: loading 'modem.b27' for '/devices/platform/soc/4080000.qcom,mss/firmware/modem.b27'
<14>[  324.805476] ueventd: loading /devices/platform/soc/4080000.qcom,mss/firmware/modem.b27 took 3ms
<6>[  324.805894] [wificond][0x17f0fe077][08:36:58.391837] wlan: [789:E:HDD] __wlan_hdd_cfg80211_scan: 487: firmware is down, scan cmd cannot be processed
<14>[  324.806942] ueventd: firmware: loading 'modem.b28' for '/devices/platform/soc/4080000.qcom,mss/firmware/modem.b28'
<14>[  324.808537] ueventd: loading /devices/platform/soc/4080000.qcom,mss/firmware/modem.b28 took 1ms
<6>[  325.536256] pil-q6v5-mss 4080000.qcom,mss: modem: Brought out of reset
<6>[  325.667868] pil-q6v5-mss 4080000.qcom,mss: Subsystem error monitoring/handling services are up
<6>[  325.667884] pil-q6v5-mss 4080000.qcom,mss: modem: Power/Clock ready interrupt received
<6>[  325.669710] ipa-wan ipa3_ssr_notifier_cb:2893 IPA received MPSS AFTER_POWERUP
<3>[  325.670442] qrtr: Modem QMI Readiness RX cmd:0x2 node[0x0]
<6>[  325.671175] rmnet_ipa3 started initialization
<6>[  325.671179] IPA SSR support = True
<6>[  325.671180] IPA is_platform_type_msm = True
<6>[  325.671182] IPA SG support = True
<6>[  325.671184] IPA Napi Enable = True
<6>[  325.671186] using default for wan-rx-desc-size = 256
<6>[  325.671690] service-notifier: service_notifier_new_server: Connection established between QMI handle and 180 service
<6>[  325.671917] sysmon-qmi: ssctl_new_server: Connection established between QMI handle and modem's SSCTL service
<11>[  325.672612] rmt_storage:INFO:rmt_storage_open_cb: Processing: Open Request for /boot/modem_fs1!
<11>[  325.673665] rmt_storage:INFO:rmt_storage_open_cb: Processing: Open Request for /boot/modem_fs2!
<11>[  325.674770] rmt_storage:INFO:rmt_storage_open_cb: Processing: Open Request for /boot/modem_fsg!
<11>[  325.676388] rmt_storage:INFO:rmt_storage_open_cb: Processing: Open Request for /boot/modem_fsc!
<3>[  325.680048] ipa-wan ipa3_wwan_probe:2610 rmnet_ipa completed initialization
<6>[  325.680950] ipa-wan ipa3_ssr_notifier_cb:2898 IPA AFTER_POWERUP handling is complete
<6>[  325.680986] subsys-restart: subsystem_restart_wq_func(): [kworker/u17:7:540]: Restart sequence for modem completed.
<11>[  325.681990] rmt_storage:INFO:rmt_storage_alloc_buff_cb: elapsed time(ms) is 47619 >= 5000
<11>[  325.682607] rmt_storage:INFO:rmt_storage_alloc_buff_cb: Received req_size: 2097152! avail: 2097152
<11>[  325.685058] rmt_storage:INFO:rmt_storage_rw_iovec_cb: Read iovec request received for /boot/modem_fs1
<11>[  325.686312] rmt_storage:INFO:rmt_storage_client_thread: Calling Read [offset=512, size=512]for /boot/modem_fs1!
<11>[  325.686394] rmt_storage:INFO:rmt_storage_client_thread: Done Read (bytes = 512) for /boot/modem_fs1!
<11>[  325.688398] rmt_storage:INFO:rmt_storage_rw_iovec_cb: Read iovec request received for /boot/modem_fs2
<11>[  325.689894] rmt_storage:INFO:rmt_storage_client_thread: Calling Read [offset=512, size=512]for /boot/modem_fs2!
<11>[  325.690033] rmt_storage:INFO:rmt_storage_client_thread: Done Read (bytes = 512) for /boot/modem_fs2!
<11>[  325.692845] rmt_storage:INFO:rmt_storage_rw_iovec_cb: Read iovec request received for /boot/modem_fs2
<11>[  325.694383] rmt_storage:INFO:rmt_storage_client_thread: Calling Read [offset=1024, size=2096128]for /boot/modem_fs2!
<11>[  325.695282] rmt_storage:INFO:rmt_storage_client_thread: Done Read (bytes = 2096128) for /boot/modem_fs2!
<6>[  325.820940] memshare_child soc:qcom,memshare:qcom,client_3: memshare_query: client_id : 1, query_resp.size :5242880, query_resp.resp.result :0
<6>[  325.821387] memshare_child soc:qcom,memshare:qcom,client_3: memshare_alloc: memory alloc request received for client id: 1, proc_id: 0, request size: 5242880
<6>[  325.821394] memshare_child soc:qcom,memshare:qcom,client_3: memshare_alloc: client_id: 1, alloc_resp.num_bytes: 5242880, alloc_resp.resp.result: 0
<3>[  325.826381] diag: In diag_send_peripheral_buffering_mode, buffering flag not set for 0
<3>[  325.828030] diag: In diag_send_peripheral_buffering_mode, buffering flag not set for 0
<6>[  325.830992] Sending QMI_IPA_INIT_MODEM_DRIVER_REQ_V01
<3>[  325.913095] ipa-wan ipa3_handle_indication_req:117 not send indication
<3>[  325.964245] Fatal error on the modem.
<3>[  325.964383] modem subsystem failure reason: ExIPC: Exception recieved tid=144 inst=0 badva=8cf34fb0.
<6>[  325.964478] subsys-restart: subsystem_restart_dev(): Restart sequence requested for modem, restart_level = RELATED.
<6>[  325.968769] ipa-wan ipa3_ssr_notifier_cb:2852 IPA received MPSS BEFORE_SHUTDOWN
<6>[  325.972818] ipa-wan ipa3_wwan_remove:2650 rmnet_ipa started deinitialization
<6>[  325.981109] ------------[ cut here ]------------
<2>[  325.981115] kernel BUG at /home/stefan/android/source/kernel/sony/msm-4.14/kernel/lib/genalloc.c:392!
<0>[  325.981167] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
<6>[  325.981251] Modules linked in:
<0>[  325.981299] Process kworker/u17:7 (pid: 540, stack limit = 0xffffffc0ec840000)
<6>[  325.981387] CPU: 1 PID: 540 Comm: kworker/u17:7 Tainted: G S      W       4.14.184-24205-g2584abc5fa70-dirty #1
<6>[  325.981470] Hardware name: SoMC Lilac-ROW(MSM8998 v2.1) (DT)
<6>[  325.981525] Workqueue: ssr_wq subsystem_restart_wq_func
<6>[  325.981611] task: ffffffc0edfad700 task.stack: ffffffc0ec840000
<6>[  325.981659] pc : gen_pool_free+0x1d4/0x1dc
<6>[  325.981705] lr : gen_pool_free+0x30/0x1dc
<6>[  325.981790] sp : ffffffc0ec843970 pstate : 00400145
[...]
<6>[  325.984051] Call trace:
<6>[  325.984137]  gen_pool_free+0x1d4/0x1dc
<6>[  325.984184]  __dma_free+0x74/0xbc
<6>[  325.984234]  ipa3_teardown_sys_pipe+0x948/0x10f0
<6>[  325.984322]  ipa3_wwan_remove+0x98/0x540
<6>[  325.984373]  platform_drv_remove+0x24/0x4c
<6>[  325.984421]  device_release_driver_internal+0x130/0x1d4
<6>[  325.984506]  driver_detach+0x68/0xa4
<6>[  325.984554]  bus_remove_driver+0x74/0xa4
<6>[  325.984602]  driver_unregister+0x2c/0x4c
<6>[  325.984689]  platform_driver_unregister+0x10/0x18
<6>[  325.984736]  ipa3_ssr_notifier_cb+0x114/0x50c
<6>[  325.984824]  srcu_notifier_call_chain+0x5c/0xa0
<6>[  325.984872]  subsys_notif_queue_notification+0x28/0x30
<6>[  325.984920]  notify_each_subsys_device+0x268/0x3b0
<6>[  325.985005]  subsystem_restart_wq_func+0x214/0x704
<6>[  325.985054]  process_one_work+0x1fc/0x460
<6>[  325.985140]  worker_thread+0x2e8/0x44c
<6>[  325.985187]  kthread+0x118/0x128
<6>[  325.985235]  ret_from_fork+0x10/0x18
<0>[  325.985322] Code: d4210000 14000000 d4210000 14000000 (d4210000) 
<4>[  325.985370] ---[ end trace d3d09e80c69966ac ]---

Additional context console-ramoops-0.log dmesg-ramoops-0.log dmesg-ramoops-1.log pmsg-ramoops-0.log


¹) Commit: 2584abc5fa70 can't be found on github.

²) SELinux, enforcing, the build includes following PR's

stefanhh0 commented 4 years ago

Kernel version: 4.14.185-24221-g074811d362f2¹ Android version: android-10.0.0_r39 Software binaries version: SW_binaries_for_Xperia_Android_10.0.7.1_r1_v8a_yoshino.img Version baseband: 1307-7511_47.2.A.11.228 Build: aosp_g8441-userdebug 10 QQ3A.200605.002.A1 eng.stefan.20200625.005443²

Another update, pulled PRs:

to see if boots are more stable. Actually it booted several times stable but then sometimes shaky modem shows up again.

After seeing quite a bunch of stable boots I also saw unstable ones with shaky modem / sensor trouble again:

Here a log where sensors / modem came up clean: logcat_good.log

Here a log where sensors / modem didn't came up clean: logcat_bad.log

After it the phone rebooted crashed and I could find some files in /sys/fs/pstore: console-ramoops-0.log dmesg-ramoops-0.log dmesg-ramoops-1.log pmsg-ramoops-0.log

The crash logs don't contain too much new from what I see, however should be double checked maybe I have overlooked something.


¹) Commit: 074811d362f2 can't be found on github.

²) SELinux: enforcing, the build includes following PR's

stefanhh0 commented 4 years ago

Will be fixed with: https://github.com/sonyxperiadev/kernel/pull/2309

stefanhh0 commented 4 years ago

Kernel version: 4.14.187-24291-ge795adf6f124¹ Android version: android-10.0.0_r39 Software binaries version: SW_binaries_for_Xperia_Android_10.0.7.1_r1_v9a_yoshino.img Version baseband: 1307-7511_47.2.A.11.228 Build: aosp_g8441-userdebug 10 QQ3A.200605.002.A1 eng.stefan.20200704.072127²

The issue is not completely fixed with sonyxperiadev/kernel#2309. However the phone was able to recover from the sensor null issue and shaky modem (going on and off) without a reboot (it was never the case before that fix). Find attached a log containing kernel and default logcat output.

Additional context logcat.log


¹) Commit: e795adf6f124 can't be found on github.

²) SELinux: enforcing, the build includes following PR's and local changes.

kholk commented 4 years ago

Please don't mix the kernel log with the logcat... It's tedious to read it like that... A lot... :p

stefanhh0 commented 4 years ago

Okay, on the next occurrence, I'll post two different again, one kernel one default.

MarijnS95 commented 4 years ago

That depends on the reader....

I prefer and have asked to provide logs like this, making it much easier to connect lines from userspace and kernel together. Not applicable to your kernel-only situation though :P

stefanhh0 commented 4 years ago

Kernel version: 4.14.187-24324-g8a84f57dd0e5¹ Android version: android-10.0.0_r40 Software binaries version: SW_binaries_for_Xperia_Android_10.0.7.1_r1_v8a_yoshino.img Version baseband: 1307-7511_47.2.A.11.228 Build: aosp_g8441-userdebug 10 QQ3A.200705.002 eng.stefan.20200707.220459²

Description I am posting a fresh set of pstore files, they are pretty similar to those posted before but also contain something that is maybe new, console-ramoops-0.log:

[    2.506103] qcom,qpnp-rtc 800f000.qcom,spmi:qcom,pm8998@0:qcom,pm8998_rtc: setting system clock to 1970-02-09 21:19:22 UTC (3446362)
[    2.507351] incell_control_mode: FINISH - incell.status:0x0
[    2.507444] clearpad clearpad: (clearpad_ctrl_session_begin:1388) begin 'post probe' session
[    2.507493] incell_power_lock: Lock is invalid during power off
[    2.507496] clearpad clearpad: (touchctrl_lock_power:1266) failed to lock power(-1) <post probe>
[    2.507782] lpm_levels: register_cluster_lpm_stats()
[    2.507903] lpm_levels: register_cluster_lpm_stats()
[    2.510579] RNDIS_IPA module is loaded.
[    2.510632] rmnet_ipa3 started initialization
[    2.512138] xo_board
[    2.512186] bi_tcxo
[    2.512273] gpll0
[    2.512320] gcc_mmss_gpll0_clk
[    2.512370]               EN_REG: 0x00000002
[    2.512417] ahb_clk_src
[    2.512506]             CMD_RCGR: 0x00000000
[    2.512555]             CFG_RCGR: 0x0000051d
[    2.512602] mmss_mnoc_ahb_clk
[    2.512691]                 CBCR: 0x00000000
[    2.512738] mmss_mnoc_ahb_clk: status stuck at 'on'
[    2.512755] ------------[ cut here ]------------
[    2.512894] WARNING: CPU: 5 PID: 1 at /home/stefan/android/source/kernel/sony/msm-4.14/kernel/drivers/clk/qcom/clk-branch.c:116 clk_branch_toggle+0x12c/0x158
[    2.512980] Modules linked in:
[    2.513031] CPU: 5 PID: 1 Comm: swapper/0 Tainted: G S              4.14.187-24324-g8a84f57dd0e5 #1
[...]
[    2.515685] Call trace:
[    2.515732]  clk_branch_toggle+0x12c/0x158
[    2.515781]  clk_branch2_disable+0x18/0x20
[    2.515870]  clk_core_disable+0xf4/0x1e0
[    2.515917]  clk_disable_unused_subtree+0xc4/0x3dc
[    2.515965]  clk_disable_unused_subtree+0x30/0x3dc
[    2.516053]  clk_disable_unused_subtree+0x30/0x3dc
[    2.516102]  clk_disable_unused_subtree+0x30/0x3dc
[    2.516190]  clk_disable_unused_subtree+0x30/0x3dc
[    2.516237]  clk_disable_unused_subtree+0x30/0x3dc
[    2.516285]  clk_disable_unused+0xb4/0x2d4
[    2.516375]  do_one_initcall+0xfc/0x1ac
[    2.516424]  kernel_init_freeable+0x14c/0x1e0
[    2.516474]  kernel_init+0x10/0x208
[    2.516561]  ret_from_fork+0x10/0x18
[    2.516608] ---[ end trace 9b7d2184534b9265 ]---

Additional context console-ramoops-0.log dmesg-ramoops-0.log dmesg-ramoops-1.log pmsg-ramoops-0.log


¹) Commit: 8a84f57dd0e5 can't be found on github.

²) SELinux: enforcing, the build includes following PR's and/or local changes.

stefanhh0 commented 4 years ago

Kernel version: 4.14.189-24379-g205a380bd16a¹ Android version: android-10.0.0_r40 Software binaries version: SW_binaries_for_Xperia_Android_10.0.7.1_r1_v9c_yoshino.img Version baseband: 1307-7511_47.2.A.11.228 Build: aosp_g8441-userdebug 10 QQ3A.200705.002 eng.stefan.20200728.091037²

An update on this issue. It re-occurred and I grabbed up-to-date logs with all the recent changes that has been done in order to stabilize yoshino. The new traces look different than the old ones and thus my hope is that there is additional information in it that helps getting this issue fixed.

<1>[  124.137314] BUG: Bad rss-counter state mm:ffffffc0ede51f80 idx:4 val:26
<3>[  125.860841] Fatal error on the modem.
<3>[  125.861043] modem subsystem failure reason: dog_hb.c:266:DOG_HB detects starvation of task 0xa0cd60c0, triage with its own.
<6>[  125.861234] subsys-restart: subsystem_restart_dev(): Restart sequence requested for modem, restart_level = RELATED.
<6>[  125.866144] ipa-wan ipa3_ssr_notifier_cb:2860 IPA received MPSS BEFORE_SHUTDOWN
<6>[  125.874136] ipa-wan ipa3_wwan_remove:2658 rmnet_ipa started deinitialization
<6>[  126.752168] NETDEV WATCHDOG: rmnet_ipa0 (): transmit queue 0 timed out
<4>[  126.752200] ------------[ cut here ]------------
<4>[  126.752287] WARNING: CPU: 4 PID: 0 at /home/stefan/android/source/kernel/sony/msm-4.14/kernel/net/sched/sch_generic.c:320 dev_watchdog+0x298/0x2b0
<6>[  126.752401] Modules linked in:
<6>[  126.752408] CPU: 4 PID: 0 Comm: swapper/4 Tainted: G S      W       4.14.189-24379-g205a380bd16a-dirty #1
<6>[  126.752409] Hardware name: SoMC Lilac-ROW(MSM8998 v2.1) (DT)
[...]
<6>[  126.752570] Call trace:
<6>[  126.752573]  dev_watchdog+0x298/0x2b0
<6>[  126.752579]  call_timer_fn+0xc0/0x24c
<6>[  126.752582]  run_timer_softirq+0x23c/0x850
<6>[  126.752585]  __do_softirq+0x1ec/0x3c4
<6>[  126.752588]  irq_exit+0xf4/0x100
<6>[  126.752591]  __handle_domain_irq+0x84/0xbc
<6>[  126.752593]  gic_handle_irq+0x168/0x1b8
<6>[  126.752595]  el1_irq+0xf0/0x194
<6>[  126.752597]  0xffffffc0f9563d20
<6>[  126.752602]  lpm_cpuidle_enter+0x4fc/0x538
<6>[  126.752605]  cpuidle_enter_state+0x1c0/0x338
<6>[  126.752608]  cpuidle_enter+0x18/0x20
<6>[  126.752611]  do_idle+0x194/0x264
<6>[  126.752613]  cpu_startup_entry+0x20/0x34
<6>[  126.752616]  secondary_start_kernel+0x114/0x120
<4>[  126.752617] ---[ end trace d43303481cd86696 ]---
<3>[  126.880169] gsi soc:qcom,msm_gsi: gsi_stop_channel:2796 chan=8 unexpected state=0
<3>[  126.880231] gsi soc:qcom,msm_gsi: gsi_stop_channel:2802 MSM8998 workaround for gsi firmware: avoiding kernel panic for already not allocated CH.
<3>[  126.882171] gsi soc:qcom,msm_gsi: gsi_reset_channel:2916 bad state 0
<3>[  126.882266] gsi soc:qcom,msm_gsi: gsi_reset_channel:2921 MSM8998 workaround for gsi firmware: avoiding kernel panic for already not allocated CH.
<3>[  126.882391] gsi soc:qcom,msm_gsi: gsi_dealloc_channel:3011 bad state 0
<3>[  126.882484] gsi soc:qcom,msm_gsi: gsi_dealloc_channel:3016 MSM8998 workaround for gsi firmware: avoiding kernel panic for already not allocated CH.
<3>[  126.882620] gsi soc:qcom,msm_gsi: gsi_dealloc_evt_ring:1729 1 channels still using this event ring
<6>[  126.882713] deall evt -7
<4>[  126.882764] ------------[ cut here ]------------
<4>[  126.882822] WARNING: CPU: 3 PID: 494 at /home/stefan/android/source/kernel/sony/msm-4.14/kernel/drivers/platform/msm/ipa/ipa_v3/ipa_dp.c:1446 ipa3_teardown_sys_pipe+0x10c4/0x10cc
[...]
<6>[  126.883167] Call trace:
<6>[  126.883171]  ipa3_teardown_sys_pipe+0x10c4/0x10cc
<6>[  126.883176]  ipa3_wwan_remove+0x98/0x540
<6>[  126.883182]  platform_drv_remove+0x24/0x4c
<6>[  126.883185]  device_release_driver_internal+0x130/0x1d4
<6>[  126.883187]  driver_detach+0x68/0xa4
<6>[  126.883191]  bus_remove_driver+0x74/0xa4
<6>[  126.883193]  driver_unregister+0x2c/0x4c
<6>[  126.883196]  platform_driver_unregister+0x10/0x18
<6>[  126.883199]  ipa3_ssr_notifier_cb+0x114/0x50c
<6>[  126.883205]  srcu_notifier_call_chain+0x5c/0xa0
<6>[  126.883207]  subsys_notif_queue_notification+0x28/0x30
<6>[  126.883210]  notify_each_subsys_device+0x25c/0x38c
<6>[  126.883213]  subsystem_restart_wq_func+0x208/0x6e0
<6>[  126.883217]  process_one_work+0x1fc/0x460
<6>[  126.883220]  worker_thread+0x2e8/0x44c
<6>[  126.883223]  kthread+0x118/0x130
<6>[  126.883227]  ret_from_fork+0x10/0x18
<4>[  126.883230] ---[ end trace d43303481cd86697 ]---
<3>[  126.883296] ipa-wan ipa3_wwan_remove:2662 Failed to teardown IPA->APPS pipe
<3>[  127.888180] gsi soc:qcom,msm_gsi: gsi_stop_channel:2796 chan=5 unexpected state=0
<3>[  127.888243] gsi soc:qcom,msm_gsi: gsi_stop_channel:2802 MSM8998 workaround for gsi firmware: avoiding kernel panic for already not allocated CH.
<3>[  127.890162] gsi soc:qcom,msm_gsi: gsi_reset_channel:2916 bad state 0
<3>[  127.890256] gsi soc:qcom,msm_gsi: gsi_reset_channel:2921 MSM8998 workaround for gsi firmware: avoiding kernel panic for already not allocated CH.
<3>[  127.890369] gsi soc:qcom,msm_gsi: gsi_dealloc_channel:3011 bad state 0
<3>[  127.890460] gsi soc:qcom,msm_gsi: gsi_dealloc_channel:3016 MSM8998 workaround for gsi firmware: avoiding kernel panic for already not allocated CH.
<6>[  127.890592] ipa-wan ipa3_wwan_remove:2673 rmnet_ipa unregister_netdev
<3>[  127.890645] ipa-wan __ipa_wwan_close:1057 [rmnet_ipa0]: ipa3_deregister_intf failed -22
<6>[  127.892557] rmnet_perf_config_notify_cb(): rmnet_perf netdevice unregister, name = rmnet_ipa0
<6>[  127.892562] rmnet_perf_config_notify_cb(): rmnet_perf netdevice unregister
<6>[  127.894082] rmnet_perf_config_notify_cb(): rmnet_perf netdevice unregister, name = rmnet_data0
<6>[  127.895822] rmnet_perf_config_notify_cb(): rmnet_perf netdevice unregister, name = rmnet_data1
<6>[  127.897721] rmnet_perf_config_notify_cb(): rmnet_perf netdevice unregister, name = rmnet_data2
<6>[  127.900001] rmnet_perf_config_notify_cb(): rmnet_perf netdevice unregister, name = rmnet_data3
<6>[  127.901965] rmnet_perf_config_notify_cb(): rmnet_perf netdevice unregister, name = rmnet_data4
<6>[  127.903752] rmnet_perf_config_notify_cb(): rmnet_perf netdevice unregister, name = rmnet_data5
<6>[  127.905990] rmnet_perf_config_notify_cb(): rmnet_perf netdevice unregister, name = rmnet_data6
<6>[  127.908136] rmnet_shs deinit rmnet_data7 going down 
<6>[  127.908227] rmnet_perf_config_notify_cb(): rmnet_perf netdevice unregister, name = rmnet_data7
<3>[  127.924441] ipa-wan ipa3_del_dflt_wan_rt_tables:445 Deleting Route hdl:(0x7) with ip type: 0
<3>[  127.924912] ipa-wan ipa3_del_dflt_wan_rt_tables:454 Deleting Route hdl:(0x9) with ip type: 1
<6>[  127.926698] ipa-wan ipa3_wwan_remove:2697 rmnet_ipa completed deinitialization
<6>[  127.926834] ipa-wan ipa3_ssr_notifier_cb:2877 IPA BEFORE_SHUTDOWN handling is complete
<6>[  127.926844] icnss: PD service down, pd_state: 2, state: 0x324d8f: cause: Root PD crashed
<6>[  127.926905] subsys-restart: subsystem_shutdown(): [kworker/u17:1:494]: Shutting down modem
<4>[  128.026966] pil-q6v5-mss 4080000.qcom,mss: Port 0000000000000000 halt timeout
<4>[  128.127119] pil-q6v5-mss 4080000.qcom,mss: Port 0000000000000000 halt timeout
<6>[  128.128523] gcc_mss_mnoc_bimc_axi_clk
<6>[  128.128526]                 CBCR: 0x00000000
<6>[  128.128528] gcc_mss_mnoc_bimc_axi_clk: status stuck at 'on'
<4>[  128.128562] ------------[ cut here ]------------
<4>[  128.128660] WARNING: CPU: 3 PID: 494 at /home/stefan/android/source/kernel/sony/msm-4.14/kernel/drivers/clk/qcom/clk-branch.c:116 clk_branch_toggle+0x12c/0x158
[...]
<6>[  128.128963] Call trace:
<6>[  128.128967]  clk_branch_toggle+0x12c/0x158
<6>[  128.128972]  clk_branch2_disable+0x18/0x20
<6>[  128.128977]  clk_core_disable+0xf4/0x1e0
<6>[  128.128979]  clk_disable+0x94/0x114
<6>[  128.128984]  pil_mss_disable_clks+0x1c/0x9c
<6>[  128.128986]  pil_mss_shutdown+0x26c/0x2b4
<6>[  128.128989]  pil_shutdown+0x3c/0x178
<6>[  128.128991]  modem_shutdown+0x88/0x9c
<6>[  128.128994]  subsystem_restart_wq_func+0x26c/0x6e0
<6>[  128.128999]  process_one_work+0x1fc/0x460
<6>[  128.129002]  worker_thread+0x2e8/0x44c
<6>[  128.129006]  kthread+0x118/0x130
<6>[  128.129009]  ret_from_fork+0x10/0x18
<4>[  128.129011] ---[ end trace d43303481cd86698 ]---
<6>[  128.146386] icnss: WLAN FW Service Disconnected: 0x324f8b
<6>[  128.146392] icnss_qmi: QMI Service Disconnected: 0x324f8b
<6>[  128.147051] service-notifier: service_notifier_del_server: Connection lost between QMI handle and 180 service
<6>[  128.147075] sysmon-qmi: ssctl_del_server: Connection lost between QMI handle and modem's SSCTL service
<6>[  128.158624] icnss: Collecting msa0 segment dump
<3>[  128.158631] Ramdump(ramdump_wcss_msa0): No consumers. Aborting..
<6>[  128.162140] ipa-wan ipa3_ssr_notifier_cb:2880 IPA Received MPSS AFTER_SHUTDOWN
<6>[  128.162667] ipa-wan ipa3_ssr_notifier_cb:2888 IPA AFTER_SHUTDOWN handling is complete
<4>[  128.162729] ------------[ cut here ]------------
<4>[  128.162793] WARNING: CPU: 3 PID: 494 at /home/stefan/android/source/kernel/sony/msm-4.14/kernel/drivers/soc/qcom/smem.c:652 qcom_smem_get+0x150/0x168
[...]
<6>[  128.163129] Call trace:
<6>[  128.163133]  qcom_smem_get+0x150/0x168
<6>[  128.163136]  microdump_modem_notifier_nb+0x64/0xec
<6>[  128.163141]  srcu_notifier_call_chain+0x5c/0xa0
<6>[  128.163145]  subsys_notif_queue_notification+0x28/0x30
<6>[  128.163147]  notify_each_subsys_device+0x25c/0x38c
<6>[  128.163150]  subsystem_restart_wq_func+0x2f8/0x6e0
<6>[  128.163154]  process_one_work+0x1fc/0x460
<6>[  128.163157]  worker_thread+0x2e8/0x44c
<6>[  128.163160]  kthread+0x118/0x130
<6>[  128.163163]  ret_from_fork+0x10/0x18
<4>[  128.163165] ---[ end trace d43303481cd86699 ]---
<6>[  128.163235] microdump_modem_notifier_nb: smem 611 not available
<6>[  128.164557] ipa-wan ipa3_ssr_notifier_cb:2891 IPA received MPSS BEFORE_POWERUP
<6>[  128.164969] ipa-wan ipa3_ssr_notifier_cb:2898 IPA BEFORE_POWERUP handling is complete
<6>[  128.164974] subsys-restart: subsystem_powerup(): [kworker/u17:1:494]: Powering up modem
<14>[  128.168679] ueventd: firmware: loading 'modem.mdt' for '/devices/platform/soc/4080000.qcom,mss/firmware/modem.mdt'
<6>[  128.169208] pil-q6v5-mss 4080000.qcom,mss: modem: loading from 0x000000008cc00000 to 0x0000000093c00000
<14>[  128.169569] ueventd: loading /devices/platform/soc/4080000.qcom,mss/firmware/modem.mdt took 1ms
<6>[  128.268612] pil-q6v5-mss 4080000.qcom,mss: MBA boot done
<14>[  128.507256] ueventd: firmware: loading 'modem.b02' for '/devices/platform/soc/4080000.qcom,mss/firmware/modem.b02'
<14>[  128.508534] ueventd: loading /devices/platform/soc/4080000.qcom,mss/firmware/modem.b02 took 1ms
[...]
<14>[  128.748824] ueventd: firmware: loading 'modem.b28' for '/devices/platform/soc/4080000.qcom,mss/firmware/modem.b28'
<14>[  128.752629] ueventd: loading /devices/platform/soc/4080000.qcom,mss/firmware/modem.b28 took 3ms
[...]
<6>[  129.781549] Sending QMI_IPA_INIT_MODEM_DRIVER_REQ_V01
<3>[  129.866163] ipa-wan ipa3_handle_indication_req:117 not send indication
<3>[  129.917052] Fatal error on the modem.
<3>[  129.917148] modem subsystem failure reason: ExIPC: Exception recieved tid=144 inst=0 badva=fef34fb0.
<6>[  129.917269] subsys-restart: subsystem_restart_dev(): Restart sequence requested for modem, restart_level = RELATED.
<6>[  129.921196] ipa-wan ipa3_ssr_notifier_cb:2860 IPA received MPSS BEFORE_SHUTDOWN
<6>[  129.924062] ipa-wan ipa3_wwan_remove:2658 rmnet_ipa started deinitialization
<4>[  129.931092] ------------[ cut here ]------------
<4>[  129.931205] WARNING: CPU: 7 PID: 494 at /home/stefan/android/source/kernel/sony/msm-4.14/kernel/drivers/iommu/dma-mapping-fast.c:656 fast_smmu_free+0x104/0x10c
[...]
<6>[  129.931417] Call trace:
<6>[  129.931419]  fast_smmu_free+0x104/0x10c
<6>[  129.931423]  ipa3_teardown_sys_pipe+0x93c/0x10cc
<6>[  129.931426]  ipa3_wwan_remove+0x98/0x540
<6>[  129.931430]  platform_drv_remove+0x24/0x4c
<6>[  129.931432]  device_release_driver_internal+0x130/0x1d4
<6>[  129.931434]  driver_detach+0x68/0xa4
<6>[  129.931435]  bus_remove_driver+0x74/0xa4
<6>[  129.931437]  driver_unregister+0x2c/0x4c
<6>[  129.931439]  platform_driver_unregister+0x10/0x18
<6>[  129.931440]  ipa3_ssr_notifier_cb+0x114/0x50c
<6>[  129.931444]  srcu_notifier_call_chain+0x5c/0xa0
<6>[  129.931446]  subsys_notif_queue_notification+0x28/0x30
<6>[  129.931447]  notify_each_subsys_device+0x25c/0x38c
<6>[  129.931449]  subsystem_restart_wq_func+0x208/0x6e0
<6>[  129.931452]  process_one_work+0x1fc/0x460
<6>[  129.931454]  worker_thread+0x2e8/0x44c
<6>[  129.931455]  kthread+0x118/0x130
<6>[  129.931458]  ret_from_fork+0x10/0x18
<4>[  129.931459] ---[ end trace d43303481cd8669a ]---
<1>[  129.931586] Unable to handle kernel paging request at virtual address ffffff8014bf2000
<1>[  129.931674] Mem abort info:
<1>[  129.931721]   Exception class = DABT (current EL), IL = 32 bits
<1>[  129.931809]   SET = 0, FnV = 0
<1>[  129.931856]   EA = 0, S1PTW = 0
<1>[  129.931902]   FSC = 7
<1>[  129.931949] Data abort info:
<1>[  129.932035]   ISV = 0, ISS = 0x00000047
<1>[  129.932082]   CM = 0, WnR = 1
<1>[  129.932130] swapper pgtable: 4k pages, 39-bit VAs, pgd = ffffff800a4e2000
<1>[  129.932217] [ffffff8014bf2000] *pgd=000000017e4be003, *pud=000000017e4be003, *pmd=0000000161688003, *pte=0000000000000000
<0>[  129.932308] Internal error: Oops: 96000047 [#1] PREEMPT SMP
<6>[  129.932356] Modules linked in:
<0>[  129.932444] Process kworker/u17:1 (pid: 494, stack limit = 0xffffffc0ef7b8000)
<6>[  129.932492] CPU: 7 PID: 494 Comm: kworker/u17:1 Tainted: G S      W       4.14.189-24379-g205a380bd16a-dirty #1
[...]
<6>[  129.935229] Call trace:
<6>[  129.935277]  __memset+0x1ac/0x1c8
<6>[  129.935326]  ipa3_teardown_sys_pipe+0xa2c/0x10cc
<6>[  129.935414]  ipa3_wwan_remove+0x98/0x540
<6>[  129.935462]  platform_drv_remove+0x24/0x4c
<6>[  129.935551]  device_release_driver_internal+0x130/0x1d4
<6>[  129.935600]  driver_detach+0x68/0xa4
<6>[  129.935648]  bus_remove_driver+0x74/0xa4
<6>[  129.935736]  driver_unregister+0x2c/0x4c
<6>[  129.935785]  platform_driver_unregister+0x10/0x18
<6>[  129.935832]  ipa3_ssr_notifier_cb+0x114/0x50c
<6>[  129.935921]  srcu_notifier_call_chain+0x5c/0xa0
<6>[  129.935970]  subsys_notif_queue_notification+0x28/0x30
<6>[  129.936065]  notify_each_subsys_device+0x25c/0x38c
<6>[  129.936114]  subsystem_restart_wq_func+0x208/0x6e0
<6>[  129.936221]  process_one_work+0x1fc/0x460
<6>[  129.936269]  worker_thread+0x2e8/0x44c
<6>[  129.936316]  kthread+0x118/0x130
<6>[  129.936364]  ret_from_fork+0x10/0x18
<0>[  129.936453] Code: 91010108 54ffff4a 8b040108 cb050042 (d50b7428) 
<4>[  129.936500] ---[ end trace d43303481cd8669b ]---

I can't say too much, interesting to me is that it starts loading modem.b02 but not modem.b00 and modem.b01, but maybe that is just expected?

The logfiles: console-ramoops-0.log dmesg-ramoops-0.log dmesg-ramoops-1.log pmsg-ramoops-0.log


¹) The kernel commit can't be found on github:

²) SELinux: enforcing

kholk commented 4 years ago

That's definitely better than before: now it's going way further and crashing in the teardown path of the main IPA pipe. Let's see what can I do...

stefanhh0 commented 4 years ago

Cool, happy to see that we are getting closer :-) Nice iterative approach that is :+1:

stefanhh0 commented 4 years ago

An update I have tried to reproduce the issue using kernel 4.14.190 and LA.UM.7.1.r1-16600-sm8150.0. Result is that I could still reproduce the issue after several reboots. This time I have a full kernel.log from before the failure + the pstore contents:

kernel log from bootup until the crash occurred: kernel.log

pstore: console-ramoops-0.log dmesg-ramoops-0.log dmesg-ramoops-1.log pmsg-ramoops-0.log

To me it looks pretty similar than what I have seen in the previous crash that I have posted in the comment: https://github.com/sonyxperiadev/bug_tracker/issues/580#issuecomment-665092135.

kholk commented 4 years ago

I agree, that's the same 1:1 crash...

kholk commented 4 years ago

https://github.com/sonyxperiadev/kernel/pull/2331 is solving kernel panics due to modem crashes. Sensors are also getting restarted when this happens, so this entire stack of issues should be fully resolved.

Awaiting confirmation (let's say, 1 week should be enough!).

stefanhh0 commented 4 years ago

Since the fix is already merged back and since we have been able to confirm that the situation has been improved a lot it is okay to close the issue.

The outcome is that even though the modem crashes shortly after the phone boots, the phone is able to recover from the crash, which wasn't the case before. So the root cause for the crashing modem in the first place remains open but the solution is good enough and the problem does anyway only occur once in a while.