sonyxperiadev / bug_tracker

Empty repository that is used as a bugtracker for Open Devices project
52 stars 13 forks source link

[4.19][AOSP11][pdx203/Edo]: cnss: Failed to set MHI state: RESUME(7) ret: -110 blocks suspend #735

Closed Thaodan closed 1 year ago

Thaodan commented 2 years ago

Platform: Edo Device: Xperia 1 II/PDX203 Kernel version: 4.19 Android version: 11.0.0_r46 Software binaries version: v8a

Previously working on No idea

Description When the device tries to enter sleep mhi_pm_fast_resume fails blocking suspend:

[60754.527935] PM: suspend entry (deep)
[60754.527941] PM: Syncing filesystems ... done.
[60754.532152] Freezing user space processes ... 
[60754.542064] Freezing of tasks aborted after 0.009 seconds
[60754.542131] OOM killer enabled.
[60754.542132] Restarting tasks ... done.
[60754.545644] thermal thermal_zone91: failed to read out thermal zone (-19)
[60754.546106] PM: suspend exit
[60754.546109] PM: suspend entry (deep)
[60754.546110] PM: Syncing filesystems ... done.
[60754.597761] Freezing user space processes ... (elapsed 0.010 seconds) done.
[60754.608401] OOM killer disabled.
[60754.608403] Freezing remaining freezable tasks ... (elapsed 0.002 seconds) done.
[60754.611223] Suspending console(s) (use no_console_suspend to debug)
[60754.720359] PM: Some devices failed to suspend, or early wake event detected
[60754.722088] [E][mhi_pm_fast_resume] Unable to access EP Config space
[60754.722094] cnss: Failed to set MHI state: RESUME(7) ret: -110

Symptoms

How to reproduce Unplug the device from power or trigger suspend by writing freeze into /sys/power/state.

Additional context Full dmesg: dmesg.cnss.mhi_suspend.log

Kernel commits tried

Thaodan commented 2 years ago

Disabling the wlan module removes assert error, however thermal thermal_zone91: failed to read out thermal zone (-19 and so on are still there. dmesg.edo.wlan.module.disabled.log

Thaodan commented 2 years ago

Xiamo has this hack on their Stock kernel: https://github.com/Official-Ayrton990/android_kernel_xiaomi_sm8250/commit/d851606ed5f779ee06a2d71b0aa9d0e0b6f52e58#diff-5ba1b9f5eb0722357f3d1023dbbec082b722e29405be3b91082c455b9e69dc26R310

I'm not exactly sure but it could help if SDX55 is the culprit of the issue again.

Thaodan commented 2 years ago

From what I read device pci device 0002:01:00.0 is failing, I see in logcat that there are tries to setup a device in /dev/ with a name including that pci device address Same for errors in kmsg like this: " [E][mhi_device_get_sync_atomic] Did not enter M0 state, cur_state:M3_FAST pm_state:M3". The pci data (17cb:0306) also shows that the device on that bus should be SDX55.

Thaodan commented 2 years ago

I triggered the watchdog while switching mobile data on and off. Mobile data doesn't work but there's 4G connection. console-0-ramopps.18.04.2022.log dmesg-ramopps.18.04.2022.log

Thaodan commented 2 years ago

Not any data connection works. Calls work fine thou (didn't test VoLTE).

Thaodan commented 2 years ago

After using the stock roms cnss driver I get this messages now:

[   21.935539] sysmon-qmi: ssctl_new_server: Connection established between QMI handle and adsp's SSCTL service
[   22.043286] sysmon-qmi: ssctl_new_server: Connection established between QMI handle and slpi's SSCTL service
[   24.694341] sysmon-qmi: ssctl_new_server: Connection established between QMI handle and esoc0's SSCTL service

EDIT: I mistook something, they were there before.

Thaodan commented 2 years ago

Reverting https://github.com/sonyxperiadev/kernel/pull/2433 which created issues in #731 didn't help.

Thaodan commented 2 years ago

Reflashed stock and then reflashed my OS. Mobile data works now but it still doesn't suspend. EDIT: Mobile Data only works when booting up with power connected.

746 seems to be a duplicate of this.

Thaodan commented 2 years ago

@Haxk20 Did you test if your Xperia 1 II suspends after https://github.com/sonyxperiadev/kernel/pull/2438?

Haxk20 commented 2 years ago

Pdx213 is my daily. It works fine.

On Wed, Jun 29, 2022, 4:21 PM Bjรถrn Bidar @.***> wrote:

@Haxk20 https://github.com/Haxk20 Did you test if your Xperia 1 II suspends after sonyxperiadev/kernel#2438 https://github.com/sonyxperiadev/kernel/pull/2438?

โ€” Reply to this email directly, view it on GitHub https://github.com/sonyxperiadev/bug_tracker/issues/735#issuecomment-1170046541, or unsubscribe https://github.com/notifications/unsubscribe-auth/AD7WP7ROC7I3UTIMJUZX2ADVRRLXJANCNFSM5KDJ6DTA . You are receiving this because you were mentioned.Message ID: @.***>

Thaodan commented 2 years ago

Pdx213 is my daily. It works fine.

@Haxk20: PDX213 is Lena/Xperia 10 III.

Haxk20 commented 2 years ago

Pdx213 is my daily. It works fine.

@Haxk20: PDX213 is Lena/Xperia 10 III.

My bad. Xperia 1II is my daily. (Bit too hot today to think clearly)

Thaodan commented 2 years ago

Am Mittwoch, 29. Juni 2022, 17:50:37 EEST schrieb Martin Botka:

Pdx213 is my daily. It works fine.

@Haxk20: PDX213 is Lena/Xperia 10 III.

My bad. Xperia 1II is my daily. (Bit too hot today to think clearly)

Can you check if it suspends?

Or if it doesn't because of:

[ 614.424638] cnss: Failed to set MHI state: SUSPEND(6) ret: -22 [ 614.440879] pci_pm_suspend(): cnss_pci_suspend+0x0/0x124 returns -11 [ 614.441016] dpm_run_callback(): pci_pm_suspend+0x0/0x21c returns -11 [ 614.441087] PM: Device 0000:01:00.0 failed to suspend: error -11 [ 614.441204] PM: Some devices failed to suspend, or early wake event detected [ 614.444647] sec_ts 5-0048: [sec_input] Before: ??? [ 614.444765] sec_ts 5-0048: [sec_input] After: ??? [ 614.448934] OOM killer enabled. [ 614.448938] Restarting tasks ... done. [ 614.455577] thermal thermal_zone91: failed to read out thermal zone (-19) [ 614.456134] PM: suspend exit [ 614.456148] PM: suspend entry (deep) [ 614.456150] PM: Syncing filesystems ... done.

Haxk20 commented 2 years ago

Am Mittwoch, 29. Juni 2022, 17:50:37 EEST schrieb Martin Botka:

Pdx213 is my daily. It works fine. > > @Haxk20: PDX213 is Lena/Xperia 10 III. My bad. Xperia 1II is my daily. (Bit too hot today to think clearly) Can you check if it suspends? Or if it doesn't because of: [ 614.424638] cnss: Failed to set MHI state: SUSPEND(6) ret: -22 [ 614.440879] pci_pm_suspend(): cnss_pci_suspend+0x0/0x124 returns -11 [ 614.441016] dpm_run_callback(): pci_pm_suspend+0x0/0x21c returns -11 [ 614.441087] PM: Device 0000:01:00.0 failed to suspend: error -11 [ 614.441204] PM: Some devices failed to suspend, or early wake event detected [ 614.444647] sec_ts 5-0048: [sec_input] Before: ??? [ 614.444765] sec_ts 5-0048: [sec_input] After: ??? [ 614.448934] OOM killer enabled. [ 614.448938] Restarting tasks ... done. [ 614.455577] thermal thermal_zone91: failed to read out thermal zone (-19) [ 614.456134] PM: suspend exit [ 614.456148] PM: suspend entry (deep) [ 614.456150] PM: Syncing filesystems ... done.

0seconds of deep sleep over 100 hours of uptime. So that is no it doesnt suspend.

Thaodan commented 2 years ago

Am Mittwoch, 29. Juni 2022, 17:50:37 EEST schrieb Martin Botka:

Pdx213 is my daily. It works fine. > > @Haxk20: PDX213 is Lena/Xperia 10 III. My bad. Xperia 1II is my daily. (Bit too hot today to think clearly) Can you check if it suspends? Or if it doesn't because of: [ 614.424638] cnss: Failed to set MHI state: SUSPEND(6) ret: -22 [ 614.440879] pci_pm_suspend(): cnss_pci_suspend+0x0/0x124 returns -11 [ 614.441016] dpm_run_callback(): pci_pm_suspend+0x0/0x21c returns -11 [ 614.441087] PM: Device 0000:01:00.0 failed to suspend: error -11 [ 614.441204] PM: Some devices failed to suspend, or early wake event detected [ 614.444647] sec_ts 5-0048: [sec_input] Before: ??? [ 614.444765] sec_ts 5-0048: [sec_input] After: ??? [ 614.448934] OOM killer enabled. [ 614.448938] Restarting tasks ... done. [ 614.455577] thermal thermal_zone91: failed to read out thermal zone (-19) [ 614.456134] PM: suspend exit [ 614.456148] PM: suspend entry (deep) [ 614.456150] PM: Syncing filesystems ... done.

0seconds of deep sleep over 100 hours of uptime. So that is no it doesnt suspend.

Can you also check if you get the same spam in kmsg?

Haxk20 commented 2 years ago

Am Mittwoch, 29. Juni 2022, 17:50:37 EEST schrieb Martin Botka:

Pdx213 is my daily. It works fine. > > @Haxk20: PDX213 is Lena/Xperia 10 III. My bad. Xperia 1II is my daily. (Bit too hot today to think clearly) Can you check if it suspends? Or if it doesn't because of: [ 614.424638] cnss: Failed to set MHI state: SUSPEND(6) ret: -22 [ 614.440879] pci_pm_suspend(): cnss_pci_suspend+0x0/0x124 returns -11 [ 614.441016] dpm_run_callback(): pci_pm_suspend+0x0/0x21c returns -11 [ 614.441087] PM: Device 0000:01:00.0 failed to suspend: error -11 [ 614.441204] PM: Some devices failed to suspend, or early wake event detected [ 614.444647] sec_ts 5-0048: [sec_input] Before: ??? [ 614.444765] sec_ts 5-0048: [sec_input] After: ??? [ 614.448934] OOM killer enabled. [ 614.448938] Restarting tasks ... done. [ 614.455577] thermal thermal_zone91: failed to read out thermal zone (-19) [ 614.456134] PM: suspend exit [ 614.456148] PM: suspend entry (deep) [ 614.456150] PM: Syncing filesystems ... done.

0seconds of deep sleep over 100 hours of uptime. So that is no it doesnt suspend.

Can you also check if you get the same spam in kmsg?

Could you give me when it should happen ? Android is a spamming x ๐Ÿ˜‚ and it will overflow the buffer

Thaodan commented 2 years ago

Can you also check if you get the same spam in kmsg?

Could you give me when it should happen ? Android is a spamming x ๐Ÿ˜‚ and it will overflow the buffer

Oh yeah Android doing that very much..

To replicate do this:

  1. Boot
  2. Unlock Sim
  3. Suspend
  4. Wait 1-2s
  5. Get logs

If you do it via ssh or adb tcp it should be easier to get the logs.

Haxk20 commented 2 years ago

Can you also check if you get the same spam in kmsg?

Could you give me when it should happen ? Android is a spamming x ๐Ÿ˜‚ and it will overflow the buffer

Oh yeah Android doing that very much..

To replicate do this:

  1. Boot
  2. Unlock Sim
  3. Suspend
  4. Wait 1-2s
  5. Get logs

If you do it via ssh or adb tcp it should be easier to get the logs.

Will do later tomorrow or later.

Thaodan commented 2 years ago

Can you also check if you get the same spam in kmsg?

Could you give me when it should happen ? Android is a spamming x joy and it will overflow the buffer

Oh yeah Android doing that very much.. To replicate do this:

  1. Boot
  2. Unlock Sim
  3. Suspend
  4. Wait 1-2s
  5. Get logs

If you do it via ssh or adb tcp it should be easier to get the logs.

Will do later tomorrow or later.

@Haxk20 Please also check if mobile data works when booting without a charger/power connected.

Haxk20 commented 2 years ago

Can you also check if you get the same spam in kmsg?

Could you give me when it should happen ? Android is a spamming x joy and it will overflow the buffer

Oh yeah Android doing that very much.. To replicate do this:

  1. Boot
  2. Unlock Sim
  3. Suspend
  4. Wait 1-2s
  5. Get logs

If you do it via ssh or adb tcp it should be easier to get the logs.

Will do later tomorrow or later.

@Haxk20 Please also check if mobile data works when booting without a charger/power connected.

It does.

Have to still check the logs.

Thaodan commented 2 years ago

Am Samstag, 2. Juli 2022, 04:59:16 EEST schrieb Martin Botka:

Can you also check if you get the same spam in kmsg?

Could you give me when it should happen ? Android is a spamming x joy and it will overflow the buffer

Oh yeah Android doing that very much.. To replicate do this:

  1. Boot
  2. Unlock Sim
  3. Suspend
  4. Wait 1-2s
  5. Get logs

If you do it via ssh or adb tcp it should be easier to get the logs.

Will do later tomorrow or later.

@Haxk20 Please also check if mobile data works when booting without a charger/power connected. It does.

So you can bootup fresh and connect to mobile data on battery? Which stock rom did you had installed before flashing AOSP?

For me on SailfishOS mobile data doesn't work if I don't connect to the charger, VoLTE should be similar.

I suspect the modem not booting up properly

Thaodan commented 2 years ago

Linux localhost 4.19.122-ge9de8f60ecb1-dirty #1 SMP PREEMPT Sat Aug 20 11:31:42 EEST 2022 aarch64

Seems better or different, I don't know still something blocks the suspend

pdx203:/ $ dmesg|grep -i mhi                                                                                                                                                                                                                        
[    0.791764] cnss: Assign MSI to user: MHI, num_vectors: 3, user_base_data: 800, base_vector: 0
[    0.791765] cnss: Number of assigned MSI for MHI is 3, base vector is 0
[    0.791770] cnss: Unable to create CNSS MHI IPC log context
[    0.791773] cnss: Unable to create CNSS MHICNTRL IPC log context
[   14.617252] cnss: Setting MHI state: INIT(0)
[   14.617317] cnss: Setting MHI state: POWER_ON(2)
[   14.847439] cnss: MHI status cb is called with reason MISSION_MODE(6)
[   14.847460] cnss: Notify MHI to use already allocated images
[   15.239318] platform soc:qcom,msm-audio-apr:qcom,voice-mhi-audio: assigned reserved memory node mailbox_region
[   15.758436] mhi 0002:01:00.0: Linked as a consumer to 15000000.apps-smmu
[   15.768012] register_client_adhoc:Client handle 15 mhi0
[   15.769473] mhi 0002:01:00.0: BAR 0: assigned [mem 0x64300000-0x64300fff 64bit]
[   15.769517] mhi 0002:01:00.0: enabling device (0000 -> 0002)
[   15.770748] mhi 0002:01:00.0: Falling back to syfs fallback for: sdx55m/sbl1.mbn
[   17.884212] mhi 0002:01:00.0: enabling device (0000 -> 0002)
[   17.915170] netif_napi_add() called with weight 128 on device rmnet_mhi%d
[   17.915513] rmnet_perf_config_notify_cb(): rmnet_perf netdevice register, name = rmnet_mhi0
[   19.852797] type=1400 audit(1645532484.300:331): avc: denied { open } for comm="netmgrd" path="/sys/devices/platform/soc/1c10000.qcom,pcie/pci0002:00/0002:00:00.0/0002:01:00.0/0306_02.01.00_IP_HW0/net/rmnet_mhi0/queues/rx-0/rps_cpus" dev="sysfs" ino=90126 scontext=u:r:netmgrd:s0 tcontext=u:object_r:sysfs:s0 tclass=file permissive=1
[   19.852860] type=1400 audit(1645532484.300:332): avc: denied { getattr } for comm="netmgrd" path="/sys/devices/platform/soc/1c10000.qcom,pcie/pci0002:00/0002:00:00.0/0002:01:00.0/0306_02.01.00_IP_HW0/net/rmnet_mhi0/queues/rx-0/rps_cpus" dev="sysfs" ino=90126 scontext=u:r:netmgrd:s0 tcontext=u:object_r:sysfs:s0 tclass=file permissive=1
[   19.963140] cnss: Setting MHI state: POWER_OFF(4)
[   19.987810] cnss: Setting MHI state: DEINIT(1)
[   24.998231] type=1400 audit(1645532489.446:347): avc: denied { read } for comm="qti" name="mhi_uci" dev="sysfs" ino=89832 scontext=u:r:qti:s0 tcontext=u:object_r:sysfs:s0 tclass=dir permissive=1
[   24.998344] type=1400 audit(1645532489.446:348): avc: denied { open } for comm="qti" path="/sys/devices/platform/soc/1c10000.qcom,pcie/pci0002:00/0002:00:00.0/0002:01:00.0/0306_02.01.00_QMI1/mhi_uci" dev="sysfs" ino=89832 scontext=u:r:qti:s0 tcontext=u:object_r:sysfs:s0 tclass=dir permissive=1
[   24.998545] type=1400 audit(1645532489.446:349): avc: denied { read write } for comm="qti" name="mhi_0306_02.01.00_pipe_16" dev="tmpfs" ino=48872 scontext=u:r:qti:s0 tcontext=u:object_r:mhi_device:s0 tclass=chr_file permissive=1
[   24.998708] type=1400 audit(1645532489.446:350): avc: denied { open } for comm="qti" path="/dev/mhi_0306_02.01.00_pipe_16" dev="tmpfs" ino=48872 scontext=u:r:qti:s0 tcontext=u:object_r:mhi_device:s0 tclass=chr_file permissive=1
Thaodan commented 2 years ago

logcat.edo.4.19.122.log dmesg.edo.4.19.122.log

Thaodan commented 2 years ago

I went back to Linux localhost 4.19.113-g0a6664ccb816-dirty, the issue is the same here.

pdx203:/ $ dmesg|grep -i mhi -A1 -B1                                                                                                                                                                                                                
[    0.792846] cnss: BAR starts at 0x0000000060300000, len-100000
[    0.792848] cnss: Assign MSI to user: MHI, num_vectors: 3, user_base_data: 800, base_vector: 0
[    0.792849] cnss: Number of assigned MSI for MHI is 3, base vector is 0
[    0.792849] cnss: Get IRQ number 430 for vector index 0
--
[    0.792850] cnss: Get IRQ number 432 for vector index 2
[    0.792851] cnss: Unable to create CNSS MHI IPC log context
[    0.792855] cnss: Unable to create CNSS MHICNTRL IPC log context
[    0.792952] iommu: Adding device ab00000.qcom,cvp:cvp_secure_nonpixel_cb to group 18
--
[    5.201235] cnss: QMI timeout is 10000 ms
[    5.201238] cnss: Setting MHI state: INIT(0)
[    5.201308] cnss: Setting MHI state: POWER_ON(2)
[    5.201576] cnss_pci 0000:01:00.0: Falling back to syfs fallback for: qca6390/amss20.bin
--
[    5.371980] init: Control message: Could not find 'android.hardware.audio@6.0::IDevicesFactory/default' for ctl.interface_start from pid: 617 (/system/bin/hwservicemanager)
[    5.373042] cnss: MHI status cb is called with reason MISSION_MODE(6)
[    5.373063] cnss: Notify MHI to use already allocated images
[    5.373071] cnss: PM relax, state: 0x8, count: 0
--
[    6.142429] ueventd: loading /devices/platform/soc/1c00000.qcom,pcie/pci0000:00/0000:00:00.0/0000:01:00.0/firmware/qca6390!m3.bin took 1ms
[    6.143459] platform soc:qcom,msm-audio-apr:qcom,voice-mhi-audio: assigned reserved memory node mailbox_region
[    6.143883] cnss: Sending M3 information message, state: 0xb
--
[    6.301479] iommu: Adding device 0002:00:00.0 to group 54
[    6.302281] mhi 0002:01:00.0: Linked as a consumer to 15000000.apps-smmu
[    6.307296] CAM_INFO: CAM-SENSOR: cam_sensor_driver_cmd: 799 Probe success,slot:2,slave_addr:0x20,sensor_id:0x900
--
[    6.316000] register_client_adhoc:find path.src 108 dest 512
[    6.316288] register_client_adhoc:Client handle 15 mhi0
[    6.317733] mhi 0002:01:00.0: BAR 0: assigned [mem 0x64300000-0x64300fff 64bit]
[    6.317768] pci-msm-rc 0002:00:00.0: enabling device (0000 -> 0003)
[    6.317779] mhi 0002:01:00.0: enabling device (0000 -> 0002)
[    6.319190] mhi 0002:01:00.0: Falling back to syfs fallback for: sdx55m/sbl1.mbn
[    6.326904] ueventd: firmware: loading 'sdx55m/sbl1.mbn' for '/devices/platform/soc/1c10000.qcom,pcie/pci0002:00/0002:00:00.0/0002:01:00.0/firmware/sdx55m!sbl1.mbn'
--
[    8.405191] healthd: battery l=100 v=4323 t=29.5 h=2 st=2 c=-716308 fc=3955000 cc=0 chg=u
[    8.406222] mhi 0002:01:00.0: enabling device (0000 -> 0002)
[    8.428870] QDSS diag bridge: Open work failed with err:0
[    8.443927] netif_napi_add() called with weight 128 on device rmnet_mhi%d
[    8.444308] rmnet_perf_config_notify_cb(): rmnet_perf netdevice register, name = rmnet_mhi0
[    8.455146] qdss_bridge: probe of 0306_02.01.00_IP_HW_QDSS failed with error -22
--
[   10.369658] init: Service 'vendor.ims_rtp_daemon' (pid 923) received signal 9
[   10.378577] type=1400 audit(1645532484.851:440): avc: denied { open } for comm="netmgrd" path="/sys/devices/platform/soc/1c10000.qcom,pcie/pci0002:00/0002:00:00.0/0002:01:00.0/0306_02.01.00_IP_HW0/net/rmnet_mhi0/queues/rx-0/rps_cpus" dev="sysfs" ino=89912 scontext=u:r:netmgrd:s0 tcontext=u:object_r:sysfs:s0 tclass=file permissive=1
[   10.378672] type=1400 audit(1645532484.851:441): avc: denied { getattr } for comm="netmgrd" path="/sys/devices/platform/soc/1c10000.qcom,pcie/pci0002:00/0002:00:00.0/0002:01:00.0/0306_02.01.00_IP_HW0/net/rmnet_mhi0/queues/rx-0/rps_cpus" dev="sysfs" ino=89912 scontext=u:r:netmgrd:s0 tcontext=u:object_r:sysfs:s0 tclass=file permissive=1
[   10.407683] init: Sending signal 9 to service 'vendor.imsrcsservice' (pid 930) process group...
--
[   10.664849] cnss: Write val 0x3 to register offset 0x1f8031c
[   10.666121] cnss: Setting MHI state: POWER_OFF(4)
[   10.681010] logd: logdr: UID=1000 GID=1000 PID=2235 n tail=50 logMask=8 pid=2226 start=0ns timeout=0ns
--
[   10.691166] msm_pcie_disable: PCIe: Assert the reset of endpoint of RC0.
[   10.691816] cnss: Setting MHI state: DEINIT(1)
[   10.691907] cnss: De-assert WLAN_EN GPIO successfully
--
[   15.715363] libprocessgroup: Successfully killed process cgroup uid 1000 pid 2753 in 0ms
[   15.725528] type=1400 audit(1645532490.213:478): avc: denied { read } for comm="qti" name="mhi_uci" dev="sysfs" ino=89604 scontext=u:r:qti:s0 tcontext=u:object_r:sysfs:s0 tclass=dir permissive=1
[   15.725653] type=1400 audit(1645532490.213:479): avc: denied { open } for comm="qti" path="/sys/devices/platform/soc/1c10000.qcom,pcie/pci0002:00/0002:00:00.0/0002:01:00.0/0306_02.01.00_QMI1/mhi_uci" dev="sysfs" ino=89604 scontext=u:r:qti:s0 tcontext=u:object_r:sysfs:s0 tclass=dir permissive=1
[   15.725805] type=1400 audit(1645532490.213:480): avc: denied { read write } for comm="qti" name="mhi_0306_02.01.00_pipe_16" dev="tmpfs" ino=45399 scontext=u:r:qti:s0 tcontext=u:object_r:mhi_device:s0 tclass=chr_file permissive=1
[   15.725902] type=1400 audit(1645532490.213:481): avc: denied { open } for comm="qti" path="/dev/mhi_0306_02.01.00_pipe_16" dev="tmpfs" ino=45399 scontext=u:r:qti:s0 tcontext=u:object_r:mhi_device:s0 tclass=chr_file permissive=1
[   16.272736] init: Control message: Could not find 'android.hardware.radio@1.5::IRadio/slot1' for ctl.interface_start from pid: 617 (/system/bin/hwservi
Thaodan commented 2 years ago

Stock (58.1.A.5.55B)

The thermal thermal_zone91: failed to read out thermal zone (-19) is normal, stock has those too:

[  544.462138] thermal thermal_zone99: failed to read out thermal zone (-19)
[  544.462148] thermal thermal_zone101: failed to read out thermal zone (-19)
[  544.462155] thermal thermal_zone102: failed to read out thermal zone (-19)
[  544.462162] thermal thermal_zone103: failed to read out thermal zone (-19)
[  544.462169] thermal thermal_zone104: failed to read out thermal zone (-19)
[  544.462176] thermal thermal_zone105: failed to read out thermal zone (-19)
[  544.462208] Resume caused by IRQ 454, dma-grant

Suspend on stock looks like this:

[  435.150456] Freezing user space processes ... 
[  435.153220] spcom:handle_poll: interrupted wait retval=0
[  435.155488] spcom:handle_poll: interrupted wait retval=0
[  435.176394] (elapsed 0.025 seconds) done.
[  435.176403] OOM killer disabled.
[  435.176405] Freezing remaining freezable tasks ... (elapsed 0.003 seconds) done.
[  435.179672] Suspending console(s) (use no_console_suspend to debug)
[  435.184867] [kworke][0x1fd2a9b96][13:48:00.795130] wlan: [268:I:HDD] hdd_suspend_wlan: 1243: WLAN being suspended by OS
[  435.184955] [kworke][0x1fd2aa257][13:48:00.795220] wlan: [268:I:PMO] pmo_core_is_wow_applicable: 319: lpass enabled, enabling wow
[  435.189206] [Binder][0x1fd2be102][13:48:00.799469] wlan: [10444:I:HDD] __wlan_hdd_bus_suspend: 1035: starting bus suspend
[  435.189259] [Binder][0x1fd2be527][13:48:00.799524] wlan: [10444:I:PMO] pmo_core_enable_wow_in_fw: 804: drv wow is enabled
[  435.189279] [Binder][0x1fd2be6a6][13:48:00.799544] wlan: [10444:I:WMI] suspend type: WOW_IFACE_PAUSE_ENABLED
[  435.220915] [Binder][0x1fd352a7c][13:48:00.831169] wlan: [10444:I:HIF] hif_try_complete_tasks: 773: waiting for tasklets 0 grp tasklets 0 work 2
[  435.236972] [Binder][0x1fd39de2d][13:48:00.847220] wlan: [10444:I:HDD] __wlan_hdd_bus_suspend: 1114: bus suspend succeeded
[  435.236987] cnss: Setting MHI state: SUSPEND(6)
[  435.237025] cnss: Suspending PCI link
[  435.237029] cnss: Use PCIe DRV suspend
[  435.262195] Disabling non-boot CPUs ...
[  435.264415] IRQ 11: no longer affine to CPU1
[  435.264420] IRQ 17: no longer affine to CPU1
[  435.264438] IRQ 94: no longer affine to CPU1
[  435.264445] IRQ 101: no longer affine to CPU1
[  435.264450] IRQ 106: no longer affine to CPU1
[  435.264488] IRQ 255: no longer affine to CPU1
[  435.264494] IRQ 263: no longer affine to CPU1
[  435.264500] IRQ 271: no longer affine to CPU1
[  435.264505] IRQ 275: no longer affine to CPU1
[  435.264509] IRQ 279: no longer affine to CPU1
[  435.264750] CPU1: shutdown
[  435.264806] psci: CPU1 killed (polled 0 ms)
[  435.268728] CPU2: shutdown
[  435.268771] psci: CPU2 killed (polled 0 ms)
[  435.271685] CPU3: shutdown
[  435.271725] psci: CPU3 killed (polled 0 ms)
[  435.274407] CPU4: shutdown
[  435.275533] psci: CPU4 killed (polled 0 ms)
[  435.277805] CPU5: shutdown
[  435.278931] psci: CPU5 killed (polled 0 ms)
[  435.281424] CPU6: shutdown
[  435.282560] psci: CPU6 killed (polled 0 ms)
[  435.284892] CPU7: shutdown
[  435.284969] psci: CPU7 killed (polled 0 ms)
[  435.286349] suspend ns:     435286343827\x09suspend cycles:       8544343741
[  435.286343] resume cycles:       9788802299
[  435.286433] pm_system_irq_wakeup: 37 triggered pm8xxx_rtc_alarm
[  435.286698] Enabling non-boot CPUs ...
[  435.287321] Detected VIPT I-cache on CPU1
[  435.287390] arch_timer: CPU1: Trapping CNTVCT access
[  435.287403] CPU1: Booted secondary processor 0x0000000100 [0x51df805e]
[  435.288944] CPU1 is up
[  435.289495] Detected VIPT I-cache on CPU2
[  435.289529] arch_timer: CPU2: Trapping CNTVCT access
[  435.289537] CPU2: Booted secondary processor 0x0000000200 [0x51df805e]
[  435.290339] CPU2 is up
[  435.290748] Detected VIPT I-cache on CPU3
[  435.290780] arch_timer: CPU3: Trapping CNTVCT access
[  435.290787] CPU3: Booted secondary processor 0x0000000300 [0x51df805e]
[  435.291549] CPU3 is up
[  435.292558] Detected PIPT I-cache on CPU4
[  435.292598] arch_timer: CPU4: Trapping CNTVCT access
[  435.292606] CPU4: Booted secondary processor 0x0000000400 [0x411fd0d0]
[  435.293247] CPU4 is up
[  435.293782] Detected PIPT I-cache on CPU5
[  435.293821] arch_timer: CPU5: Trapping CNTVCT access
[  435.293829] CPU5: Booted secondary processor 0x0000000500 [0x411fd0d0]
[  435.294451] CPU5 is up
[  435.294958] Detected PIPT I-cache on CPU6
[  435.294999] arch_timer: CPU6: Trapping CNTVCT access
[  435.295007] CPU6: Booted secondary processor 0x0000000600 [0x411fd0d0]
[  435.295625] CPU6 is up
[  435.296142] Detected PIPT I-cache on CPU7
[  435.296181] arch_timer: CPU7: Trapping CNTVCT access
[  435.296188] CPU7: Booted secondary processor 0x0000000700 [0x411fd0d0]
[  435.296733] CPU7 is up
[  435.308801] pci-msm-rc 0000:00:00.0: Refused to change power state, currently in D3
[  435.308816] pci-msm-rc 0002:00:00.0: Refused to change power state, currently in D3
[  435.311921] cnss: Resuming PCI link
[  435.312992] cnss: Set PCI link status to: 2
[  435.313010] cnss: Setting MHI state: RESUME(7)
[  435.317204] [Binder][0x2477e4efa][13:49:05.743016] wlan: [10444:I:HDD] wlan_hdd_bus_resume: 1252: starting bus resume
[  435.317502] [Binder][0x2477e658b][13:49:05.743316] wlan: [10444:I:PMO] pmo_core_psoc_send_host_wakeup_ind_to_fw: 1201: Host wakeup indication sent to fw
[  435.331040] [soft_i][0x247825ccc][13:49:05.756853] wlan: [0:F:WMA] Non-WLAN triggered wakeup: UNSPECIFIED (-1)
[  435.331158] [Binder][0x2478265be][13:49:05.756972] wlan: [10444:I:HDD] wlan_hdd_bus_resume: 1317: bus resume succeeded
[  435.331246] [kworke][0x247826c5e][13:49:05.757061] wlan: [90:I:HDD] hdd_resume_wlan: 1301: WLAN being resumed by OS
[  435.331276] [kworke][0x247826eab][13:49:05.757091] wlan: [90:I:HDD] hdd_disable_host_offloads: 707: vdev is not connected
[  435.331313] [kworke][0x247827171][13:49:05.757128] wlan: [90:I:HDD] hdd_disable_host_offloads: 702: offload is not supported on this vdev opmode: 7
[  435.340193] OOM killer enabled.
[  435.340197] Restarting tasks ... 
[  435.354794] smcinvoke: process_accept_req: process_accept_req txn 1 either invalid or removed from Q
[  435.354810] smcinvoke: process_accept_req: accept thread returning with ret: -11
[  435.356210] done.
[  435.356401] thermal thermal_zone99: failed to read out thermal zone (-19)
[  435.356419] thermal thermal_zone101: failed to read out thermal zone (-19)
[  435.356427] thermal thermal_zone102: failed to read out thermal zone (-19)
[  435.356435] thermal thermal_zone103: failed to read out thermal zone (-19)
[  435.356441] thermal thermal_zone104: failed to read out thermal zone (-19)
[  435.356448] thermal thermal_zone105: failed to read out thermal zone (-19)
[  435.356487] Resume caused by IRQ 37, pm8xxx_rtc_alarm
[  435.356540] PM: suspend exit
[  435.397680] ## sony_ext_uim_ctrl_gpio_set_value: gpio=1147 value=1
[  435.592217] PM: suspend entry (deep)
[  435.592224] PM: Syncing filesystems ... done.
[  435.599267] Freezing user space processes ... (elapsed 0.009 seconds) done.
[  435.609116] OOM killer disabled.
[  435.609118] Freezing remaining freezable tasks ... (elapsed 0.003 seconds) done.
[  435.612264] Suspending console(s) (use no_console_suspend to debug)
[  435.617375] [kworke][0x247d63fef][13:49:06.043188] wlan: [10716:I:HDD] hdd_suspend_wlan: 1243: WLAN being suspended by OS
[  435.617449] [kworke][0x247d645b0][13:49:06.043265] wlan: [10716:I:PMO] pmo_core_is_wow_applicable: 319: lpass enabled, enabling wow
[  435.621570] [Binder][0x247d77a91][13:49:06.047383] wlan: [10444:I:HDD] __wlan_hdd_bus_suspend: 1035: starting bus suspend
[  435.621634] [Binder][0x247d77ebc][13:49:06.047439] wlan: [10444:I:PMO] pmo_core_enable_wow_in_fw: 804: drv wow is enabled
[  435.621654] [Binder][0x247d78102][13:49:06.047469] wlan: [10444:I:WMI] suspend type: WOW_IFACE_PAUSE_ENABLED
[  435.641167] PM: Last active Wakeup Source: 0306_02.01.00
[  435.652309] [Binder][0x247e07c0b][13:49:06.078123] wlan: [10444:I:HDD] __wlan_hdd_bus_suspend: 1114: bus suspend succeeded
[  435.652317] cnss: Setting MHI state: SUSPEND(6)
[  435.652347] cnss: Suspending PCI link
[  435.652352] cnss: Use PCIe DRV suspend
[  435.653363] PM: Some devices failed to suspend, or early wake event detected
[  435.653540] cnss: Resuming PCI link
[  435.654844] cnss: Set PCI link status to: 2
[  435.654859] cnss: Setting MHI state: RESUME(7)
[  435.655042] [Binder][0x247e148fe][13:49:06.080856] wlan: [10444:I:HDD] wlan_hdd_bus_resume: 1252: starting bus resume
[  435.655232] [Binder][0x247e1547e][13:49:06.081009] wlan: [10444:I:PMO] pmo_core_psoc_send_host_wakeup_ind_to_fw: 1201: Host wakeup indication sent to fw
[  435.668456] pci-msm-rc 0002:00:00.0: Refused to change power state, currently in D3
[  435.677411] [soft_i][0x247e7d587][13:49:06.103211] wlan: [0:F:WMA] Non-WLAN triggered wakeup: UNSPECIFIED (-1)
[  435.677778] [Binder][0x247e7f22d][13:49:06.103591] wlan: [10444:I:HDD] wlan_hdd_bus_resume: 1317: bus resume succeeded
[  435.677853] [kworke][0x247e7f7e4][13:49:06.103668] wlan: [270:I:HDD] hdd_resume_wlan: 1301: WLAN being resumed by OS
[  435.677888] [kworke][0x247e7fa96][13:49:06.103704] wlan: [270:I:HDD] hdd_disable_host_offloads: 707: vdev is not connected
[  435.677961] [kworke][0x247e80006][13:49:06.103776] wlan: [270:I:HDD] hdd_disable_host_offloads: 702: offload is not supported on this vdev opmode: 7
[  435.689683] OOM killer enabled.
[  435.689685] Restarting tasks ... done.
[  435.700061] thermal thermal_zone99: failed to read out thermal zone (-19)
[  435.700073] thermal thermal_zone101: failed to read out thermal zone (-19)
[  435.700079] thermal thermal_zone102: failed to read out thermal zone (-19)
[  435.700084] thermal thermal_zone103: failed to read out thermal zone (-19)
[  435.700090] thermal thermal_zone104: failed to read out thermal zone (-19)
[  435.700095] thermal thermal_zone105: failed to read out thermal zone (-19)
[  435.700118] Abort: Last active Wakeup Source: 0306_02.01.00
[  435.700208] PM: suspend exit
[  435.742504] ## sony_ext_uim_ctrl_gpio_set_value: gpio=1147 value=1
[  435.901069] PM: suspend entry (deep)
[  435.901076] PM: Syncing filesystems ... done.

Sometimes suspend fails:

``` [ 447.292281] Freezing user space processes ... (elapsed 0.019 seconds) done. [ 447.311657] OOM killer disabled. [ 447.311661] Freezing remaining freezable tasks ... (elapsed 0.005 seconds) done. [ 447.316749] Suspending console(s) (use no_console_suspend to debug) [ 447.324180] [kworke][0x2553bfa1b][13:49:17.749991] wlan: [270:I:HDD] hdd_suspend_wlan: 1243: WLAN being suspended by OS [ 447.324336] pci_pm_suspend(): mhi_system_suspend+0x0/0x2c0 returns -16 [ 447.324347] dpm_run_callback(): pci_pm_suspend+0x0/0x200 returns -16 [ 447.324350] [kworke][0x2553c06f6][13:49:17.750162] wlan: [270:I:PMO] pmo_core_is_wow_applicable: 319: lpass enabled, enabling wow [ 447.324360] PM: Device 0002:01:00.0 failed to suspend async: error -16 [ 447.324673] PM: Some devices failed to suspend, or early wake event detected [ 447.324770] [kworke][0x2553c2681][13:49:17.750583] wlan: [270:I:HDD] hdd_resume_wlan: 1301: WLAN being resumed by OS [ 447.324804] [kworke][0x2553c2921][13:49:17.750618] wlan: [270:I:HDD] hdd_disable_host_offloads: 707: vdev is not connected [ 447.324850] [kworke][0x2553c2ca0][13:49:17.750664] wlan: [270:I:HDD] hdd_disable_host_offloads: 702: offload is not supported on this vdev opmode: 7 [ 447.330586] OOM killer enabled. [ 447.330590] Restarting tasks ... done. [ 447.340975] thermal thermal_zone99: failed to read out thermal zone (-19) [ 447.340992] thermal thermal_zone101: failed to read out thermal zone (-19) [ 447.340999] thermal thermal_zone102: failed to read out thermal zone (-19) [ 447.341006] thermal thermal_zone103: failed to read out thermal zone (-19) [ 447.341013] thermal thermal_zone104: failed to read out thermal zone (-19) [ 447.341020] thermal thermal_zone105: failed to read out thermal zone (-19) [ 447.341050] Abort: Callback failed on 0002:01:00.0 in pci_pm_suspend+0x0/0x200 returned -16 [ 447.341171] PM: suspend exit [ 447.386110] ## sony_ext_uim_ctrl_gpio_set_value: gpio=1147 value=1 [ 450.277968] rpmh_rsc_send_data: 30 callbacks suppressed [ 450.277983] qcom_rpmh DRV:apps_rsc TCS Busy, retrying RPMH message send: addr=0x30000 [ 450.278009] qcom_rpmh DRV:apps_rsc TCS Busy, retrying RPMH message send: addr=0x30000 [ 450.278026] qcom_rpmh DRV:apps_rsc TCS Busy, retrying RPMH message send: addr=0x30000 [ 450.278042] qcom_rpmh DRV:apps_rsc TCS Busy, retrying RPMH message send: addr=0x30000 [ 450.278057] qcom_rpmh DRV:apps_rsc TCS Busy, retrying RPMH message send: addr=0x30000 [ 450.278073] qcom_rpmh DRV:apps_rsc TCS Busy, retrying RPMH message send: addr=0x30000 [ 450.278089] qcom_rpmh DRV:apps_rsc TCS Busy, retrying RPMH message send: addr=0x30000 [ 450.278104] qcom_rpmh DRV:apps_rsc TCS Busy, retrying RPMH message send: addr=0x30000 [ 450.278120] qcom_rpmh DRV:apps_rsc TCS Busy, retrying RPMH message send: addr=0x30000 [ 450.278136] qcom_rpmh DRV:apps_rsc TCS Busy, retrying RPMH message send: addr=0x30000 [ 453.742669] PM: suspend entry (deep) [ 453.742683] PM: Syncing filesystems ... done. [ 453.766559] Freezing user space processes ... (elapsed 0.016 seconds) done. [ 453.782987] OOM killer disabled. [ 453.782989] Freezing remaining freezable tasks ... (elapsed 0.003 seconds) done. [ 453.786578] Suspending console(s) (use no_console_suspend to debug) [ 453.792454] [kworke][0x25ca2f9ef][13:49:24.218255] wlan: [268:I:HDD] hdd_suspend_wlan: 1243: WLAN being suspended by OS [ 453.792556] [kworke][0x25ca30298][13:49:24.218370] wlan: [268:I:PMO] pmo_core_is_wow_applicable: 319: lpass enabled, enabling wow [ 453.797951] [Binder][0x25ca496fd][13:49:24.223762] wlan: [10444:I:HDD] __wlan_hdd_bus_suspend: 1035: starting bus suspend [ 453.798019] [Binder][0x25ca49c52][13:49:24.223833] wlan: [10444:I:PMO] pmo_core_enable_wow_in_fw: 804: drv wow is enabled [ 453.798060] [Binder][0x25ca49e3e][13:49:24.223859] wlan: [10444:I:WMI] suspend type: WOW_IFACE_PAUSE_ENABLED [ 453.833479] [Binder][0x25caeff2d][13:49:24.259285] wlan: [10444:I:HDD] __wlan_hdd_bus_suspend: 1114: bus suspend succeeded [ 453.833543] cnss: Setting MHI state: SUSPEND(6) [ 453.833576] cnss: Suspending PCI link [ 453.833580] cnss: Use PCIe DRV suspend [ 453.843556] dpm_run_callback(): platform_pm_suspend+0x0/0x58 returns -16 [ 453.843568] PM: Device alarmtimer failed to suspend: error -16 [ 453.843575] PM: Some devices failed to suspend, or early wake event detected [ 453.859947] pci-msm-rc 0002:00:00.0: Refused to change power state, currently in D3 [ 453.859951] pci-msm-rc 0000:00:00.0: Refused to change power state, currently in D3 [ 453.860189] cnss: Resuming PCI link [ 453.861441] cnss: Set PCI link status to: 2 [ 453.861462] cnss: Setting MHI state: RESUME(7) [ 453.865023] [Binder][0x25cb83d6d][13:49:24.290835] wlan: [10444:I:HDD] wlan_hdd_bus_resume: 1252: starting bus resume [ 453.865289] [Binder][0x25cb8517f][13:49:24.291102] wlan: [10444:I:PMO] pmo_core_psoc_send_host_wakeup_ind_to_fw: 1201: Host wakeup indication sent to fw [ 453.879294] [soft_i][0x25cbc663a][13:49:24.305034] wlan: [0:F:WMA] Non-WLAN triggered wakeup: UNSPECIFIED (-1) [ 453.879768] [Binder][0x25cbc8f65][13:49:24.305581] wlan: [10444:I:HDD] wlan_hdd_bus_resume: 1317: bus resume succeeded [ 453.880027] [kworke][0x25cbca2c1][13:49:24.305839] wlan: [383:I:HDD] hdd_resume_wlan: 1301: WLAN being resumed by OS [ 453.880072] [kworke][0x25cbca63f][13:49:24.305886] wlan: [383:I:HDD] hdd_disable_host_offloads: 707: vdev is not connected [ 453.880139] [kworke][0x25cbcaa38][13:49:24.305939] wlan: [383:I:HDD] hdd_disable_host_offloads: 702: offload is not supported on this vdev opmode: 7 [ 453.892295] OOM killer enabled. [ 453.892298] Restarting tasks ... done. ```

dmesg

dmesg.edo.stock.log

dmesg.stock.grep.mhi.log

Thaodan commented 2 years ago

4.19.134-g5e4ba15d432f-dirty / [5e4ba15d432f5e4a170ab126a05ef1b32a1cdffe] treewide: Linux 4.19.134 - BAD

Backtrace from suspend:

[  122.893432] cnss: Resuming PCI link
[  122.894501] cnss: Set PCI link status to: 2
[  122.894512] cnss: Setting MHI state: RESUME(7)
[  122.894519] cnss: PCIe link is suspended
[  122.894524] [E][mhi_pm_fast_resume] Unable to access EP Config space
[  122.894530] cnss: Failed to set MHI state: RESUME(7) ret: -110
[  122.917601] OOM killer enabled.
[  122.917605] Restarting tasks ... done.
[  122.940458] thermal thermal_zone91: failed to read out thermal zone (-19)
[  122.943225] PM: suspend exit
[  122.960017] type=1400 audit(1661086349.317:644): avc: denied { read } for comm="Binder:645_1" name="wakeup58" dev="sysfs" ino=74013 scontext=u:r:system_suspend:s0 tcontext=u:object_r:sysfs:s0 tclass=dir permissive=1
[  122.960111] type=1400 audit(1661086349.317:645): avc: denied { open } for comm="Binder:645_1" path="/sys/devices/virtual/misc/msm_wma/wakeup58" dev="sysfs" ino=74013 scontext=u:r:system_suspend:s0 tcontext=u:object_r:sysfs:s0 tclass=dir permissive=1
[  122.960167] type=1400 audit(1661086349.318:646): avc: denied { read } for comm="Binder:645_1" name="event_count" dev="sysfs" ino=74020 scontext=u:r:system_suspend:s0 tcontext=u:object_r:sysfs:s0 tclass=file permissive=1
[  122.960228] type=1400 audit(1661086349.318:647): avc: denied { open } for comm="Binder:645_1" path="/sys/devices/virtual/misc/msm_wma/wakeup58/event_count" dev="sysfs" ino=74020 scontext=u:r:system_suspend:s0 tcontext=u:object_r:sysfs:s0 tclass=file permissive=1
[  122.960285] type=1400 audit(1661086349.318:648): avc: denied { getattr } for comm="Binder:645_1" path="/sys/devices/virtual/misc/msm_wma/wakeup58/event_count" dev="sysfs" ino=74020 scontext=u:r:system_suspend:s0 tcontext=u:object_r:sysfs:s0 tclass=file permissive=1
[  122.966166] type=1400 audit(1661086349.324:649): avc: denied { read } for comm="Binder:645_1" name="wakeup36" dev="sysfs" ino=67012 scontext=u:r:system_suspend:s0 tcontext=u:object_r:sysfs_msm_subsys:s0 tclass=dir permissive=1
[  122.966209] audit: audit_lost=272 audit_rate_limit=5 audit_backlog_limit=64
[  122.966220] audit: rate limit exceeded
[  122.968494] type=1400 audit(1661086349.324:650): avc: denied { open } for comm="Binder:645_1" path="/sys/devices/platform/soc/a94000.i2c/i2c-5/5-0048/wakeup/wakeup36" dev="sysfs" ino=67012 scontext=u:r:system_suspend:s0 tcontext=u:object_r:sysfs_msm_subsys:s0 tclass=dir permissive=1
[  122.968560] type=1400 audit(1661086349.324:651): avc: denied { read } for comm="Binder:645_1" name="event_count" dev="sysfs" ino=67019 scontext=u:r:system_suspend:s0 tcontext=u:object_r:sysfs_msm_subsys:s0 tclass=file permissive=1
[  122.968617] type=1400 audit(1661086349.324:652): avc: denied { open } for comm="Binder:645_1" path="/sys/devices/platform/soc/a94000.i2c/i2c-5/5-0048/wakeup/wakeup36/event_count" dev="sysfs" ino=67019 scontext=u:r:system_suspend:s0 tcontext=u:object_r:sysfs_msm_subsys:s0 tclass=file permissive=1
[  123.025491] init: Control message: Could not find 'com.qualcomm.qti.uceservice@2.2::IUceService/com.qualcomm.qti.uceservice' for ctl.interface_start from pid: 612 (/system/bin/hwservicemanager)
[  123.043515] PM: suspend entry (deep)
[  123.043520] PM: Syncing filesystems ... done.
[  123.047091] Freezing user space processes ... (elapsed 0.005 seconds) done.
[  123.052914] OOM killer disabled.
[  123.052916] Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
[  123.054856] Suspending console(s) (use no_console_suspend to debug)
[  123.089158] cnss: Cannot set MHI state SUSPEND(6) in current MHI state (0x45)
[  123.089161] cnss: ASSERT at line 1054
[  123.089162] ------------[ cut here ]------------
[  123.089204] WARNING: CPU: 2 PID: 3438 at /home/bidar/dev/android/kernel/sony/msm-4.19/kernel/drivers/net/wireless/cnss2/pci.c:1054 cnss_pci_set_mhi_state+0x2d4/0x558
[  123.089206] Modules linked in:
[  123.089212] CPU: 2 PID: 3438 Comm: Binder:645_1 Tainted: G S      W         4.19.134-g5e4ba15d432f-dirty #5
[  123.089214] Hardware name: Sony Mobile Communications. PDX-203(KONA) (DT)
[  123.089218] pstate: 60400005 (nZCv daif +PAN -UAO)
[  123.089222] pc : cnss_pci_set_mhi_state+0x2d4/0x558
[  123.089225] lr : cnss_pci_set_mhi_state+0x2d4/0x558
[  123.089226] sp : ffffff801dceb7a0
[  123.089227] x29: ffffff801dceb7e0 x28: ffffffafb48de000 
[  123.089231] x27: ffffffafb4b54000 x26: ffffffafb4b54000 
[  123.089233] x25: 0000000000000006 x24: ffffffafb4aa3000 
[  123.089236] x23: 0000000000000002 x22: ffffffe7769a3a70 
[  123.089239] x21: ffffffafb39f07a0 x20: 0000000000000006 
[  123.089241] x19: ffffffe7769a3880 x18: 00000000fff6f2c8 
[  123.089244] x17: 000000000016f2c8 x16: 0000000000000044 
[  123.089246] x15: 000000000016f2c8 x14: 0000000000000000 
[  123.089249] x13: 0000000000000006 x12: 0000000000000000 
[  123.089251] x11: 0000000000000000 x10: ffffffffffffffff 
[  123.089254] x9 : 7f54b899522f6700 x8 : 7f54b899522f6700 
[  123.089256] x7 : 0000000000000000 x6 : ffffffe7797d0d6c 
[  123.089259] x5 : 0000000000000000 x4 : 000000000000000e 
[  123.089261] x3 : 0000000000002d2d x2 : 0000000000000001 
[  123.089263] x1 : 0000000000000000 x0 : 0000000000000024 
[  123.089269] Call trace:
[  123.089272] cnss_pci_set_mhi_state+0x2d4/0x558
[  123.089277] cnss_pci_suspend_bus+0x28/0x168
[  123.089282] cnss_pci_suspend+0xb4/0x148
[  123.089288] pci_pm_suspend+0x12c/0x1fc
[  123.089293] dpm_run_callback+0x130/0x250
[  123.089296] __device_suspend+0x3b0/0x740
[  123.089298] dpm_suspend+0x17c/0x560
[  123.089300] dpm_suspend_start+0x6c/0x78
[  123.089306] suspend_devices_and_enter+0xb4/0xa2c
[  123.089309] pm_suspend+0x584/0x714
[  123.089312] state_store+0x100/0x138
[  123.089318] kobj_attr_store+0x14/0x24
[  123.089323] sysfs_kf_write+0x38/0x48
[  123.089325] kernfs_fop_write+0x134/0x1d0
[  123.089331] __vfs_write+0x44/0x144
[  123.089333] vfs_write+0xe0/0x19c
[  123.089335] ksys_write+0x6c/0xcc
[  123.089337] __arm64_sys_write+0x18/0x20
[  123.089344] el0_svc_common+0x98/0x160
[  123.089346] el0_svc_handler+0x60/0x78
[  123.089351] el0_svc+0x8/0x14c
[  123.089353] ---[ end trace 30ffd85e1eb813b1 ]---
[  123.089357] cnss: Failed to set MHI state: SUSPEND(6) ret: -22
[  123.108841] pci_pm_suspend(): cnss_pci_suspend+0x0/0x148 returns -11
[  123.108849] dpm_run_callback(): pci_pm_suspend+0x0/0x1fc returns -11
[  123.108854] PM: Device 0000:01:00.0 failed to suspend: error -11
[  123.108860] PM: Some devices failed to suspend, or early wake event detected
[  123.119988] pci-msm-rc 0002:00:00.0: Refused to change power state, currently in D3
[  123.121306] qcom_rpmh DRV:apps_rsc TCS Busy, retrying RPMH message send: addr=0x30000
[  123.125116] OOM killer enabled.
[  123.125118] Restarting tasks ... done.
[  123.131211] thermal thermal_zone91: failed to read out thermal zone (-19)
[  123.131295] PM: suspend exit
[  123.332015] PM: suspend entry (deep)
[  123.332022] PM: Syncing filesystems ... done.
Thaodan commented 2 years ago

I seems that here the first suspend works and then the resume fails.

Thaodan commented 2 years ago

Commits missing in stock (58.1.A.5.55B):

Thaodan commented 2 years ago

After reverting https://github.com/sonyxperiadev/kernel/commit/16517fc0bdf1a2810865a9d86b2043212fd15223 suspend sometime fails in the same way as in stock as linked above:

[  132.839908] Freezing user space processes ... (elapsed 0.007 seconds) done.
[  132.847641] OOM killer disabled.
[  132.847644] Freezing remaining freezable tasks ... (elapsed 0.002 seconds) done.
[  132.850154] Suspending console(s) (use no_console_suspend to debug)
[  132.904347] dpm_run_callback(): platform_pm_suspend+0x0/0x6c returns -16
[  132.904353] PM: Device alarmtimer failed to suspend: error -16
[  132.904360] PM: Some devices failed to suspend, or early wake event detected
[  132.933296] OOM killer enabled.
[  132.933299] Restarting tasks ... done.
[  132.941997] thermal thermal_zone91: failed to read out thermal zone (-19)
[  132.942295] PM: suspend exit
[  134.953806] PM: suspend entry (deep)
[  134.953812] PM: Syncing filesystems ... done.
[  134.962530] Freezing user space processes ... (elapsed 0.009 seconds) done.
[  134.971896] OOM killer disabled.
[  134.971900] Freezing remaining freezable tasks ... (elapsed 0.002 seconds) done.
[  134.974638] Suspending console(s) (use no_console_suspend to debug)
[  135.051827] Disabling non-boot CPUs ...

Sometimes the error looks different:

[  386.159145] Freezing user space processes ... 
[  386.166303] Freezing of tasks aborted after 0.007 seconds
[  386.166468] OOM killer enabled.
[  386.166470] Restarting tasks ... done.
[  386.172534] thermal thermal_zone91: failed to read out thermal zone (-19)
[  386.172748] PM: suspend exit
[  386.574293] PM: suspend entry (deep)
[  386.574299] PM: Syncing filesystems ... done.
[  386.576164] Freezing user space processes ... (elapsed 0.007 seconds) done.
[  386.583699] OOM killer disabled.
[  386.583702] Freezing remaining freezable tasks ... (elapsed 0.002 seconds) done.
[  386.586312] Suspending console(s) (use no_console_suspend to debug)
[  386.656967] pci_pm_suspend_noirq(): cnss_pci_suspend_noirq+0x0/0x48 returns -11
[  386.656979] dpm_run_callback(): pci_pm_suspend_noirq+0x0/0x2e4 returns -11
[  386.657020] PM: Device 0000:01:00.0 failed to suspend noirq: error -11
[  386.669629] PM: noirq suspend of devices failed
[  386.714524] OOM killer enabled.
[  386.714527] Restarting tasks ... done.
[  386.723879] thermal thermal_zone91: failed to read out thermal zone (-19)
[  386.724299] PM: suspend exit
Thaodan commented 2 years ago

Reverting https://github.com/sonyxperiadev/kernel/commit/16517fc0bdf1a2810865a9d86b2043212fd15223 which broke Edo/SM8250 and SDX55M suspend and add upstream commits which are missing was the only solution I could find.

https://github.com/sonyxperiadev/kernel/commit/16517fc0bdf1a2810865a9d86b2043212fd15223 is not present in any other tree, LA.UM.9.14.r1 has the same driver without the commit and K.P.1.0 has a reworked mainline version of the driver.

https://github.com/sonyxperiadev/kernel/pull/2531

jerpelea commented 2 years ago

@Thaodan Thanks for the time spent on this issue.

Thaodan commented 2 years ago

Added some more changes that were tested during this bug: https://github.com/sonyxperiadev/kernel/pull/2532

Thaodan commented 1 year ago

Added some more changes that were tested during this bug: sonyxperiadev/kernel#2532

@jerpelea Should we still merge those? These commits should help to suspend the cnss module correctly.

jerpelea commented 1 year ago

cherry-picked @Thaodan thanks for the reminder