map220v / sm8150-mainline

WIP Mainline kernel for Xiaomi Pad 5

Does DRM suspend/wakeup work? #12

Open catmengi opened 3 months ago

catmengi commented 3 months ago

Does DRM/DSI suspend work? Can I suspend the screen and then wake it up without any crashes or UX-breaking bugs? One more important question: does hibernate to disk work, i.e. no crashes after resuming from the image?

map220v commented 3 months ago

Unfortunately it's broken on sm8150 with dual DSI since kernel 6.2: when DSI suspends or resumes, it gives errors like "clock stuck in on/off state". Kernel 6.1 doesn't have this issue; there are no UX, OpenGL or Vulkan context crashes or bugs when suspending or resuming.

If Linux hibernate doesn't require a UEFI varstore, then it should work the same way as on a regular Linux PC.

catmengi commented 3 months ago

Does that mean the kernel will panic on suspend/resume? I saw a commit from maverickjb with a DRM suspend/resume fix: https://github.com/maverickjb/linux-6.1.10/commit/7c923106caddb4e08e63696f29fee062ab99d107 The only difference is in gdsc.c; I updated it in my fork, but, same as you, I can't test it now :(

map220v commented 3 months ago

First DSI will give critical errors, then the device will reboot because of an unstable clock driver or DSI driver state. I think the gdsc.c issues were already fixed in older kernels; right now the DSI issue seems to be related to devlink and DSI cyclic dependencies.

catmengi commented 3 months ago

Well, so if the screen is suspended, it will crash on resume. Any ideas for fixing this? Do you have a kernel log of the crash? Which parts of the kernel/DTS are related to DSI clocks, devlink and DSI cyclic dependencies? How can I debug my tablet's kernel via USB, without disassembling it and without any external hardware?

Edit: https://github.com/torvalds/linux/commit/9187ebb954ab2afe0e79e0ff7771e94d3d1d9e1c https://github.com/torvalds/linux/commit/d09ec6f9877798a2a66c9d5de524b419e2c064bb https://github.com/torvalds/linux/commits/master/drivers/gpu/drm/msm/dsi Maybe these commits are related to our problem? Or should we just try downgrading the DSI drivers?

map220v commented 3 months ago

Well, so if the screen is suspended, it will crash on resume. Any ideas for fixing this? Do you have a kernel log of the crash? Which parts of the kernel/DTS are related to DSI clocks, devlink and DSI cyclic dependencies? How can I debug my tablet's kernel via USB, without disassembling it and without any external hardware?

I don't have kernel logs from mainline. You can check this code; it controls the DSI clocks. For devlink and its cyclic dependency handling, check the latest nabu-6.0-rc1 commits. I don't know whether Linux supports USB debugging; the dwc3 driver on Linux probably doesn't support that.

Edit: torvalds/linux@9187ebb torvalds/linux@d09ec6f https://github.com/torvalds/linux/commits/master/drivers/gpu/drm/msm/dsi Maybe these commits are related to our problem? Or should we just try downgrading the DSI drivers?

These commits don't seem to be related to this issue. Downgrading DSI most likely won't help, because the issue seems to be somewhere else: this "clock stuck in on/off state" error also happens for UFS on boot. I fixed it temporarily by adding sleep calls in clock enable/disable, but it seems the UFS driver still has a chance to crash at boot.

catmengi commented 3 months ago

Now only two annoying parts are left: waiting for the bootloader unlock, and debugging. I won't close this issue until one of us finds a solution. If you can give me more info about this, it would be very helpful.

catmengi commented 3 months ago

How do I get a kernel crash log for debugging if the screen is black (or isn't it)? Does the kernel save logs or dump files, and if so, where are they?

map220v commented 3 months ago

I use SSH to get logs from dmesg. There is also ramoops (pstore): it saves kernel logs to a DDR region at address 0xb0000000 that persists across reboots.

catmengi commented 3 months ago

But how do I get those logs from there?

map220v commented 3 months ago

Boot into Android recovery and check /sys/fs/pstore/console-ramoops*. Also, if pstore compression is not enabled, you can read 0xb0001000-0xb0021000 from /dev/mem; there should be a tool for that.
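As a rough illustration of the /dev/mem approach, here is a minimal Python sketch that reads a byte range from a device or file and strips the NUL padding. Only the address range comes from this thread; the function names are made up for the example, treating the end address as exclusive is an assumption, and whether /dev/mem is readable at all depends on the kernel configuration (for example, CONFIG_STRICT_DEVMEM may block it).

```python
RAMOOPS_START = 0xB0001000  # start of the range quoted in the thread
RAMOOPS_END = 0xB0021000    # treated as exclusive here: 0x20000 bytes (128 KiB)


def read_region(path, start, end):
    """Return the bytes in [start, end) of the file or device at `path`."""
    if end <= start:
        raise ValueError("end must be greater than start")
    with open(path, "rb") as f:
        f.seek(start)
        return f.read(end - start)


def printable_text(raw):
    """Drop NUL padding and decode, replacing bytes that are not valid UTF-8."""
    return raw.replace(b"\x00", b"").decode("utf-8", errors="replace")
```

On the device this would be called as root with something like printable_text(read_region("/dev/mem", RAMOOPS_START, RAMOOPS_END)); the same helper also works on a dump file copied off the tablet.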

catmengi commented 3 months ago

If I reboot from Linux to fastboot and then to OrangeFox recovery, won't it be erased? And if the kernel panic isn't permanent (doesn't happen every time), can I just view the logs from Linux?

map220v commented 3 months ago

It won't be erased, but some kernel crashes can corrupt the pstore header, and then pstore in recovery won't show console-ramoops, so the only way to read such a log would be /dev/mem. If you can still access SSH or any other console after the kernel panic, then use the dmesg command; it will print the kernel logs. Also, for more logs from DSI, add "drm.debug=0x1ff" and "log_buf_len=8M" to the kernel cmdline.
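To see what a drm.debug mask such as 0x1ff switches on, a small decoder can help. This is only a sketch: the bit-to-category table below is written from memory of include/drm/drm_print.h and should be treated as an assumption, since the categories can differ between kernel versions.

```python
# Assumed mapping of drm.debug bits to debug categories (check drm_print.h
# in the kernel tree you actually build; this table may not match it).
DRM_DEBUG_CATEGORIES = {
    0x001: "CORE",
    0x002: "DRIVER",
    0x004: "KMS",
    0x008: "PRIME",
    0x010: "ATOMIC",
    0x020: "VBL",
    0x040: "STATE",
    0x080: "LEASE",
    0x100: "DP",
    0x200: "DRMRES",
}


def decode_drm_debug(mask):
    """Return the names of the categories whose bit is set in `mask`."""
    return [name for bit, name in sorted(DRM_DEBUG_CATEGORIES.items()) if mask & bit]
```

Under that assumed table, decode_drm_debug(0x1ff) lists every category from CORE through DP, which is why the output is verbose enough to need the larger log_buf_len.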

catmengi commented 3 months ago

dsi_phy_7nm.c has big differences between versions 6.1 and 6.7; maybe some patch just broke it for nabu🤔

map220v commented 3 months ago

elish with sm8250 also has dual DSI, but DSI suspend and resume seem to work fine there, and it doesn't have clock issues with UFS, so it's probably something wrong with clocks or clock dependency handling on sm8150. Without this hack, UFS has around a 30% chance to start without a clock error. Maybe the same applies to DSI clocks, because suspend and resume worked for me 2 or 3 times on 6.6, or maybe it didn't fully suspend. Also, disabling any one of the nodes "dsi1", "dsi0" or "dispcc" fixes the UFS clock issue.

catmengi commented 3 months ago

Could this be because of the devlink "optimizing" patches? And what do dsi1 and dsi0 do (is that where the screen lives)? And which part of the kernel controls those "clocks"? The nabu Android kernel doesn't have this issue; can we take the code from there and rework it?

catmengi commented 3 months ago

Which part of the kernel controls these "clocks"?

map220v commented 3 months ago

Could this be because of the devlink "optimizing" patches? And what do dsi1 and dsi0 do (is that where the screen lives)? And which part of the kernel controls those "clocks"? The nabu Android kernel doesn't have this issue; can we take the code from there and rework it?

DSI is a protocol that usually uses four data lanes and one clock lane to transfer DSI commands and video data to the display controller (nt36523). We have a dual DSI configuration that uses two DSI controllers for one display; in this configuration DSI0 is usually used to draw the left half of the display and DSI1 the right half. It would be easier to use the 6.1 kernel code, because everything works fine there, but I already tested all changes in the drivers/clk/qcom and drivers/gpu/drm/msm/dsi/phy folders, and I'm pretty sure the clock driver and DSI driver are fine in 6.2-6.7. The devlink changes in 6.2 probably cause drivers to start in a different order, and that somehow affects the stability of sm8150 clocks.

Which part of the kernel controls these "clocks"?

Display clocks are specified in https://github.com/map220v/sm8150-mainline/blob/nabu-6.7/drivers/clk/qcom/dispcc-sm8250.c UFS and other core clocks are in https://github.com/map220v/sm8150-mainline/blob/nabu-6.7/drivers/clk/qcom/gcc-sm8150.c This is where "clock status stuck" happens: https://github.com/map220v/sm8150-mainline/blob/9894da172d3f6433475489d2d3332dd7b437c105/drivers/clk/qcom/clk-branch.c#L86 And the 7nm DSI PHY driver, which has the DSI PHY suspend (dsi_7nm_phy_disable) and resume (dsi_7nm_phy_enable) code, requests the clock driver to enable/disable clocks or change their frequency: https://github.com/map220v/sm8150-mainline/blob/nabu-6.7/drivers/gpu/drm/msm/dsi/phy/dsi_phy_7nm.c

catmengi commented 3 months ago

Could it just be that it doesn't have enough time? E.g. ~200 CPU cycles might not be enough? (clk-branch.c, around line 80)
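For context on the timing question, the "stuck" warning comes from a bounded polling loop: the status is re-read a fixed number of times with a short delay between reads, and the warning fires only if it never reaches the expected state. The Python sketch below models that pattern with a fake clock. The default of 200 retries is taken from the "~200" figure mentioned above and from my recollection of clk-branch.c, so treat it and all names here as assumptions rather than the driver's actual code.

```python
def wait_for_status(read_status, want, retries=200, delay=lambda: None):
    """Poll read_status() up to `retries` times; True if it ever returns `want`."""
    for _ in range(retries):
        if read_status() == want:
            return True
        delay()  # stands in for a short per-iteration delay such as udelay(1)
    return False  # this is the point where the kernel would warn "status stuck"


def make_fake_clock(ready_after):
    """Hypothetical clock whose status turns 'on' only after `ready_after` reads."""
    reads = 0

    def read_status():
        nonlocal reads
        reads += 1
        return "on" if reads > ready_after else "off"

    return read_status
```

With these definitions, a fake clock that needs 150 reads passes within the budget, while one that needs 250 reads is reported as stuck. That is the behaviour a longer delay per iteration would paper over, without explaining why the hardware is slow to settle.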

catmengi commented 3 months ago

OK, the first thing I'll check is reverting the devlink changes, because they changed too much and they might be a problem for sm8150.

catmengi commented 3 months ago

If the DSI and UFS clocks are fine in 6.1 and previous kernels, and the DSI and clk drivers in newer kernels are fine too, does that mean there is a high chance that the fw_devlink "optimization" broke something on this platform?

catmengi commented 3 months ago

Does nabu-6.0-rc1 already have broken DSI suspend? Or did it break in another branch?

map220v commented 3 months ago

DSI and UFS clocks became unstable in 6.2. Kernels 5.19-6.1 have DSI suspend/resume working, but with some patches/hacks (nabu-5.19 and nabu-6.0-rc1 already have the needed patches).

map220v commented 3 months ago

OK, I can mark 6.4 as the start of unstable DSI; I will try to read the commits related to fw_devlink. Can you tell me more about the patches and hacks for nabu-5.19-6.0?

In 5.19 I used this hack to make dsi0, dsi1 and dispcc probe with new devlink. In 6.0-rc1 I used patches made by Saravana Kannan to fix devlink issues.

catmengi commented 3 months ago

Any ideas which drivers and commits broke the clocks for DSI and UFS? Why don't these patches work in 6.2+?

map220v commented 3 months ago

I didn't check the fw_devlink commits, but I know that the changes in the dsi and clk folders are fine, and reverting them doesn't fix the clock issue.

These patches won't apply to new kernels because of the devlink changes, and some of them might already be applied to new kernels or have been replaced by other patches.

catmengi commented 3 months ago

Might reverting all devlink changes after 6.1, while keeping Saravana Kannan's patches and yours, be a way forward?

catmengi commented 3 months ago

Do you mean that this commit and the following ones related to fw_devlink fix the DSI clock issue on 6.0-rc1 through 6.1?

map220v commented 3 months ago

I think these commits were used to fix the DSI probing failure, which was caused by devlink not being able to properly resolve the cyclic dependency between dispcc, dsi0 and dsi1.

In new kernels (6.2 and higher) the probing issue related to devlink was fixed, but some devlink improvements somehow broke clocks on sm8150 with the dual DSI configuration. On other sm8150 devices without dual DSI, clocks work fine, and it seems that on sm8250 with dual DSI, the UFS and DSI clocks also get enabled/disabled without getting stuck.
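To make "cyclic dependency" concrete, here is a small Python sketch that finds a cycle in a consumer-to-supplier graph with a depth-first search. The graph used in the usage note is a deliberately simplified assumption (the DSI nodes consume clocks from dispcc, and dispcc in turn depends on the DSI nodes); it is not read from the real device tree, and this is not how fw_devlink itself is implemented.

```python
def find_cycle(deps):
    """Return one dependency cycle as a list of node names, or None if acyclic.

    `deps` maps each consumer to the list of suppliers it depends on.
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    color = {node: WHITE for node in deps}
    stack = []

    def visit(node):
        color[node] = GREY
        stack.append(node)
        for supplier in deps.get(node, ()):
            state = color.get(supplier, WHITE)
            if state == GREY:  # back edge: supplier is already on the path
                return stack[stack.index(supplier):] + [supplier]
            if state == WHITE:
                found = visit(supplier)
                if found:
                    return found
        stack.pop()
        color[node] = BLACK
        return None

    for node in list(deps):
        if color[node] == WHITE:
            cycle = visit(node)
            if cycle:
                return cycle
    return None
```

For the simplified graph {"dsi0": ["dispcc"], "dsi1": ["dispcc"], "dispcc": ["dsi0", "dsi1"]} this reports the loop dsi0, dispcc, dsi0. A probe-ordering scheme that waits for all suppliers before probing a consumer cannot make progress on such a loop unless one of the edges is relaxed.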

catmengi commented 3 months ago

I think I will go the simpler way: I will roll back the fw_devlink-related parts of drivers/base/ to 6.1 and then apply Saravana Kannan's patches.

catmengi commented 3 months ago

I think it would be easier to backport changes from your kernel to the maverickjb kernel than to try to find which change in fw_devlink is causing the error. Or is it possible to port only the fw_devlink-related parts from 6.1 to 6.7+? Edit: how can I contact Greg Kroah-Hartman? He is the maintainer of the driver base subsystem, so he could possibly fix this upstream. Or will he just ignore me?

catmengi commented 3 months ago

Well, I emailed Greg; now waiting for his reply.

catmengi commented 3 months ago

Which part of the kernel is responsible for "dispcc: clock-controller@af00000"?

catmengi commented 3 months ago

Can you check these patches: https://lore.kernel.org/lkml/20231118123944.2202630-1-quic_skakitap@quicinc.com/T/#m069a280a10e0dd7338ae6dd31604d8fabf7fcb20

https://patchwork.kernel.org/project/linux-arm-msm/patch/20240313-videocc-sm8150-dt-node-v1-1-ae8ec3c822c2@quicinc.com/ They may be related to our problem. @LesGaR on XDA said that with these patches suspend/wakeup should work; he tried it (second-to-last post in the XDA "Ubuntu on nabu" thread).

map220v commented 3 months ago

This patch adds missing Venus resets and PM support for the Venus clock driver. Venus is not related to DSI, and I don't think I have the videocc node enabled, so I don't see how these patches can fix DSI suspend/resume.

Also, some DEs sometimes don't do a full suspend; instead they just decrease brightness to 0 and paint the display black. That could explain how LesGaR has suspend and resume working.

catmengi commented 3 months ago

Well, how about reverting fw_devlink to its 6.0-rc1 state? Or reverting all of devlink to its 6.0-rc1 state?

catmengi commented 3 months ago

Maybe they won't fix DSI directly, but they may fix the clocks, and fix DSI that way. In 6 days I can test these commits. MAYBE they work, but the chance is low :(

map220v commented 3 months ago

Venus is a separate block from DSI; Venus clocks don't depend on DSI clocks, and DSI clocks don't depend on Venus clocks. Also, I just checked, and we don't have a videocc node, so applying these videocc patches won't fix anything.

DSI already works fine in 6.7 until suspend/resume (for our dual DSI C-PHY configuration). The issue we have is that clock enabling/disabling breaks in some cases; for example, when booting, UFS clocks have a chance to fail to be disabled or enabled. Also, I just found logs from another sm8150 device that has the same issue: https://pastebin.com/sMDDPAuk

catmengi commented 3 months ago

Well, now I need help with getting the code of a function from a kernel panic stack trace. How do I get the C code from there? I know that I need to use gdb for the target platform and vmlinuz, but how do I feed a line like yet_another_func+0x01/0x02 to gdb?

map220v commented 3 months ago

In gdb with the kernel image loaded (the uncompressed vmlinux with debug info, rather than the compressed vmlinuz), run "list *(yet_another_func+0x01)", or use decode_stacktrace.sh.
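Turning stack-trace entries into gdb commands by hand gets tedious, so here is a minimal Python sketch that parses the func+offset/size form and prints the matching "list" command. The regular expression and function names are illustrative assumptions; real traces may carry extra decoration (module names, "?" markers) that this does not handle specially.

```python
import re

# Matches entries such as "yet_another_func+0x01/0x02" (symbol+offset/size).
FRAME_RE = re.compile(
    r"(?P<func>[A-Za-z_.][\w.]*)\+(?P<off>0x[0-9a-fA-F]+)/(?P<size>0x[0-9a-fA-F]+)"
)


def parse_frame(text):
    """Return (function, offset, size) for the first frame in `text`, or None."""
    m = FRAME_RE.search(text)
    if m is None:
        return None
    return m.group("func"), int(m.group("off"), 16), int(m.group("size"), 16)


def gdb_list_command(text):
    """Build the gdb 'list' command for a stack-trace entry."""
    frame = parse_frame(text)
    if frame is None:
        raise ValueError("no symbol+offset/size entry found")
    func, offset, _size = frame
    return f"list *({func}+{offset:#x})"
```

For the example from the question, gdb_list_command("yet_another_func+0x01/0x02") produces list *(yet_another_func+0x1), which can be pasted into a gdb session that has the unstripped kernel image loaded.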

catmengi commented 3 months ago

OK, now I know how to list the code of a function, but that's not so helpful if I don't know the values of the local variables. How can I get them if I have a core dump? And how do I tell the kernel to automatically make core dumps on kernel panic?

map220v commented 3 months ago

I think the only way to generate a core dump after a kernel panic is using kdump, but I don't know if it works on arm64. If kdump doesn't work, then use printk in the code that crashes. To read the core dump, run "gdb " and "info locals" to get local variables.

catmengi commented 3 months ago

According to the RHEL documentation about kdump, it should work on arm64. Can I send you the logs and the kdump file when I'm able to do it, in about 5 days?

map220v commented 3 months ago

Send them, but I don't think it will help, because the clock state information that the Linux clock framework has doesn't always match the state the clocks actually have, so this issue can be hard to debug without knowing the real state of the DSI clocks.

catmengi commented 3 months ago

Found something related to the UFS PHY clock issue on sm8150: https://lkml.indiana.edu/hypermail/linux/kernel/2312.2/01458.html This is the commit in the Linux GitHub repo: https://github.com/torvalds/linux/commit/eff7496b72810ca54da8c9c4542bf2aca479dd44

catmengi commented 3 months ago

Will this commit fix the UFS boot crash without udelay(1)?

map220v commented 3 months ago

Looks promising; it could fix the issue with UFS clocks.

catmengi commented 3 months ago

Well, this is from the upstream kernel 😃

catmengi commented 3 months ago

Can you build the kernel with this patch and with the udelay in clk removed, and post it on XDA for someone to test?

P.S. I can't build it on my Samsung A50, so I'm asking you to do it.

map220v commented 3 months ago

sm8150-test.zip

catmengi commented 3 months ago

Did you remove udelay?