JeffyCN / mirrors

Mirrors of Rockchip BSP repositories, only contains yocto related ones to keep it thin since the yocto would try to clone the whole repository.
Other
13 stars 5 forks source link

DSI2 driver looping with -EPROBE_DEFER in kernel 6.1 #25

Closed ginkage closed 3 months ago

ginkage commented 4 months ago

This issue is not 100% reproducible (depends on which device you're using, what dts you have, which driver is used for your DSI panel, maybe even which DSI port, etc.), but when it happens, you end up with a non-working screen and sometimes unbootable device, which keeps logging these:

dw-mipi-dsi2 fde30000.dsi: [drm:dw_mipi_dsi2_bind] *ERROR* Failed to find panel or bridge: -517

This happens because the Rockchip DSI driver registers itself as MIPI host during probe, then proceeds to register all sub-components, then realize the panel isn't available yet (unless, for some reason, it managed to get probed sooner), and defer. This doesn't work with 6.1 though, and results in an endless loop. The kernel docs also advice against this:

-EPROBE_DEFER must not be returned if probe() has already created child devices, even if those child devices are removed again in a cleanup path. If -EPROBE_DEFER is returned after a child device has been registered, it may result in an infinite loop of .probe() calls to the same driver.

The correct way is to let the DSI driver register as host, then wait for the panel driver to be registered and probed, and then let the panel driver trigger attach, that will in turn register the DSI driver's sub-components.

Please refer to this commit: https://github.com/armbian/linux-rockchip/pull/168 that seems to have fixed the issue for those who had it. It would be nice to have this fixed in the Rockchip's own repository as well.

JeffyCN commented 4 months ago

thanx for fixing it, will forward this to the vop team.

huangzb1221 commented 4 months ago

Can you tell me the specific scene? The probe process of dsi is usually executed before the probe process of panel, but if the panel has no configuration problems (the panel can complete the probe process normally), dsi will basically succeed in the second probe process. Only when there is a problem with the dts configuration of the panel and the panel probe fails to execute, will the dsi continue to defer the probe.

JeffyCN commented 4 months ago

the current dsi drivers would try to check it's panel in the xxx_bind() and might return defer error.

the patch is trying to delay xxx_bind() after panel probed to avoid that.

ginkage commented 4 months ago

Only when there is a problem with the dts configuration of the panel and the panel probe fails to execute, will the dsi continue to defer the probe.

The problem is, just like the kernel doc says, it gets stuck in defer loop, and never gets to probe the panel driver in the first place. It works as you described on 5.10, but not on 6.1.

ginkage commented 4 months ago

Please also take a look at another driver that made a similar change for similar reasons: https://lore.kernel.org/lkml/20220131085520.287105-1-angelogioacchino.delregno@collabora.com/T/

huangzb1221 commented 4 months ago

It is still different from what is described in the above patch link. I read the code before the patch. After failing to find the panel or bridge, mipi_dsi_host_unregister will be called, causing the panel to be unable to execute the probe. However, in our code, mipi_dsi_host_register can still be executed normally, and if find panel fails, mipi_dsi_host_unregister will not be executed, so it will not affect the normal probe of the panel. Therefore, if the panel probe is normal, dsi will not defer the probe indefinitely.

JeffyCN commented 4 months ago

I did some tests on my RK3588 EVB, and it would not cause an infinite loop indeed.

That is because in my current setup, the component master(rockchip-drm) is probed later than DSI, so the DSI's component_add() does not attempt to perform binding, and just returns 0.

To reproduce this issue, i manually delayed the probe of DSI and panel after rockchip-drm. This still did not cause an infinite loop("thanks" to the missing cleanup of mipi_dsi_host_register() and unfreed devres), but there were multiple DSI hosts registered with warnings about duplicated sysfs/device/kobject/component. The kernel also got stuck when rebooting(due to multiple registered DSI i guess).

Anyway, this patch seems promising, as described in "DOC: special care dsi" in drm_bridge.c.

JeffyCN commented 3 months ago

This has been fixed in the internal repo.