thesofproject / sof

Sound Open Firmware
Other
536 stars 308 forks source link

logger output delayed by 1hr 11mins after boot on BYT/CHT #683

Closed lgirdwood closed 5 years ago

lgirdwood commented 5 years ago

Logger output is delayed by over an hour (~1hr 11mins) on BYT/CHT platforms. It looks like either first timer work is delayed or notifications are blocked for > 1hr

[  +0.000019] sof-audio sof-audio: ipc tx: 0x30200000: GLB_TPLG_MSG: BUFFER_NEW
[  +0.000237] sof-audio sof-audio: ipc tx succeeded: 0x30200000: GLB_TPLG_MSG: BUFFER_NEW
[  +0.000024] sof-audio sof-audio: tplg: ready widget id 21 pipe 3 type 31 name : BUF3.2 stream none
[  +0.000004] sof-audio sof-audio: buffer BUF3.2: size 4608 caps 0x41
[  +0.000019] sof-audio sof-audio: ipc tx: 0x30200000: GLB_TPLG_MSG: BUFFER_NEW
[  +0.000130] sof-audio sof-audio: error: ipc error for 0x30200000 size 0x14
[  +0.000024] sof-audio sof-audio: buffer BUF3.2 load failed
[  +0.000008] sof-audio sof-audio: error: DSP failed to add widget id 0 type 31 name : BUF3.2 stream none reply -12
[  +0.000013] sof-audio sof-audio: ASoC: failed to load widget BUF3.2
[  +0.000008] sof-audio sof-audio: error: tplg component load failed -12
[  +0.000025] sof-audio sof-audio: error: failed to load DSP topology -22
[  +0.000008] sof-audio sof-audio: ASoC: failed to probe component -22
[  +0.000028] sof-nocodec sof-nocodec: ASoC: failed to instantiate card -22
[  +0.000071] sof-nocodec: probe of sof-nocodec failed with error -22
[  +0.000014] sof-audio sof-audio: created machine sof-nocodec
[  +0.004810] SSE version of gcm_enc/dec engaged.
[  +0.128899] iTCO_vendor_support: vendor-support=0
[  +0.020015] iTCO_wdt: Intel TCO WatchDog Timer Driver v1.11
[  +0.000201] iTCO_wdt: Found a Braswell SoC TCO device (Version=3, TCOBASE=0x0460)
[  +0.000330] iTCO_wdt: initialized. heartbeat=30 sec (nowayout=0)
[  +0.192247] intel_rapl: Found RAPL domain package
[  +0.000005] intel_rapl: Found RAPL domain core
[  +0.153358] Adding 1478652k swap on /dev/mmcblk0p3.  Priority:-2 extents:1 across:1478652k SSFS
[  +0.370291] audit: type=1400 audit(1544112357.823:2): apparmor="STATUS" operation="profile_load" profile="unconfined" name="content-hub-clipboard" pid=501 c
omm="apparmor_parser"
[  +0.001189] audit: type=1400 audit(1544112357.823:3): apparmor="STATUS" operation="profile_load" profile="unconfined" name="content-hub-peer-picker" pid=502
 comm="apparmor_parser"
[  +0.011706] audit: type=1400 audit(1544112357.835:4): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/sbin/dhclient" pid=504 comm="ap
parmor_parser"
[  +0.000010] audit: type=1400 audit(1544112357.835:5): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/lib/NetworkManager/nm-dhcp-
client.action" pid=504 comm="apparmor_parser"
[  +0.000006] audit: type=1400 audit(1544112357.835:6): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/lib/NetworkManager/nm-dhcp-helper" pid=504 comm="apparmor_parser"
[  +0.000004] audit: type=1400 audit(1544112357.835:7): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/lib/connman/scripts/dhclient-script" pid=504 comm="apparmor_parser"
[  +0.007604] audit: type=1400 audit(1544112357.843:8): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/lib/lightdm/lightdm-guest-session" pid=503 comm="apparmor_parser"
[  +0.000011] audit: type=1400 audit(1544112357.843:9): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/lib/lightdm/lightdm-guest-session//chromium" pid=503 comm="apparmor_parser"
[  +0.005494] audit: type=1400 audit(1544112357.847:10): apparmor="STATUS" operation="profile_load" profile="unconfined" name="url-dispatcher-bad-url-helper" pid=505 comm="apparmor_parser"
[  +0.003787] audit: type=1400 audit(1544112357.851:11): apparmor="STATUS" operation="profile_load" profile="unconfined" name="ubuntu-printing-app" pid=509 comm="apparmor_parser"
[  +1.392475] IPv6: ADDRCONF(NETDEV_UP): enp1s0: link is not ready
[  +0.003585] Generic PHY r8169-100:00: attached PHY driver [Generic PHY] (mii_bus:phy_addr=r8169-100:00, irq=IGNORE)
[  +0.132048] IPv6: ADDRCONF(NETDEV_UP): enp1s0: link is not ready
[  +0.000282] r8169 0000:01:00.0 enp1s0: Link is Down
[Dec 6 16:06] r8169 0000:01:00.0 enp1s0: Link is Up - 1Gbps/Full - flow control rx/tx
[  +0.000024] IPv6: ADDRCONF(NETDEV_CHANGE): enp1s0: link becomes ready
[Dec 6 16:10] random: crng init done
[  +0.000012] random: 7 urandom warning(s) missed due to ratelimiting
[Dec 6 17:17] sof-audio sof-audio: ipc rx: 0x90020000: GLB_TRACE_MSG
[  +0.000086] sof-audio sof-audio: ipc rx done: 0x90020000: GLB_TRACE_MSG
[  +0.137880] sof-audio sof-audio: ipc rx: 0x90020000: GLB_TRACE_MSG
plbossart commented 5 years ago

On 12/6/18 2:46 PM, Liam Girdwood wrote:

Logger output is delayed by over an hour (~1hr 11mins) on BYT/CHT platforms. It looks like either first timer work is delayed or notifications are blocked for > 1hr

|[ +0.000019] sof-audio sof-audio: ipc tx: 0x30200000: GLB_TPLG_MSG: BUFFER_NEW [ +0.000237] sof-audio sof-audio: ipc tx succeeded: 0x30200000: GLB_TPLG_MSG: BUFFER_NEW [ +0.000024] sof-audio sof-audio: tplg: ready widget id 21 pipe 3 type 31 name : BUF3.2 stream none [ +0.000004] sof-audio sof-audio: buffer BUF3.2: size 4608 caps 0x41 [ +0.000019] sof-audio sof-audio: ipc tx: 0x30200000: GLB_TPLG_MSG: BUFFER_NEW [ +0.000130] sof-audio sof-audio: error: ipc error for 0x30200000 size 0x14| |something's already wrong here.| |[ +0.000024] sof-audio sof-audio: buffer BUF3.2 load failed [ +0.000008] sof-audio sof-audio: error: DSP failed to add widget id 0 type 31 name : BUF3.2 stream none reply -12 [ +0.000013] sof-audio sof-audio: ASoC: failed to load widget BUF3.2 [ +0.000008] sof-audio sof-audio: error: tplg component load failed -12 [ +0.000025] sof-audio sof-audio: error: failed to load DSP topology -22 [ +0.000008] sof-audio sof-audio: ASoC: failed to probe component -22 [ +0.000028] sof-nocodec sof-nocodec: ASoC: failed to instantiate card -22 [ +0.000071] sof-nocodec: probe of sof-nocodec failed with error -22 [ +0.000014] sof-audio sof-audio: created machine sof-nocodec| |and here, an error isn't propagated. At this point do you even care about traces?| |[ +0.004810] SSE version of gcm_enc/dec engaged. [ +0.128899] iTCO_vendor_support: vendor-support=0 [ +0.020015] iTCO_wdt: Intel TCO WatchDog Timer Driver v1.11 [ +0.000201] iTCO_wdt: Found a Braswell SoC TCO device (Version=3, TCOBASE=0x0460) [ +0.000330] iTCO_wdt: initialized. heartbeat=30 sec (nowayout=0) [ +0.192247] intel_rapl: Found RAPL domain package [ +0.000005] intel_rapl: Found RAPL domain core [ +0.153358] Adding 1478652k swap on /dev/mmcblk0p3. Priority:-2 extents:1 across:1478652k SSFS [ +0.370291] audit: type=1400 audit(1544112357.823:2): apparmor="STATUS" operation="profile_load" profile="unconfined" name="content-hub-clipboard" pid=501 c omm="apparmor_parser" [ +0.001189] audit: type=1400 audit(1544112357.823:3): apparmor="STATUS" operation="profile_load" profile="unconfined" name="content-hub-peer-picker" pid=502 comm="apparmor_parser" [ +0.011706] audit: type=1400 audit(1544112357.835:4): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/sbin/dhclient" pid=504 comm="ap parmor_parser" [ +0.000010] audit: type=1400 audit(1544112357.835:5): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/lib/NetworkManager/nm-dhcp- client.action" pid=504 comm="apparmor_parser" [ +0.000006] audit: type=1400 audit(1544112357.835:6): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/lib/NetworkManager/nm-dhcp-helper" pid=504 comm="apparmor_parser" [ +0.000004] audit: type=1400 audit(1544112357.835:7): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/lib/connman/scripts/dhclient-script" pid=504 comm="apparmor_parser" [ +0.007604] audit: type=1400 audit(1544112357.843:8): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/lib/lightdm/lightdm-guest-session" pid=503 comm="apparmor_parser" [ +0.000011] audit: type=1400 audit(1544112357.843:9): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/lib/lightdm/lightdm-guest-session//chromium" pid=503 comm="apparmor_parser" [ +0.005494] audit: type=1400 audit(1544112357.847:10): apparmor="STATUS" operation="profile_load" profile="unconfined" name="url-dispatcher-bad-url-helper" pid=505 comm="apparmor_parser" [ +0.003787] audit: type=1400 audit(1544112357.851:11): apparmor="STATUS" operation="profile_load" profile="unconfined" name="ubuntu-printing-app" pid=509 comm="apparmor_parser" [ +1.392475] IPv6: ADDRCONF(NETDEV_UP): enp1s0: link is not ready [ +0.003585] Generic PHY r8169-100:00: attached PHY driver [Generic PHY] (mii_bus:phy_addr=r8169-100:00, irq=IGNORE) [ +0.132048] IPv6: ADDRCONF(NETDEV_UP): enp1s0: link is not ready [ +0.000282] r8169 0000:01:00.0 enp1s0: Link is Down [Dec 6 16:06] r8169 0000:01:00.0 enp1s0: Link is Up - 1Gbps/Full - flow control rx/tx [ +0.000024] IPv6: ADDRCONF(NETDEV_CHANGE): enp1s0: link becomes ready [Dec 6 16:10] random: crng init done [ +0.000012] random: 7 urandom warning(s) missed due to ratelimiting [Dec 6 17:17] sof-audio sof-audio: ipc rx: 0x90020000: GLB_TRACE_MSG [ +0.000086] sof-audio sof-audio: ipc rx done: 0x90020000: GLB_TRACE_MSG [ +0.137880] sof-audio sof-audio: ipc rx: 0x90020000: GLB_TRACE_MSG |

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/thesofproject/sof/issues/683, or mute the thread https://github.com/notifications/unsubscribe-auth/ABGIa6TlQa1ApNxToldayTqCz-QhiJ1Lks5u2YISgaJpZM4ZHSOw.

plbossart commented 5 years ago

Looking at the kernel, this should be trapped here:

    /* register machine driver, pass machine info as pdata */
    plat_data->pdev_mach =
        platform_device_register_data(sdev->dev, drv_name,
                          -1, mach, size);

    if (IS_ERR(plat_data->pdev_mach)) {
        ret = PTR_ERR(plat_data->pdev_mach);
        goto comp_err;
    }

    dev_dbg(sdev->dev, "created machine %s\n",
        dev_name(&plat_data->pdev_mach->dev));

Looks like we have the wrong test or something happens. You should not even go past that point

xiulipan commented 5 years ago

@lgirdwood You mean that you leave a BYT boot and let it there, then after 1 hr you see the dma-trace IPC comes?

I know in some situation that this may happen: https://github.com/thesofproject/sof/issues/286 Our work timer will fail to trigger the wok if the work is rescheduled too close that when it actually handled to be some time in the past. Thus it will wait to be run next round as the timer count overflow.

mengdonglin commented 5 years ago

@xiulipan @keyonjie @plbossart @lgirdwood Wait Xiuli and Keyon's fixing for #370 on both firmware scheduler and driver, and then revisit this issue?

lgirdwood commented 5 years ago

@mengdonglin yes, needs fixed.

lgirdwood commented 5 years ago

@mengdonglin seems like it's been fixed.