beagleboard / linux

The official Read Only BeagleBoard and BeagleBone kernel repository https://git.beagleboard.org/beagleboard/linux
http://beagleboard.org/source

BeagleBone DMTimer2 unexpected stop after one or more days #203

Open mgkiller7 opened 5 years ago

mgkiller7 commented 5 years ago

We are seeing DMTimer2 stop unexpectedly on the AM335x after running for one or more days; the gp_timer count in /proc/interrupts stops increasing. We have tried the BeagleBoard GitHub kernel versions 4.4.113 and 4.4.155 with our own rootfs, on both a BeagleBone Black board and our custom board. The behaviour is the same in every case, even though I have not made any changes to the kernel source.

This timer is initialized as the clockevent device in omap2_gp_clockevent_init(clkev_nr, clkev_src, clkev_prop); //arch/arm/mach-omap2/timer.c

Below is the related call chain:

omap3_gptimer_timer_init(void) =>

__omap_sync32k_timer_init(2, "timer_sys_ck", NULL, 1, "timer_sys_ck", "ti,timer-alwon", true); =>

omap2_gp_clockevent_init(clkev_nr, clkev_src, clkev_prop);

After DMTimer2 stops unexpectedly, the following happens:

1. gp_timer in /proc/interrupts NEVER increases.

2. The time reported by the date command may jump back by some seconds or minutes.

3. User applications no longer print debug logs to the console; it seems the kernel scheduler is not working correctly. However, the console shell still works fine, and network ping also works.

4. The CPU load of every thread shown by top is 0%.
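A minimal sketch (Python, not from the original report) of how symptom 1 could be checked automatically: take two snapshots of /proc/interrupts a few seconds apart and compare the gp_timer count. The helper names `irq_count` and `timer_stalled` are hypothetical, and the parsing assumes the usual layout of an `<irq>:` column, one count per CPU, then the controller and label columns.

```python
def irq_count(interrupts_text, name):
    """Sum the per-CPU counts of every /proc/interrupts line labelled `name`."""
    total = 0
    found = False
    for line in interrupts_text.splitlines():
        fields = line.split()
        # Data lines start with "<irq>:"; skip the CPU header and blank lines.
        if not fields or not fields[0].endswith(":"):
            continue
        if name not in fields[1:]:
            continue
        found = True
        for field in fields[1:]:
            if not field.isdigit():
                break  # the counts end where the controller/label columns begin
            total += int(field)
    if not found:
        raise ValueError("no interrupt line labelled %r" % name)
    return total


def timer_stalled(before_text, after_text, name="gp_timer"):
    """Return True if the named interrupt did not fire between two snapshots."""
    return irq_count(after_text, name) == irq_count(before_text, name)
```

On a target board, the two snapshots would be the contents of /proc/interrupts read a few seconds apart; on a healthy system `timer_stalled` should then return False.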

By the way, I checked after the problem occurred: the ST bit of DMTimer2's TCLR register is 1 (that is, the timer is started).

But if I stop DMTimer2 manually from the console shell with the command: devmem 0x48040038 32 0x0

then I can reproduce symptoms 1, 2 and 3 mentioned above, but the shell hangs when I type the top command.

So I think DMTimer2 on my AM335x stops working correctly after running for one or more days.
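For reference, a small sketch (Python, not part of the original report) of how a TCLR value read back with devmem could be decoded. It assumes the DMTimer2 base address 0x48040000 and TCLR offset 0x38, which match the 0x48040038 address used above, and the bit positions ST = bit 0 and AR = bit 1 as I understand them from the AM335x Technical Reference Manual; please verify these against the TRM. `decode_tclr` is a hypothetical helper name.

```python
DMTIMER2_BASE = 0x48040000  # assumption: consistent with 0x48040038 used in the post
TCLR_OFFSET = 0x38          # timer control register
TCRR_OFFSET = 0x3C          # timer counter register (assumption: per the AM335x TRM)

TCLR_ST = 1 << 0  # 1 = timer started, 0 = timer stopped
TCLR_AR = 1 << 1  # 1 = auto-reload mode, 0 = one-shot mode


def decode_tclr(value):
    """Decode the start and auto-reload bits of a raw 32-bit TCLR value."""
    return {
        "started": bool(value & TCLR_ST),
        "auto_reload": bool(value & TCLR_AR),
    }


def register_address(offset, base=DMTIMER2_BASE):
    """Physical address of a DMTimer2 register, e.g. for use with devmem."""
    return base + offset
```

Note that ST = 1 only shows that the timer is enabled, not that it is counting. Reading the counter register at `register_address(TCRR_OFFSET)` twice and comparing the values would show whether the count is actually advancing.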

We also tried commenting out __omap_dm_timer_override_errata() in omap2_gp_clockevent_init(), which forces OMAP_TIMER_ERRATA_I103_I767 to be enabled, but then the kernel does not boot at all.

We also posted this problem on the TI community forum at https://e2e.ti.com/support/processors/f/791/t/796508

pdp7 commented 4 years ago

@mgkiller7 did you find a resolution?

I see the last post is: https://e2e.ti.com/support/processors/f/791/p/796508/2978764#2978764

Recently I found out that this problem is related to the PMIC not being initialized by u-boot on my custom board. As a result, at the u-boot stage the voltages supplied to CORE and MPU by the PMIC stay at the 1.1V default. When I configure the PMIC to supply 1.120V to CORE and 1.270V to MPU in am33xx_spl_board_init of u-boot (board.c), the problem disappears.

The DMTimer2 unexpected stop can also be reproduced on the BeagleBone Black when I remove the PMIC voltage change from u-boot.
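As a rough illustration (Python, not from the thread), measured rail voltages could be compared against the values reported above as working. The reference numbers are only the ones quoted in the post, not datasheet limits, and `undervolted_rails` is a hypothetical helper.

```python
# Voltages the poster reported as resolving the problem (NOT datasheet limits).
REPORTED_GOOD_VOLTS = {"CORE": 1.120, "MPU": 1.270}


def undervolted_rails(measured_volts, reference=REPORTED_GOOD_VOLTS):
    """Return the sorted rail names whose measured voltage is below the reference.

    A rail missing from `measured_volts` is treated as 0.0 V and therefore flagged.
    """
    return sorted(
        rail
        for rail, ref in reference.items()
        if measured_volts.get(rail, 0.0) < ref
    )
```

With both rails at the 1.1V default described in the post, both CORE and MPU would be flagged; with the configured 1.120V and 1.270V, neither would be.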

pdp7 commented 4 years ago

@mgkiller7 Please re-open if still an issue.

You may also be interested in the Debian images and kernel builds that we are currently testing for the next release: https://elinux.org/Beagleboard:Latest-images-testing

wiltshiretom commented 3 years ago

Would love to know if you found a solution to this problem; we are seeing a very similar problem.

pdp7 commented 3 years ago

@wiltshiretom Please run

sudo /opt/scripts/tools/version.sh

which will show the uboot and linux versions and what device tree overlays are present.

wiltshiretom commented 3 years ago

Apologies, my post lacked some detail! I'm seeing an identical issue but we are using a custom board (not beagle) also using the AM335x part. I found this thread and was wondering if you had identified a workaround. Sorry to resurrect something that is already closed on your platform but I am looking for inspiration. Also identical symptoms here: https://e2e.ti.com/support/processors/f/processors-forum/237808/am335x-system-time-looping

mgkiller7 commented 3 years ago

Please check the PMU voltage output to the AM335x on your custom board. The issue on my board was caused by undervoltage on the CPU power supply. Hope this helps.

----- Original message -----
From: wiltshiretom
To: beagleboard/linux
Cc: mgkiller7, Mention
Subject: Re: [beagleboard/linux] BeagleBone DMTimer2 unexpected stop after one or more days (#203)
Date: 2021-04-02 04:07

Would love to know if you found a solution to this problem; we are seeing a very similar problem.
