Samsung / TizenRT

TizenRT is a lightweight RTOS-based platform to support low-end IoT devices
Apache License 2.0

tools/debug: Modify the trap tool to print exact line number #6308

Closed kishore-sn closed 2 months ago

kishore-sn commented 2 months ago

The trap tool uses the nm tool to look up symbols. However, nm does not give the exact line number for a given address; it reports the line where the function starts. So we change the trap tool to use nm only for the symbol name and get the exact line number from the addr2line tool.
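
The two-step lookup described above can be sketched roughly as follows. This is a minimal illustration, not the code from this change: the function names, the `arm-none-eabi-` tool prefix, and the ELF path passed in by the caller are all assumptions made for the example.

```python
import bisect
import subprocess


def parse_nm(nm_output):
    """Parse `nm -n --defined-only` text into a sorted list of (address, name)."""
    symbols = []
    for line in nm_output.splitlines():
        parts = line.split()
        # Expected shape: "<hex address> <type letter> <name>"
        if len(parts) != 3:
            continue
        try:
            symbols.append((int(parts[0], 16), parts[2]))
        except ValueError:
            continue
    symbols.sort()
    return symbols


def find_symbol(symbols, addr):
    """Return the name of the last symbol starting at or before addr, or None.

    Symbol sizes are not checked, so an address past the end of the last
    function is still attributed to that function.
    """
    starts = [start for start, _ in symbols]
    idx = bisect.bisect_right(starts, addr) - 1
    return symbols[idx][1] if idx >= 0 else None


def addr2line_cmd(elf_path, addr, tool="arm-none-eabi-addr2line"):
    """Build the addr2line command line for one address (tool name is assumed)."""
    return [tool, "-e", elf_path, hex(addr)]


def lookup(elf_path, addr, nm_tool="arm-none-eabi-nm"):
    """Resolve addr to (symbol name, "file:line") by running both tools."""
    nm_out = subprocess.run(
        [nm_tool, "-n", "--defined-only", elf_path],
        capture_output=True, text=True, check=True,
    ).stdout
    name = find_symbol(parse_nm(nm_out), addr)
    location = subprocess.run(
        addr2line_cmd(elf_path, addr),
        capture_output=True, text=True, check=True,
    ).stdout.strip()
    return name, location
```

Here nm supplies only the symbol name, while addr2line supplies the source location of the address itself. `lookup` needs the toolchain binaries and an ELF file built with debug information.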

Trap output before change:

Stack_address    Symbol_address  Symbol location  Symbol_name       File_name
 User stack
0x6068cbb4   0xe01dc49   kernel       sig_dispatch  /root/tizenrt/os/kernel/signal/sig_dispatch.c:435
0x6068cbc8   0xe0293f9   kernel       arm_sigdeliver    /root/tizenrt/os/arch/arm/src/armv7-a/arm_sigdeliver.c:71
0x6068cbd8   0xe180521   app1         libc_signal_main  /root/tizenrt/apps/examples/testcase/le_tc/kernel/tc_libc_signal.c:428
0x6068cc14   0xe180521   app1         libc_signal_main  /root/tizenrt/apps/examples/testcase/le_tc/kernel/tc_libc_signal.c:428
0x6068cc18   0xe191f95   app1         waitpid   /root/tizenrt/os/include/sys/wait.h:319
0x6068cd28   0xe180521   app1         libc_signal_main  /root/tizenrt/apps/examples/testcase/le_tc/kernel/tc_libc_signal.c:428
0x6068cd30   0xe180209   app1         libc_semaphore_main   /root/tizenrt/apps/examples/testcase/le_tc/kernel/tc_libc_semaphore.c:152
0x6068cd34   0xe1a9ca4   app1         utils_ttypenames  /root/tizenrt/apps/system/utils/utils_ps.c:94
0x6068cd38   0xe1aa1a0   app1         l_day
0x6068cd70   0xe170425   app1         tc_kernel_main    /root/tizenrt/apps/examples/testcase/le_tc/kernel/kernel_tc_main.c:43
0x6068cd74   0xe170419   app1         tc_get_drvfd  /root/tizenrt/apps/examples/testcase/le_tc/kernel/kernel_tc_main.c:35
0x6068cd78   0xe16ba21   app1         task_startup  /root/tizenrt/os/include/tinyara/userspace.h:145
0x6068cd80   0xe01abb9   kernel       task_start    /root/tizenrt/os/kernel/task/task_start.c:133
0x6068cd98   0xe01a2b9   kernel       group_zalloc  /root/tizenrt/os/include/tinyara/kmalloc.h:166

Trap output after change:

Stack_address    Symbol_address  Symbol_location  Symbol_name       File_name
 User stack
0x6068cbb4   0xe01dc63   kernel binary    sig_dispatc           /root/tizenrt/os/kernel/signal/sig_deliver.c:104
0x6068cbc8   0xe029429   kernel binary    arm_sigdelive         /root/tizenrt/os/arch/arm/src/armv7-a/arm_sigdeliver.c:108 (discriminator 1)
0x6068cbd8   0xe180c31   common binary    libc_signal_mai       /root/tizenrt/apps/examples/testcase/le_tc/kernel/tc_libc_signal.c:414
0x6068cc14   0xe180c31   common binary    libc_signal_mai       /root/tizenrt/apps/examples/testcase/le_tc/kernel/tc_libc_signal.c:414
0x6068cc18   0xe191fa2   common binary    waitpi                /root/tizenrt/os/syscall/proxies/PROXY_waitpid.c:12
0x6068cd28   0xe180c31   common binary    libc_signal_mai       /root/tizenrt/apps/examples/testcase/le_tc/kernel/tc_libc_signal.c:414
0x6068cd30   0xe18047b   common binary    libc_semaphore_mai    /root/tizenrt/apps/examples/testcase/le_tc/kernel/tc_libc_semaphore.c:81
Symbol not found for address: 0xe1aa01c
Symbol not found for address: 0xe1ad39a
0x6068cd70   0xe17047d   common binary    tc_kernel_mai         /root/tizenrt/apps/examples/testcase/le_tc/kernel/kernel_tc_main.c:124
0x6068cd74   0xe170425   common binary    tc_get_drvf           /root/tizenrt/apps/examples/testcase/le_tc/kernel/kernel_tc_main.c:46
0x6068cd78   0xe16ba39   common binary    task_startu           /root/tizenrt/lib/libc/sched/task_startup.c:123 (discriminator 3)
0x6068cd80   0xe01ac03   kernel binary    task_star             /root/tizenrt/os/kernel/task/task_start.c:162
0x6068cd98   0xe01a2c1   kernel binary    group_zallo           /root/tizenrt/os/kernel/group/group_zalloc.c:104

kishore-sn commented 2 months ago

@kishore-sn Are you able to add before and after of TRAP output? (Not all, the problem line is enough)

I updated the before and after output. As you can see, the output format is almost unchanged, but the line numbers now point to the exact location of the address rather than the start of the function.

sunghan-chang commented 2 months ago

Sorry, I don't get the point. The before and after outputs do not show the same content, so I can't tell what has changed. Could you point me to the relevant line in the example?

kishore-sn commented 2 months ago

In case of the current crash log, the PC points to leave_critical_section.

    [ Current location (PC) of assert ]
    - symbol addr        : 0x0e019fd0
    - function name      : leave_critical_section
    - file               : /root/tizenrt/os/kernel/irq/irq_csection.c:555

So the call stack should show exactly who called leave_critical_section. The first entry in the call stack seems to be wrong in both the before and after cases. But for the second entry, the before case shows:

    0x6068cbc8   0xe0293f9   kernel       arm_sigdeliver    /root/tizenrt/os/arch/arm/src/armv7-a/arm_sigdeliver.c:71

If we check arm_sigdeliver.c line 71, it is the start of the function arm_sigdeliver. This does not help in debugging.

 69  
 70 #ifndef CONFIG_DISABLE_SIGNALS
 71 void arm_sigdeliver(void)
 72 {
 73   struct tcb_s *rtcb = this_task();
 74   uint32_t *regs = rtcb->xcp.saved_regs;

But in the after case, we get:

    /root/tizenrt/os/arch/arm/src/armv7-a/arm_sigdeliver.c:108 (discriminator 1)

Line 108 is just after the call to leave_critical_section, so this lets us follow the call stack properly.

103 
104   do
105     {
106       leave_critical_section(regs[REG_CPSR]);
107     }
108   while (rtcb->irqcount > 0);
109 #endif /* CONFIG_SMP */
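
The addr2line result above carries a `(discriminator 1)` suffix, which DWARF uses to tell apart several code blocks that share one source line. A script that consumes this output may want the path, line number, and discriminator as separate fields. The following is a small sketch under that assumption; `split_location` is a hypothetical helper, not part of the trap tool.

```python
import re

# "<path>:<line>" optionally followed by " (discriminator N)".
# addr2line prints "??:0" or "??:?" when it cannot resolve an address.
_LOCATION = re.compile(
    r"^(?P<path>.+):(?P<line>\d+|\?)(?: \(discriminator (?P<disc>\d+)\))?$"
)


def split_location(text):
    """Split one addr2line output line into (path, line, discriminator).

    Unknown parts are returned as None; unparseable text returns None.
    """
    m = _LOCATION.match(text.strip())
    if m is None:
        return None
    path = None if m.group("path") == "??" else m.group("path")
    line = int(m.group("line")) if m.group("line") not in ("?", "0") else None
    disc = int(m.group("disc")) if m.group("disc") else None
    return path, line, disc
```

For the entry discussed here, this would yield the arm_sigdeliver.c path, line 108, and discriminator 1.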

This issue is much more evident in other crash logs, where the actual cause of crash might not be the current PC, but some previous function in the call stack.

Another thing to notice is the waitpid entry in the before and after cases:

    0x6068cc18   0xe191f95   app1             waitpid   /root/tizenrt/os/include/sys/wait.h:319
    0x6068cc18   0xe191fa2   common binary    waitpi    /root/tizenrt/os/syscall/proxies/PROXY_waitpid.c:12

The before case prints a header file, which is meaningless for debugging. We need the actual source file of the API, which the after case shows.
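
To make such differences easier to spot across the whole call stack, the two trap outputs can be matched by stack address and compared on the reported source location. The sketch below assumes the row layout shown in this thread; `parse_rows` and `changed_sources` are hypothetical names for illustration.

```python
import re

# stack address, symbol address, symbol location (one or more words),
# symbol name, then "/path:line". Rows without a source path are skipped.
ROW = re.compile(
    r"^(0x[0-9a-fA-F]+)\s+(0x[0-9a-fA-F]+)\s+(.+?)\s+(\S+)\s+(/\S+:\d+)"
)


def parse_rows(text):
    """Map stack address -> (symbol address, symbol name, source location)."""
    rows = {}
    for line in text.splitlines():
        m = ROW.match(line.strip())
        if m:
            stack, addr, _location, name, source = m.groups()
            rows[stack] = (addr, name, source)
    return rows


def changed_sources(before_text, after_text):
    """List (stack address, before source, after source) where the source differs."""
    before, after = parse_rows(before_text), parse_rows(after_text)
    return [
        (stack, before[stack][2], after[stack][2])
        for stack in before
        if stack in after and before[stack][2] != after[stack][2]
    ]
```

Passing the full before and after dumps to `changed_sources` lists every stack slot whose file or line number moved.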

sunghan-chang commented 2 months ago

@kishore-sn Can we add this in the commit description?

kishore-sn commented 2 months ago

Done.