OE4T / meta-tegra

BSP layer for NVIDIA Jetson platforms, based on L4T
MIT License

Issues with Xavier switch_root #449

Closed. danielwangksu closed this issue 3 years ago

danielwangksu commented 4 years ago

I have a question about switching from the bundled initramfs to the real rootfs.img. I use a slightly different version of meta-tegra/recipes-core/initrdscripts/tegra-minimal-init/init-boot.sh, which I modified to fit my scenario.

Instead of mounting the eMMC and switching to that eMMC partition, my script creates a new RAM-backed filesystem (a tmpfs), unpacks a cpio-formatted rootfs.img into it, and then switches into it. This works fine with the Xavier dev kit, but when I add additional init scripts the kernel panics and reboots (the log is attached at the end).

Here is my init-boot.sh (the part I changed):

echo "Mounting ${rootdev}..."
mkdir /mnt/rootfs
mount -t ext4 ${rootdev} /mnt/rootfs
mkdir /mnt/switch-root
mount -o size=2G -t tmpfs none /mnt/switch-root
cp /mnt/rootfs/boot/rootfs.img /mnt/switch-root/rootfs.img.gz
cd /mnt/switch-root
gzip -d rootfs.img.gz
cpio -i < rootfs.img
rm rootfs.img

echo "Switching to rootfs on /mnt/rootfs/boot/rootfs.img..."
mount --move /sys  /mnt/switch-root/sys
mount --move /proc /mnt/switch-root/proc
mount --move /dev  /mnt/switch-root/dev
exec switch_root /mnt/switch-root /sbin/init

Did I do something wrong? Is there any potential issue that I have not considered? Thank you very much!
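One point worth checking in a script like this, although the thread does not identify it as the cause of the panic: the tmpfs has to hold the decompressed rootfs.img and the files extracted from it at the same time, until `rm rootfs.img` runs, so peak usage is roughly twice the image size against the 2G limit. The following is a minimal sketch of a size check and a streaming extraction that avoids the intermediate file. The helper names `archive_fits` and `extract_rootfs` are hypothetical and not part of meta-tegra, and the sketch assumes the `gzip` and `cpio` in the initramfs support `-t`, `-dc`, and `-idm`, as the GNU tools do.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not part of meta-tegra.

# Succeed only if the gzip archive is valid and its uncompressed size
# is at most limit_bytes. The data is streamed, nothing is written out.
archive_fits() {
    archive=$1
    limit_bytes=$2
    # Reject missing or corrupt archives before measuring them.
    gzip -t "$archive" 2>/dev/null || return 1
    # Arithmetic expansion strips any padding that wc may print.
    actual_bytes=$(( $(gzip -dc "$archive" | wc -c) ))
    [ "$actual_bytes" -le "$limit_bytes" ]
}

# Unpack a gzip-compressed cpio archive into target without storing the
# uncompressed image first.
# cpio flags: -i extract, -d create directories, -m keep modification times.
extract_rootfs() {
    archive=$1
    target=$2
    mkdir -p "$target" || return 1
    gzip -dc "$archive" | (cd "$target" && cpio -idm)
}
```

In the script above, this would be used as `archive_fits /mnt/rootfs/boot/rootfs.img $((2 * 1024 * 1024 * 1024)) && extract_rootfs /mnt/rootfs/boot/rootfs.img /mnt/switch-root`, in place of the `cp`, `gzip -d`, `cpio -i`, and `rm` steps. The check compares archive size only, so it is a lower bound on what the extracted tree occupies in the tmpfs.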

Dump for CPU0:
pid: 3587  comm: udevd
  x0 0000000000000040   x1 000000006f3e6f38
  x2 0000000000000006   x3 0000000000006f38
  x4 ffffffbf0fa417d0   x5 ffffff8008245f00
  x6 ffffffc3b69d3e20   x7 afb504000afb5041
  x8 0000000000000000   x9 ffffffc31ca53780
 x10 0000000000000a20  x11 0000000000001c18
 x12 000000000000a167  x13 00000000008b712a
 x14 0000000000000001  x15 ffffffc3ececec28
 x16 0000000000000003  x17 0000000000000002
 x18 0000000000000f43  x19 ffffff800aa57ac0
 x20 ffffff800aa560c0  x21 ffffffc3ffdafe20
 x22 0000000000000000  x23 ffffff80081d6b98
 x24 0000000000000000  x25 ffffff800aa57ac0
 x26 0000000000001a00  x27 ffffffbf0daf2780
 x28 ffffffc3bbcc7000  x29 ffffffc31ca536f0
 x30 ffffff80081d7ed4   sp ffffffc31ca536f0
  pc ffffff8008f54374 cpsr 004000c5 (EL1h)

_raw_spin_lock_irqsave+0x54/0x70:
  pc ffffff8008f54374   sp ffffffc31ca536f0   fp ffffffc31ca536f0
pagevec_lru_move_fn+0x8c/0x120:
  pc ffffff80081d7ed4   sp ffffffc31ca53700   fp ffffffc31ca53710
lru_add_drain_cpu+0x114/0x120:
  pc ffffff80081d8adc   sp ffffffc31ca53720   fp ffffffc31ca53770
lru_add_drain+0x38/0x68:
  pc ffffff80081d8db0   sp ffffffc31ca53780   fp ffffffc31ca537a0
exit_mmap+0x6c/0x118:
  pc ffffff8008209ffc   sp ffffffc31ca537b0   fp ffffffc31ca537c0
mmput+0x60/0x130:
  pc ffffff80080b0528   sp ffffffc31ca537d0   fp ffffffc31ca53880
do_exit+0x26c/0xa08:
  pc ffffff80080b95fc   sp ffffffc31ca53890   fp ffffffc31ca538a0
bug_handler.part.2+0x0/0x88:
  pc ffffff800808c528   sp ffffffc31ca538b0   fp ffffffc31ca53910
bad_mode+0x88/0x98:
  pc ffffff800808cc20   sp ffffffc31ca53920   fp ffffffc31ca53950
handle_serr+0x124/0x128:
  pc ffffff800808cdf4   sp ffffffc31ca53960   fp ffffffc31ca53980
el1_serr+0xb8/0x148:
  pc ffffff8008082d58   sp ffffffc31ca53990   fp ffffffc31ca53b10
pagevec_lru_move_fn+0x8c/0x120:
  pc ffffff80081d7ed4   sp ffffffc31ca53b20   fp ffffffc31ca53b30
__lru_cache_add+0xb0/0x110:
  pc ffffff80081d80a8   sp ffffffc31ca53b40   fp ffffffc31ca53b90
lru_cache_add+0x58/0xa8:
  pc ffffff80081d86d0   sp ffffffc31ca53ba0   fp ffffffc31ca53bc0
lru_cache_add_active_or_unevictable+0x70/0x158:
  pc ffffff80081d88e0   sp ffffffc31ca53bd0   fp ffffffc31ca53be0
wp_page_copy+0x334/0x870:
  pc ffffff80081fe74c   sp ffffffc31ca53bf0   fp ffffffc31ca53c00
do_wp_page+0xc4/0x5d0:
  pc ffffff8008200d2c   sp ffffffc31ca53c10   fp ffffffc31ca53c80
handle_mm_fault+0x69c/0xa68:
  pc ffffff80082043a4   sp ffffffc31ca53c90   fp ffffffc31ca53cf0
do_page_fault+0x308/0x518:
  pc ffffff80080a3698   sp ffffffc31ca53d00   fp ffffffc31ca53d80
do_mem_abort+0x54/0xb0:
  pc ffffff8008080954   sp ffffffc31ca53d90   fp ffffffc31ca53de0
do_el0_ia_bp_hardening+0x84/0x98:
  pc ffffff8008080a7c   sp ffffffc31ca53df0   fp ffffffc31ca53e90
el0_da+0x20/0x24:
  pc ffffff80080833c8   sp ffffffc31ca53ea0   fp 0000000000000000
debug>   x0 0000000000000040   x1 000000006f3e6f38
  x2 0000000000000006   x3 0000000000006f38
  x4 ffffffbf0fa417d0   x5 ffffff8008245f00
  x6 ffffffc3b69d3e20   x7 afb504000afb5041
  x8 0000000000000000   x9 ffffffc31ca53780
 x10 0000000000000a20  x11 0000000000001c18
 x12 000000000000a167  x13 00000000008b712a
 x14 0000000000000001  x15 ffffffc3ececec28
 x16 0000000000000003  x17 0000000000000002
 x18 0000000000000f43  x19 ffffff800aa57ac0
 x20 ffffff800aa560c0  x21 ffffffc3ffdafe20
 x22 0000000000000000  x23 ffffff80081d6b98
 x24 0000000000000000  x25 ffffff800aa57ac0
 x26 0000000000001a00  x27 ffffffbf0daf2780
 x28 ffffffc3bbcc7000  x29 ffffffc31ca536f0
 x30 ffffff80081d7ed4   sp ffffffc31ca536f0
  pc ffffff8008f54374 cpsr 004000c5 (EL1h)
 sp_el0   ffffffc3bbcc7000
 elr_el1  ffffff800811fd08
 spsr_el1 60c00045
debug> Dump for CPU3:
pid: 3585  comm: udevd
  x0 0000000000000040   x1 000000006f3c6f38
  x2 0000000000000004   x3 0000000000006f38
  x4 000000001418d000   x5 0000000000000001
  x6 0000000000000010   x7 ffffffc3b5c40ea0
  x8 0000000000014163   x9 0000000000000000
 x10 0000000000000000  x11 0000000000000000
 x12 0000000000000000  x13 0000000000000000
 x14 ffffffc3e9a60810  x15 000000000000000a
 x16 ffffff800820ad78  x17 0000007f90072538
 x18 0000007fecc768bf  x19 ffffff800aa57ac0
 x20 ffffff800aa560c0  x21 ffffffc3ffdf4e20
 x22 0000000000000000  x23 ffffff80081d6b98
 x24 0000000000000000  x25 ffffff800aa57ac0
 x26 0000000000001a00  x27 ffffffbf0dae1340
 x28 ffffffc3ec3ef000  x29 ffffffc2f9813c70
 x30 ffffff80081d7ed4   sp ffffffc2f9813c70
  pc ffffff8008f54374 cpsr 004000c5 (EL1h)

_raw_spin_lock_irqsave+0x54/0x70:
  pc ffffff8008f54374   sp ffffffc2f9813c70   fp ffffffc2f9813c70
pagevec_lru_move_fn+0x8c/0x120:
  pc ffffff80081d7ed4   sp ffffffc2f9813c80   fp ffffffc2f9813c90
lru_add_drain_cpu+0x114/0x120:
  pc ffffff80081d8adc   sp ffffffc2f9813ca0   fp ffffffc2f9813cf0
lru_add_drain+0x38/0x68:
  pc ffffff80081d8db0   sp ffffffc2f9813d00   fp ffffffc2f9813d20
unmap_region+0x44/0xf0:
  pc ffffff80082077d4   sp ffffffc2f9813d30   fp ffffffc2f9813d40
do_munmap+0x22c/0x3a8:
  pc ffffff8008208e4c   sp ffffffc2f9813d50   fp ffffffc2f9813e20
SyS_brk+0x124/0x178:
  pc ffffff800820ae9c   sp ffffffc2f9813e30   fp ffffffc2f9813e80
el0_svc_naked+0x34/0x38:
  pc ffffff80080838c0   sp ffffffc2f9813e90   fp 0000000000000000
debug>   x0 0000000000000040   x1 000000006f3c6f38
  x2 0000000000000004   x3 0000000000006f38
  x4 000000001418d000   x5 0000000000000001
  x6 0000000000000010   x7 ffffffc3b5c40ea0
  x8 0000000000014163   x9 0000000000000000
 x10 0000000000000000  x11 0000000000000000
 x12 0000000000000000  x13 0000000000000000
 x14 ffffffc3e9a60810  x15 000000000000000a
 x16 ffffff800820ad78  x17 0000007f90072538
 x18 0000007fecc768bf  x19 ffffff800aa57ac0
 x20 ffffff800aa560c0  x21 ffffffc3ffdf4e20
 x22 0000000000000000  x23 ffffff80081d6b98
 x24 0000000000000000  x25 ffffff800aa57ac0
 x26 0000000000001a00  x27 ffffffbf0dae1340
 x28 ffffffc3ec3ef000  x29 ffffffc2f9813c70
 x30 ffffff80081d7ed4   sp ffffffc2f9813c70
  pc ffffff8008f54374 cpsr 004000c5 (EL1h)
 sp_el0   ffffffc3ec3ef000
 elr_el1  0000007f900bfd7c
 spsr_el1 80000000
debug> Dump for CPU1:
pid: 3589  comm: sh
  x0 0000000000000040   x1 000000006f3b6f38
  x2 0000000000000003   x3 0000000000006f38
  x4 0000000000000039   x5 ffffffc3e9999dc0
  x6 ffffffc3ee2aa000   x7 0000000000000000
  x8 0000000000000010   x9 000000046474e552
 x10 000000000001bc18  x11 000000000002bc18
 x12 000000000002bc18  x13 00000000000003e8
 x14 00000000000003e8  x15 0000000004748945
 x16 ffffff80082657f8  x17 0000007fb0e6e850
 x18 0000000000000000  x19 ffffff800aa57ac0
 x20 ffffff800aa560c0  x21 ffffffc3ffdc6e20
 x22 0000000000000000  x23 ffffff80081d6b98
 x24 0000000000000000  x25 ffffff800aa57ac0
 x26 0000000000001a00  x27 ffffffbf0dae8a40
 x28 ffffffc3e9a790b0  x29 ffffffc31ca5fae0
 x30 ffffff80081d7ed4   sp ffffffc31ca5fae0
  pc ffffff8008f54374 cpsr 004000c5 (EL1h)

_raw_spin_lock_irqsave+0x54/0x70:
  pc ffffff8008f54374   sp ffffffc31ca5fae0   fp ffffffc31ca5fae0
pagevec_lru_move_fn+0x8c/0x120:
  pc ffffff80081d7ed4   sp ffffffc31ca5faf0   fp ffffffc31ca5fb00
lru_add_drain_cpu+0x114/0x120:
  pc ffffff80081d8adc   sp ffffffc31ca5fb10   fp ffffffc31ca5fb60
lru_add_drain+0x38/0x68:
  pc ffffff80081d8db0   sp ffffffc31ca5fb70   fp ffffffc31ca5fb90
exit_mmap+0x6c/0x118:
  pc ffffff8008209ffc   sp ffffffc31ca5fba0   fp ffffffc31ca5fbb0
mmput+0x60/0x130:
  pc ffffff80080b0528   sp ffffffc31ca5fbc0   fp ffffffc31ca5fc70
flush_old_exec+0x474/0x708:
  pc ffffff8008264494   sp ffffffc31ca5fc80   fp ffffffc31ca5fc90
load_elf_binary+0x27c/0xc98:
  pc ffffff80082cb5ac   sp ffffffc31ca5fca0   fp ffffffc31ca5fd00
search_binary_handler+0x98/0x288:
  pc ffffff8008264930   sp ffffffc31ca5fd10   fp ffffffc31ca5fdc0
do_execveat_common.isra.15+0x540/0x6a0:
  pc ffffff8008265430   sp ffffffc31ca5fdd0   fp ffffffc31ca5fe10
SyS_execve+0x4c/0x60:
  pc ffffff8008265844   sp ffffffc31ca5fe20   fp ffffffc31ca5fe90
el0_svc_naked+0x34/0x38:
  pc ffffff80080838c0   sp ffffffc31ca5fea0   fp 0000000000000000
debug>   x0 0000000000000040   x1 000000006f3b6f38
  x2 0000000000000003   x3 0000000000006f38
  x4 0000000000000039   x5 ffffffc3e9999dc0
  x6 ffffffc3ee2aa000   x7 0000000000000000
  x8 0000000000000010   x9 000000046474e552
 x10 000000000001bc18  x11 000000000002bc18
 x12 000000000002bc18  x13 00000000000003e8
 x14 00000000000003e8  x15 0000000004748945
 x16 ffffff80082657f8  x17 0000007fb0e6e850
 x18 0000000000000000  x19 ffffff800aa57ac0
 x20 ffffff800aa560c0  x21 ffffffc3ffdc6e20
 x22 0000000000000000  x23 ffffff80081d6b98
 x24 0000000000000000  x25 ffffff800aa57ac0
 x26 0000000000001a00  x27 ffffffbf0dae8a40
 x28 ffffffc3e9a790b0  x29 ffffffc31ca5fae0
 x30 ffffff80081d7ed4   sp ffffffc31ca5fae0
  pc ffffff8008f54374 cpsr 004000c5 (EL1h)
 sp_el0   ffffffc3c8f2c600
 elr_el1  ffffff8008f54374
 spsr_el1 004000c5
debug> Dump for CPU2:
pid: 3588  comm: sh
  x0 0000000000000040   x1 000000006f3d6f38
  x2 0000000000000005   x3 0000000000006f38
  x4 0000000000000039   x5 ffffffc3e999bfc0
  x6 ffffffc3ee2aa000   x7 0000000000000000
  x8 0000000000000010   x9 000000046474e552
 x10 000000000001bc18  x11 000000000002bc18
 x12 000000000002bc18  x13 00000000000003e8
 x14 00000000000003e8  x15 ffffffffffffffff
 x16 30636d6d2f74736f  x17 30303a30636d6d2f
 x18 0000000000000001  x19 ffffff800aa57ac0
 x20 ffffff800aa560c0  x21 ffffffc3ffddde20
 x22 0000000000000000  x23 ffffff80081d6b98
 x24 0000000000000000  x25 ffffff800aa57ac0
 x26 0000000000001a00  x27 ffffffbf0db0fbc0
 x28 ffffffc3b74b9790  x29 ffffffc31ca5bae0
 x30 ffffff80081d7ed4   sp ffffffc31ca5bae0
  pc ffffff8008f54374 cpsr 004000c5 (EL1h)

_raw_spin_lock_irqsave+0x54/0x70:
  pc ffffff8008f54374   sp ffffffc31ca5bae0   fp ffffffc31ca5bae0
pagevec_lru_move_fn+0x8c/0x120:
  pc ffffff80081d7ed4   sp ffffffc31ca5baf0   fp ffffffc31ca5bb00
lru_add_drain_cpu+0x114/0x120:
  pc ffffff80081d8adc   sp ffffffc31ca5bb10   fp ffffffc31ca5bb60
lru_add_drain+0x38/0x68:
  pc ffffff80081d8db0   sp ffffffc31ca5bb70   fp ffffffc31ca5bb90
exit_mmap+0x6c/0x118:
  pc ffffff8008209ffc   sp ffffffc31ca5bba0   fp ffffffc31ca5bbb0
mmput+0x60/0x130:
  pc ffffff80080b0528   sp ffffffc31ca5bbc0   fp ffffffc31ca5bc70
flush_old_exec+0x474/0x708:
  pc ffffff8008264494   sp ffffffc31ca5bc80   fp ffffffc31ca5bc90
load_elf_binary+0x27c/0xc98:
  pc ffffff80082cb5ac   sp ffffffc31ca5bca0   fp ffffffc31ca5bd00
search_binary_handler+0x98/0x288:
  pc ffffff8008264930   sp ffffffc31ca5bd10   fp ffffffc31ca5bdc0
do_execveat_common.isra.15+0x540/0x6a0:
  pc ffffff8008265430   sp ffffffc31ca5bdd0   fp ffffffc31ca5be10
SyS_execve+0x4c/0x60:
  pc ffffff8008265844   sp ffffffc31ca5be20   fp ffffffc31ca5be90
el0_svc_naked+0x34/0x38:
  pc ffffff80080838c0   sp ffffffc31ca5bea0   fp 0000000000000000
debug>   x0 0000000000000040   x1 000000006f3d6f38
  x2 0000000000000005   x3 0000000000006f38
  x4 0000000000000039   x5 ffffffc3e999bfc0
  x6 ffffffc3ee2aa000   x7 0000000000000000
  x8 0000000000000010   x9 000000046474e552
 x10 000000000001bc18  x11 000000000002bc18
 x12 000000000002bc18  x13 00000000000003e8
 x14 00000000000003e8  x15 ffffffffffffffff
 x16 30636d6d2f74736f  x17 30303a30636d6d2f
 x18 0000000000000001  x19 ffffff800aa57ac0
 x20 ffffff800aa560c0  x21 ffffffc3ffddde20
 x22 0000000000000000  x23 ffffff80081d6b98
 x24 0000000000000000  x25 ffffff800aa57ac0
 x26 0000000000001a00  x27 ffffffbf0db0fbc0
 x28 ffffffc3b74b9790  x29 ffffffc31ca5bae0
 x30 ffffff80081d7ed4   sp ffffffc31ca5bae0
  pc ffffff8008f54374 cpsr 004000c5 (EL1h)
 sp_el0   ffffffc3c8f28e00
 elr_el1  ffffff8008f54838
 spsr_el1 00400045
debug>
madisongh commented 4 years ago

From the stack trace on CPU0, looks like the processor is reporting a data abort exception. Maybe a misbehaving driver?

danielwangksu commented 4 years ago

> From the stack trace on CPU0, looks like the processor is reporting a data abort exception. Maybe a misbehaving driver?

Thank you! It could be. I will try to debug it.

ichergui commented 3 years ago

Hey @danielwangksu

Could you please give us an update on your issue? Did you manage to fix it?

danielwangksu commented 3 years ago

Hi @ichergui, yes, I fixed this issue. The problem was in the kernel variables and drivers in my specific setup. Thank you.

ichergui commented 3 years ago

> Hi @ichergui yes I fixed this issue. The problem was in kernel variables and drivers in my specific setup. Thank you.

Perfect