foss-for-synopsys-dwc-arc-processors / linux

Helpful resources for users & developers of Linux kernel for ARC

[arc64] preemptible kernel (with MMUv6) hits machine check with stress testing #43

Closed vineetgarc closed 3 years ago

vineetgarc commented 3 years ago

Stress testing the MMUv6 support (repeated fork/execve/exit) triggers a machine check. This only happens with a CONFIG_PREEMPT kernel.

Linux version 5.6.0-00171-ge54fdaaa70b9 (vineetg@vineetg-Latitude-7400) (gcc version 10.1.1 20200701 (GCC)) #76 Mon Jan 4 11:34:42 PST 2021
Memory @ 80000000 [512M] 
Memory @ 100000000 [1024M] Not used
OF: fdt: Machine model: snps,zebu_hs
earlycon: uart8250 at MMIO32 0x00000000f0000000 (options '115200n8')
printk: bootconsole [uart8250] enabled

IDENTITY    : ARCVER [0x0] ARCNUM [0x0] CHIPID [0xffff]
processor [0]   : HS58 (ARC32 ARCv3 ISA) 
ISA Extn    : mpy[opt 2] 
MMU [v6]    : MMU48 hwalk 4 levels, 4k PAGE, JTLB 256 uD/I 8/4
I-Cache     : 64K, 4way/set, 64B Line, VIPT aliasing
D-Cache     : 64K, 2way/set, 64B Line, PIPT
Timers      : Timer0 Timer1 
Vector Table    : 0x80000000
Extn [CCM]  : DCCM @ 810782b8, 0 KB / ICCM: @ 20, 0 KB
archs-intc  : 16 priority levels (default 1) FIRQ (not used)
On node 0 totalpages: 131072
...
...
CONFIG_INITRAMFS_SOURCE="~/arc/RAMFS/arc64/ramfs-arc64-201210-gcc-binutils-amo-support-not-used"
***********************************************************************
            Welcome to ARCLinux
***********************************************************************
[ARCLinux]# while true; do count=$((count+1)); echo $count;  ls;  done

237
bin          etc          lib          sbin         tmp          version.txt
debug        home         proc         sd           usr
dev          init         run          sys          var
238
bin          etc          lib          sbin         tmp          version.txt
debug        home         proc         sd           usr
dev          init         run          sys          var
239
bin          etc          lib          sbin         tmp          version.txt
debug        home         proc         sd           usr
dev          init         run          sys          var

Unhandled Machine Check Exception
Path: (null)
CPU: 0 PID: 297 Comm: ls Not tainted 5.6.0-00171-ge54fdaaa70b9 #80
Machine Check (Double Fault)
ECR: 0x00030000 EFA: 0x80d837fe ERET: 0x80d837fe
STAT32: 0x00081022 [  K   AE ]   BTA: 0x80cf6d78
 SP: 0x9f27bd08  FP: 0x9f27bde0 BLK: free_pgd_range+0x3e2/0x4cc
r00: 0x9ffca0c0 r01: 0x00000000 r02: 0x9f7bb000
r03: 0x9f800000 r04: 0x9ffca0f4 r05: 0x9ffc9d48
r06: 0x00000001 r07: 0xffffffffffffffff r08: 0x9f20a738
r09: 0x9f233478 r10: 0x00000000 r11: 0x201046e4
r12: 0x00000031 r13: 0x00000000 r14: 0xffffffffffffffff
r15: 0x60000000 r16: 0x60000000 r17: 0x00200000
r18: 0x81145608 r19: 0x000003f0 r20: 0x9f283010
r21: 0x00001000 r22: 0x00000000 r23: 0xfffffffffe000000
r24: 0x9f275800 r25: 0x00001000

Stack Trace:
  free_pages+0x26/0x48
  free_pgd_range+0x3e2/0x4cc
  exit_mmap+0x76/0x184
  mmput+0x2a/0xbc
  do_exit+0x200/0x90c
  do_group_exit+0x2a/0xd0
  sys_exit_group+0x12/0x14

vineetgarc commented 3 years ago

The issue is a race condition in the exit code path:

do_exit
  exit_mm
    exit_mm_release(current->mm)
      mm_release
        deactivate_mm       <-- RTP0 set to the fallback swapper pgd
                                (since the task page tables, including the
                                kernel mapping, will be freed later)

--> IRQ taken
  preempt_schedule_irq
    context_switch
      switch_mm             <-- reprograms RTP0 to the task’s pgd (losing the fallback pgd)
      switch_to
<-- IRQ resumes in exit_mm  (the context switch seems to resume in the same task, which is a mystery)

    mmput
      __mmput
        exit_mmap(old_mm)
          arch_exit_mmap(old_mm)
          unmap_vmas
          free_pgtables
            free_pgd_range  <-- the in-use task pgd table tree is freed, incl. the kernel mapping.
                                This is not OK, but TLB entries keep things going
          tlb_finish_mmu
            tlb_flush
            tlb_flush_mm    <-- Nail in the coffin: TLB entries flushed. Kernel can’t execute anymore
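
The ordering above can be replayed in a small user-space model. This is only a sketch to make the sequence concrete, not kernel code: `struct pgd`, `struct cpu_model`, and every `model_*` function are hypothetical stand-ins named after the steps in the trace, and "the kernel can run" is reduced to "a stale TLB entry exists or RTP0 points at a live table".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* User-space model only: none of these are the kernel's real definitions. */
struct pgd {
    bool freed;                 /* set once the table tree has been freed */
};

struct cpu_model {
    struct pgd *rtp0;           /* models the RTP0 page-table root register */
    bool tlb_has_kernel;        /* stale TLB entries still map kernel text */
};

static struct pgd fallback_pgd; /* stands in for the fallback pgd, never freed */

static void model_deactivate_mm(struct cpu_model *cpu)
{
    cpu->rtp0 = &fallback_pgd;  /* exit path parks RTP0 on the fallback */
}

static void model_switch_mm(struct cpu_model *cpu, struct pgd *task_pgd)
{
    cpu->rtp0 = task_pgd;       /* preemption reprograms RTP0 to the task pgd */
}

static void model_free_pgd_range(struct pgd *task_pgd)
{
    task_pgd->freed = true;
}

static void model_tlb_flush_mm(struct cpu_model *cpu)
{
    cpu->tlb_has_kernel = false;
}

/* Simplification: kernel runs if the TLB still hits or the walk root is live. */
static bool model_kernel_can_run(const struct cpu_model *cpu)
{
    return cpu->tlb_has_kernel || !cpu->rtp0->freed;
}

/* Replays the exit path; returns whether the modelled kernel survives. */
bool model_exit_path(bool preempted_after_deactivate)
{
    struct pgd task_pgd = { .freed = false };
    struct cpu_model cpu = { .rtp0 = &task_pgd, .tlb_has_kernel = true };

    model_deactivate_mm(&cpu);
    if (preempted_after_deactivate)
        model_switch_mm(&cpu, &task_pgd);   /* fallback pgd is lost here */

    model_free_pgd_range(&task_pgd);
    assert(model_kernel_can_run(&cpu));     /* stale TLB entries keep things going */

    model_tlb_flush_mm(&cpu);
    return model_kernel_can_run(&cpu);
}
```

In this model, `model_exit_path(false)` survives because RTP0 still points at the fallback table, while `model_exit_path(true)` does not: the failure only becomes visible after the TLB flush, which matches the trace above.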

vineetgarc commented 3 years ago

To plug the race, the switch of RTP0 to the fallback pgd is now done back in arch_exit_mmap(). This time we detect whether this is the execve code path or the exit code path (mm == NULL) and only do the switch for exit.
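
A rough user-space sketch of that check follows. It is not the pushed patch: the types and the `model_*` names are hypothetical, and it assumes the "mm == NULL" test is made on the current task's mm pointer, which the generic exit path clears before calling mmput() while execve installs the new mm first.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* User-space model only: hypothetical stand-ins, not kernel definitions. */
enum root { ROOT_DANGLING, ROOT_FALLBACK, ROOT_NEW_TASK };

struct pgd { bool freed; };
struct mm { struct pgd *pgd; };
struct task { struct mm *mm; };
struct cpu_model { struct pgd *rtp0; };  /* models the RTP0 register */

static struct pgd fallback_pgd;          /* stands in for the fallback pgd */

/* Assumed shape of the hook: switch to the fallback only on exit. */
static void model_arch_exit_mmap(struct cpu_model *cpu, const struct task *cur)
{
    if (cur->mm == NULL)                 /* exit: the task has no mm anymore */
        cpu->rtp0 = &fallback_pgd;
    /* execve: the new mm is already active, leave RTP0 alone */
}

/* Tears down the old mm and reports where RTP0 points afterwards. */
enum root model_teardown(bool is_exit)
{
    struct pgd old_pgd = { false }, new_pgd = { false };
    struct mm old_mm = { &old_pgd }, new_mm = { &new_pgd };
    struct task cur = { &old_mm };
    struct cpu_model cpu = { &old_pgd };

    if (is_exit) {
        cur.mm = NULL;                   /* exit clears the task's mm first */
    } else {
        cur.mm = &new_mm;                /* execve installs the new mm ... */
        cpu.rtp0 = new_mm.pgd;           /* ... and activates its pgd */
    }

    model_arch_exit_mmap(&cpu, &cur);
    old_mm.pgd->freed = true;            /* free_pgd_range on the old tables */

    assert(cpu.rtp0 != NULL);
    if (cpu.rtp0->freed)
        return ROOT_DANGLING;
    return cpu.rtp0 == &fallback_pgd ? ROOT_FALLBACK : ROOT_NEW_TASK;
}
```

Here `model_teardown(true)` ends on the fallback table and `model_teardown(false)` stays on the new task's table. The model does not include a preemption between the hook and the free, so it says nothing about any window that remains there.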

Fix pushed: ARCv3: mm: machine check with CONFIG_PREEMPT: switch back to arch_exit...

A small race still exists; a fully robust solution will require tinkering with USER_PGTABLES_CEILING.