CTSRD-CHERI / cheribsd

FreeBSD adapted for CHERI-RISC-V and Arm Morello.
http://cheribsd.org

Kernel panic when mounting a directory using nullfs (panic: Capability abort from kernel space: bounds violation) #1539

Closed by kwitaszczyk 1 year ago

kwitaszczyk commented 1 year ago

When starting a Poudriere jail, GENERIC-MORELLO-PURECAP panics while mounting a directory using nullfs.

With DRM disabled:

login:   x0: 0xffff000000f78af0 [rwRW,0xffff000000f78af0-0xffff000000f78af8] (vop_open_vp_offsets + 0)
  x1: 0xffff00011161ea30 [rwxRW,0xffff00011161ea30-0xffff00011161ea50]
  x2: 0xffff000000f78b00 [rwRW,0xffff000000f78b00-0xffff000000f78b50] (vop_open_desc + 0)
  x3: 0xffff000000f795f0 [rwRW,0xffff000000f795f0-0xffff000000f79640] (vop_lock1_desc + 0)
  x4: 0xffff000000f79650 [rwRW,0xffff000000f79650-0xffff000000f796a0] (vop_unlock_desc + 0)
  x5: 0xffffa000007a4780 [rwxRW,0xffffa000007a4780-0xffffa000007a4900]
  x6: 0xffff000000e974c0 [rwRW,0xffff000000e974c0-0xffff000000e97530] (lock_class_mtx_sleep + 0)
  x7: 0xffff00000142d694 [rwRW,0xffff00000142d640-0xffff00000142d730] (w_locklistdata + 81d94)
  x8: 0x0000000000000010
  x9: 0x0000000000000001
 x10: 0x0000000000000002
 x11: 0x0000000000004100
 x12: 0x0000000000000004
 x13: 0x0000000001000004
 x14: 0x000000000000283d
 x15: 0x0000000000002af8
 x16: 0xffff00019e77d210 [rxR,0xffff00019e756000-0xffff00019e77e000] (__stop_set_sysctl_set + 520)
 x17: 0xffff00000061af1d [rxR,0x0000000000000000-0xffffffffffffffff] (sentry) (vdrop + 0)
 x18: 0xffff00011161e7d0 [rwxRW,0xffff00011161b000-0xffff000111620000]
 x19: 0xffff00011161ec50 [rwxRW,0xffff00011161ec50-0xffff00011161ec60]
 x20: 0x0000000000000000
 x21: 0xffff00011161ec60 [rwxRW,0xffff00011161ec50-0xffff00011161ec60]
 x22: 0x0000000000000000
 x23: 0xffff00019e77c3b0 [rwRW,0xffff00019e77c3b0-0xffff00019e77c8d0] (null_vnodeops + 0)
 x24: 0xffff000000f78b00 [rwRW,0xffff000000f78b00-0xffff000000f78b50] (vop_open_desc + 0)
 x25: 0xffff00011161e830 [rwxRW,0xffff00011161e830-0xffff00011161e930]
 x26: 0xffff00011161e930 [rwxRW,0xffff00011161e930-0xffff00011161ea30]
 x27: 0x0000000000000000
 x28: 0x0000000000000010
 x29: 0xffff00011161ea90 [rwxRW,0xffff00011161b000-0xffff000111620000]
 ddc: 0x0000000000000000
  sp: 0xffff00011161e7d0 [rwxRW,0xffff00011161b000-0xffff000111620000]
  lr: 0xffff00019e76a6ed [rxR,0xffff00019e756000-0xffff00019e77e000] (sentry) (null_open + 28)
 elr: 0xffff00019e76a26c [rxR,0xffff00019e756000-0xffff00019e77e000] (null_bypass + df)
spsr:         64400045
 far: ffff00011161ec60
 esr:         9600002a
panic: Capability abort from kernel space: bounds violation
cpuid = 3
time = 1668005042
KDB: stack backtrace:
db_trace_self() at db_trace_self
db_trace_self_wrapper() at db_trace_self_wrapper+0x38
vpanic() at vpanic+0x180
panic() at panic+0x40
cap_abort() at cap_abort+0x250
handle_el1h_sync() at handle_el1h_sync+0x10
--- exception, esr 0x9600002a
null_bypass() at null_bypass+0xe0
null_open() at null_open+0x24
VOP_OPEN_APV() at VOP_OPEN_APV+0x38
vn_open_vnode() at vn_open_vnode+0x160
vn_open_cred() at vn_open_cred+0x500
kern_openat() at kern_openat+0x258
do_el0_sync() at do_el0_sync+0x6d4
handle_el0_sync() at handle_el0_sync+0x2c
--- exception, esr 0x56000000
KDB: enter: panic
[ thread pid 1397 tid 100114 ]
Stopped at      kdb_enter+0x5f: undefined       c200027f
db> 
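
For readers decoding the two `esr` values in the trace above, here is a minimal sketch that splits an AArch64 ESR value into its top-level fields. The field positions follow the generic Armv8-A layout. The `decode_esr` helper and the `EC_NAMES` table are illustrative names, not part of CheriBSD, and the table only covers the two exception classes seen in this report. The Morello-specific meaning of the fault status code is not decoded here; the kernel itself reports it as a bounds violation.

```python
# Sketch: split an AArch64 ESR_ELx value into EC, IL and ISS.
# Assumption: generic Armv8-A field layout (EC = bits 31:26, IL = bit 25,
# ISS = bits 24:0). EC_NAMES is deliberately partial.

EC_NAMES = {
    0x15: "SVC instruction execution in AArch64 state",
    0x25: "Data Abort taken without a change in Exception level",
}


def decode_esr(esr: int) -> dict:
    """Return the top-level fields of an ESR value as a dictionary."""
    ec = (esr >> 26) & 0x3F
    iss = esr & 0x1FFFFFF
    fields = {
        "ec": ec,
        "ec_name": EC_NAMES.get(ec, "not covered by this sketch"),
        "il": (esr >> 25) & 0x1,
        "iss": iss,
    }
    if ec == 0x25:
        # For data aborts, the low six ISS bits are the fault status code
        # and bit 6 says whether the access was a write (1) or a read (0).
        fields["dfsc"] = iss & 0x3F
        fields["wnr"] = (iss >> 6) & 0x1
    return fields


if __name__ == "__main__":
    for value in (0x9600002A, 0x56000000):
        print(hex(value), decode_esr(value))
```

Under these assumptions, `0x9600002a` is a data abort taken from the kernel itself with fault status code `0x2a`, and `0x56000000` is the system call that entered the kernel (the `kern_openat` path).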

With DRM enabled:

login:   x0: 0xffff00000101b2f0 [rwRW,0xffff00000101b2f0-0xffff00000101b2f8] (vop_open_vp_offsets + 0)                                                                                                  
  x1: 0xffff00019eb33a30 [rwxRW,0xffff00019eb33a30-0xffff00019eb33a50]                                                                                                                                  
  x2: 0xffff00000101b300 [rwRW,0xffff00000101b300-0xffff00000101b350] (vop_open_desc + 0)                                                                                                               
  x3: 0xffff00000101bdf0 [rwRW,0xffff00000101bdf0-0xffff00000101be40] (vop_lock1_desc + 0)                                                                                                              
  x4: 0xffff00000101be50 [rwRW,0xffff00000101be50-0xffff00000101bea0] (vop_unlock_desc + 0)                                                                                                             
  x5: 0xffffa08011d9fd80 [rwxRW,0xffffa08011d9fd80-0xffffa08011d9ff00]                                                                                                                                  
  x6: 0xffff000000f38640 [rwRW,0xffff000000f38640-0xffff000000f386b0] (lock_class_mtx_sleep + 0)                                                                                                        
  x7: 0xffff0000014db4d4 [rwRW,0xffff0000014db480-0xffff0000014db570] (w_locklistdata + 7ffd4)                                                                                                          
  x8: 0x0000000000000010                                                                                                                                                                                
  x9: 0x0000000000000001                                                                                                                                                                                
 x10: 0x0000000000000002                                                                                                                                                                                
 x11: 0x0000000000004100                                                                                                                                                                                
 x12: 0x0000000000000004                                                                                                                                                                                
 x13: 0x0000000001000004                                                                                                                                                                                
 x14: 0x00000000000027fb                                                                                                                                                                                
 x15: 0x0000000000002af8                                                                                                                                                                                
 x16: 0xffff00019f97d1c0 [rxR,0xffff00019f956000-0xffff00019f97e000] (__stop_set_sysctl_set + 520)                                                                                                      
 x17: 0xffff0000006749e1 [rxR,0x0000000000000000-0xffffffffffffffff] (sentry) (vdrop + 0)                                                                                                               
 x18: 0xffff00019eb337d0 [rwxRW,0xffff00019eb30000-0xffff00019eb35000]                                                                                                                                  
 x19: 0xffff00019eb33c50 [rwxRW,0xffff00019eb33c50-0xffff00019eb33c60]                                                                                                                                  
 x20: 0x0000000000000000                                                                                                                                                                                
 x21: 0xffff00019eb33c60 [rwxRW,0xffff00019eb33c50-0xffff00019eb33c60]                                                                                                                                  
 x22: 0x0000000000000000                                                                                                                                                                                
 x23: 0xffff00019f97c360 [rwRW,0xffff00019f97c360-0xffff00019f97c880] (null_vnodeops + 0)                                                                                                               
 x24: 0xffff00000101b300 [rwRW,0xffff00000101b300-0xffff00000101b350] (vop_open_desc + 0)                                                                                                               
 x25: 0xffff00019eb33830 [rwxRW,0xffff00019eb33830-0xffff00019eb33930]                                                                                                                                  
 x26: 0xffff00019eb33930 [rwxRW,0xffff00019eb33930-0xffff00019eb33a30]                                                                                                                                  
 x27: 0x0000000000000000                                                                                                                                                                                
 x28: 0x0000000000000010                                                                                                                                                                                
 x29: 0xffff00019eb33a90 [rwxRW,0xffff00019eb30000-0xffff00019eb35000]                                                                                                                                  
 ddc: 0x0000000000000000                                                                                                                                                                                
  sp: 0xffff00019eb337d0 [rwxRW,0xffff00019eb30000-0xffff00019eb35000]                                                                                                                                  
  lr: 0xffff00019f96a69d [rxR,0xffff00019f956000-0xffff00019f97e000] (sentry) (null_open + 28)                                                                                                          
 elr: 0xffff00019f96a21c [rxR,0xffff00019f956000-0xffff00019f97e000] (null_bypass + df)                                                                                                                 
spsr:         64400045                                                                                                                                                                                  
 far: ffff00019eb33c60                                                                                                                                                                                  
 esr:         9600002a                                                                                                                                                                                  
WARNING !list_empty(&lock->head) failed at /usr/src/sys/dev/drm/core/drm_modeset_lock.c:268                                                                                                             
WARNING !drm_modeset_is_locked(&crtc->mutex) failed at /usr/src/sys/dev/drm/core/drm_atomic_helper.c:617                                                                                                
WARNING !drm_modeset_is_locked(&dev->mode_config.connection_mutex) failed at /usr/src/sys/dev/drm/core/drm_atomic_helper.c:667                                                                          
WARNING !drm_modeset_is_locked(&plane->mutex) failed at /usr/src/sys/dev/drm/core/drm_atomic_helper.c:892                                                                                               
WARNING !drm_modeset_is_locked(&plane->mutex) failed at /usr/src/sys/dev/drm/core/drm_atomic_helper.c:892                                                                                               
<3>[drm: 0xffff0000001902f9] *ERROR* [CRTC:33:crtc-0] hw_done timed out                                                                                                                                 
<3>[drm: 0xffff000000190335] *ERROR* [CRTC:33:crtc-0] flip_done timed out                                                                                                                               
<3>[drm: 0xffff0000001903e1] *ERROR* [CONNECTOR:35:HDMI-A-1] hw_done timed out                                                                                                                          
<3>[drm: 0xffff00000019041d] *ERROR* [CONNECTOR:35:HDMI-A-1] flip_done timed out                                                                                                                        
<3>[drm: 0xffff0000001904d1] *ERROR* [PLANE:31:plane-0] hw_done timed out                                                                                                                               
<3>[drm: 0xffff00000019050d] *ERROR* [PLANE:31:plane-0] flip_done timed out                                                                                                                             
<3>[drm: 0xffff0000001904d1] *ERROR* [PLANE:32:plane-1] hw_done timed out                                                                                                                               
<3>[drm: 0xffff00000019050d] *ERROR* [PLANE:32:plane-1] flip_done timed out                                                                                                                             
panic: running but not TDS_RUNNING                                                                                                                                                                      
cpuid = 1                                                                                                                                                                                               
time = 1667919995                                                                                                                                                                                       
KDB: stack backtrace:                                                                                                                                                                                   
db_trace_self() at db_trace_self                                                                                                                                                                        
db_trace_self_wrapper() at db_trace_self_wrapper+0x38                                                                                                                                                   
vpanic() at vpanic+0x180                                                                                                                                                                                
panic() at panic+0x40                                                                                                                                                                                   
sleepq_switch() at sleepq_switch+0x1e0                                                                                                                                                                  
sleepq_timedwait() at sleepq_timedwait+0x48                                                                                                                                                             
drmcompat_add_to_sleepqueue() at drmcompat_add_to_sleepqueue+0xa0                                                                                                                                       
drmcompat_wait_event_common() at drmcompat_wait_event_common+0x13c                                                                                                                                      
drm_atomic_helper_wait_for_vblanks() at drm_atomic_helper_wait_for_vblanks+0x240                                                                                                                        
drm_atomic_helper_commit_tail_rpm() at drm_atomic_helper_commit_tail_rpm+0x4c                                                                                                                           
commit_tail() at commit_tail+0x154                                                                                                                                                                      
drm_atomic_helper_commit() at drm_atomic_helper_commit+0x36c                                                                                                                                            
restore_fbdev_mode_atomic() at restore_fbdev_mode_atomic+0x1fc                                                                                                                                          
drm_fb_helper_restore_fbdev_mode_unlocked() at drm_fb_helper_restore_fbdev_mode_unlocked+0x94                                                                                                           
vt_kms_postswitch() at vt_kms_postswitch+0x80                                                                                                                                                           
vt_window_switch() at vt_window_switch+0x1d8                                                                                                                                                            
vtterm_cngrab() at vtterm_cngrab+0x38                                                                                                                                                                   
cngrab() at cngrab+0x34                                                                                                                                                                                 
vpanic() at vpanic+0xf4                                                                                                                                                                                 
panic() at panic+0x40                                                                                                                                                                                   
cap_abort() at cap_abort+0x250                                                                                                                                                                          
handle_el1h_sync() at handle_el1h_sync+0x10                                                                                                                                                             
--- exception, esr 0x9600002a                                                                                                                                                                           
null_bypass() at null_bypass+0xe0                                                                                                                                                                       
null_open() at null_open+0x24                                                                                                                                                                           
VOP_OPEN_APV() at VOP_OPEN_APV+0x38                                                                                                                                                                     
vn_open_vnode() at vn_open_vnode+0x160                                                                                                                                                                  
vn_open_cred() at vn_open_cred+0x500                                                                                                                                                                    
kern_openat() at kern_openat+0x258                                                                                                                                                                      
do_el0_sync() at do_el0_sync+0x6d4                                                                                                                                                                      
handle_el0_sync() at handle_el0_sync+0x2c                                                                                                                                                               
--- exception, esr 0x56000000                                                                                                                                                                           
KDB: enter: panic                                                                                                                                                                                       
[ thread pid 20804 tid 100149 ]

jrtc27 commented 1 year ago

The printed panic (“running but not TDS_RUNNING”) isn’t really the root cause. The real problem is the cap_abort that called panic first; what you see is a nested panic, because the DRM code is not very robust and doesn’t handle panics properly.
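
One way to apply this when reading a nested trace is to look for the last `panic()` frame rather than the first, since KDB prints the innermost frames first. The sketch below does that on an abridged copy of the DRM-enabled backtrace. The helper names (`frame_names`, `original_panic_caller`, `faulting_function`) are hypothetical, and the parsing assumes the `name() at name+0xNN` line format shown in this issue.

```python
import re

# Abridged from the DRM-enabled backtrace in this issue (frames omitted).
SAMPLE = """\
vpanic() at vpanic+0x180
panic() at panic+0x40
sleepq_switch() at sleepq_switch+0x1e0
sleepq_timedwait() at sleepq_timedwait+0x48
vt_kms_postswitch() at vt_kms_postswitch+0x80
cngrab() at cngrab+0x34
vpanic() at vpanic+0xf4
panic() at panic+0x40
cap_abort() at cap_abort+0x250
handle_el1h_sync() at handle_el1h_sync+0x10
--- exception, esr 0x9600002a
null_bypass() at null_bypass+0xe0
null_open() at null_open+0x24
"""

EXCEPTION = "--- exception"


def frame_names(backtrace: str) -> list:
    """Return function names in print order, keeping exception markers."""
    names = []
    for line in backtrace.splitlines():
        line = line.strip()
        match = re.match(r"(\w+)\(\) at ", line)
        if match:
            names.append(match.group(1))
        elif line.startswith(EXCEPTION):
            names.append(EXCEPTION)
    return names


def original_panic_caller(backtrace: str):
    """Caller of the LAST panic() frame, i.e. the panic that happened first."""
    names = frame_names(backtrace)
    indexes = [i for i, name in enumerate(names) if name == "panic"]
    if not indexes or indexes[-1] + 1 >= len(names):
        return None
    return names[indexes[-1] + 1]


def faulting_function(backtrace: str):
    """First frame after the exception marker that follows the original panic."""
    names = frame_names(backtrace)
    indexes = [i for i, name in enumerate(names) if name == "panic"]
    if not indexes:
        return None
    for i in range(indexes[-1], len(names) - 1):
        if names[i] == EXCEPTION:
            return names[i + 1]
    return None


if __name__ == "__main__":
    print(original_panic_caller(SAMPLE), faulting_function(SAMPLE))
```

On the abridged trace this points at `cap_abort` as the caller of the original panic and at `null_bypass` as the function that took the exception, which matches the trace captured with DRM disabled.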

kwitaszczyk commented 1 year ago

@jrtc27 Thanks. I disabled DRM, tested it again and I can confirm it's an issue related to nullfs.
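
For reference, the register dump from the run with DRM disabled can be checked by hand: `far` equals the address held in `x21`, and that address is also the upper bound printed for `x21`. The sketch below reproduces that arithmetic. It assumes the upper bound in the dump is exclusive, and it assumes `x21` is the capability used by the faulting access because its address matches `far`; the dump does not state which register the instruction used. `Cap` and `in_bounds` are illustrative names only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cap:
    """Address and bounds of a capability as printed in the register dump."""
    address: int
    base: int
    top: int  # assumed to be exclusive


def in_bounds(cap: Cap, addr: int, size: int = 1) -> bool:
    """True if [addr, addr + size) lies entirely inside the capability."""
    return cap.base <= addr and addr + size <= cap.top


# Values copied from the DRM-disabled dump in this issue.
x19 = Cap(0xFFFF00011161EC50, 0xFFFF00011161EC50, 0xFFFF00011161EC60)
x21 = Cap(0xFFFF00011161EC60, 0xFFFF00011161EC50, 0xFFFF00011161EC60)
far = 0xFFFF00011161EC60

if __name__ == "__main__":
    print("object length:", x21.top - x21.base)        # 16 bytes
    print("offset of far:", far - x21.base)            # 16, one past the end
    print("x21 access in bounds:", in_bounds(x21, far))
    print("x19 access in bounds:", in_bounds(x19, x19.address))
```

Read this way, `x19` and `x21` share the same 16-byte bounds, and `x21` points exactly at the end of that region, so even a one-byte access through it falls outside the bounds. That is consistent with the `bounds violation` reported in `null_bypass`, although it does not by itself identify which object in nullfs is too small.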