erigontech / erigon

Ethereum implementation on the efficiency frontier https://erigon.gitbook.io
GNU Lesser General Public License v3.0

Crash - maybe a bug, maybe nothing at all... #3104

Closed avocadochicken closed 2 years ago

avocadochicken commented 2 years ago

System information

erigon version 2021.10.3-alpha
5.10.79-1-MANJARO #1 SMP PREEMPT Fri Nov 12 20:26:09 UTC 2021 x86_64 GNU/Linux

Backtrace

https://gist.githubusercontent.com/avocadochicken/54f0914b11c5a1b037da510ca7956360/raw/fdd0cc941beb8c43ded7cd00a6b19f298bd4b5c8/gistfile1.txt

Kernel

It killed the entire drive. I don't know if it was Erigon or the hardware / drive itself...

[261077.109886] nvme nvme0: I/O 259 QID 27 timeout, aborting                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               
[261078.629894] nvme nvme0: I/O 872 QID 21 timeout, aborting                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               
[261107.829035] nvme nvme0: I/O 259 QID 27 timeout, reset controller                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       
[261138.548445] nvme nvme0: I/O 24 QID 0 timeout, reset controller                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         
[261199.769796] nvme nvme0: Device not ready; aborting reset, CSTS=0x1                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     
[261199.783697] nvme nvme0: Abort status: 0x371                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            
[261199.783698] nvme nvme0: Abort status: 0x371                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            
[261230.328950] nvme nvme0: Device not ready; aborting reset, CSTS=0x1                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     
[261230.328953] nvme nvme0: Removing after probe failure status: -19                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       
[261235.825742] INFO: task kcompactd0:250 blocked for more than 122 seconds.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               
[261235.825744]       Tainted: P        W  OE     5.10.79-1-MANJARO #1                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     
[261235.825745] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  
[261235.825746] task:kcompactd0      state:D stack:    0 pid:  250 ppid:     2 flags:0x00004000                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            
[261235.825749] Call Trace:                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
[261235.825755]  __schedule+0x288/0x800                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    
[261235.825758]  ? out_of_line_wait_on_bit_lock+0xb0/0xb0                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  
[261235.825760]  schedule+0x5b/0xc0                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        
[261235.825761]  io_schedule+0x42/0x70                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     
[261235.825763]  bit_wait_io+0xd/0x50                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      
[261235.825764]  __wait_on_bit_lock+0x5d/0xa0                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              
[261235.825766]  out_of_line_wait_on_bit_lock+0x92/0xb0                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    
[261235.825769]  ? var_wake_function+0x20/0x20                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             
[261235.825772]  __buffer_migrate_page.part.0+0xab/0x2b0                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   
[261235.825774]  move_to_new_page+0xa1/0x2f0                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               
[261235.825776]  ? page_counter_uncharge+0x36/0x50                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         
[261235.825777]  ? uncharge_batch+0xcf/0x140                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               
[261235.825779]  ? free_unref_page_commit+0x98/0x120                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       
[261235.825781]  migrate_pages+0x9c1/0xe50                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 
[261235.825784]  ? isolate_freepages_block+0x410/0x410                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     
[261235.825785]  ? split_map_pages+0x170/0x170                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             
[261235.825786]  ? migrate_page_states+0x290/0x290                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         
[261235.825788]  compact_zone+0x606/0xdb0                                                                                                                                                                                             
[261235.825791]  ? finish_task_switch+0x75/0x250                                                                                                                                                                                      
[261235.825793]  proactive_compact_node+0x8f/0xe0                                                                                                                                                                                     
[261235.825796]  kcompactd+0x317/0x390                                                                                                                                                                                                
[261235.825798]  ? add_wait_queue_exclusive+0x70/0x70                                                                                                                                                                                 
[261235.825799]  ? kcompactd_do_work+0x240/0x240                                                                                                                                                                                      
[261235.825802]  kthread+0x133/0x150                                                                                                                                                                                                  
[261235.825803]  ? kthread_associate_blkcg+0xc0/0xc0                                                                                                                                                                                  
[261235.825806]  ret_from_fork+0x22/0x30                                                                                                                                                                                              
[261235.825841] INFO: task jbd2/nvme0n1p1-:13937 blocked for more than 122 seconds.                                                                                                                                                   
[261235.825842]       Tainted: P        W  OE     5.10.79-1-MANJARO #1                                                                                                                                                                
[261235.825843] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.                                                                                                                                             
[261235.825843] task:jbd2/nvme0n1p1- state:D stack:    0 pid:13937 ppid:     2 flags:0x00004080                                                                                                                                       
[261235.825845] Call Trace:                                                                                                                                                                                                                                                                                                                                               
[261235.825847]  __schedule+0x288/0x800                                                                                                                                                                                                                                                                                                                                   
[261235.825849]  ? out_of_line_wait_on_bit_lock+0xb0/0xb0                                                                                                                                                                                                                                                                                                                 
[261235.825850]  schedule+0x5b/0xc0                                                                                                                                                                                                                                                                                                                                       
[261235.825851]  io_schedule+0x42/0x70                                                                                                                                                                                                
[261235.825852]  bit_wait_io+0xd/0x50                                                                                                                                                                                                                                                                                                                                     
[261235.825853]  __wait_on_bit+0x2a/0x90                                                                                                                                                                                                                                                                                                                                  
[261235.825855]  out_of_line_wait_on_bit+0x92/0xb0                                                                                                                                                                                                                                                                                                                        
[261235.825856]  ? var_wake_function+0x20/0x20                                                                                                                                                                                                                                                                                                                            
[261235.825861]  jbd2_journal_commit_transaction+0x1304/0x1d00 [jbd2]                                                                                                                                                                 
[261235.825864]  ? sugov_get_util+0x60/0x60                                                                                                                                                                                           
[261235.825870]  kjournald2+0xaf/0x280 [jbd2]                                                                                                                                                                                         
[261235.825872]  ? add_wait_queue_exclusive+0x70/0x70                                                                                                                                                                                 
[261235.825876]  ? jbd2_journal_release_jbd_inode+0x150/0x150 [jbd2]                                                                                                                                                                  
[261235.825877]  kthread+0x133/0x150                                                                                                                                                                                                  
[261235.825879]  ? kthread_associate_blkcg+0xc0/0xc0                                                                                                                                                                                  
[261235.825880]  ret_from_fork+0x22/0x30                                                                                                                                                                                              
[261235.825892] INFO: task erigon:15253 blocked for more than 122 seconds.                                                                                                                                                            
[261235.825892]       Tainted: P        W  OE     5.10.79-1-MANJARO #1                                                                                                                                                                
[261235.825893] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.                                                                                                                                             
[261235.825894] task:erigon          state:D stack:    0 pid:15253 ppid: 14757 flags:0x00000080                                                                                                                                       
[261235.825895] Call Trace:                                                                                                                                                                                                           
[261235.825897]  __schedule+0x288/0x800                                                                                                                                                                                               
[261235.825898]  ? out_of_line_wait_on_bit_lock+0xb0/0xb0                                                                                                                                                                             
[261235.825899]  schedule+0x5b/0xc0                                                                                                                                                                                                                                                                                                                                       
[261235.825900]  io_schedule+0x42/0x70                                                                                                                                                                                                                                                                                                                                    
[261235.825901]  bit_wait_io+0xd/0x50                                                                                                                                                
[261235.825903]  __wait_on_bit+0x2a/0x90                                                                                                                                             
[261235.825904]  out_of_line_wait_on_bit+0x92/0xb0                                                                                                                                   
[261235.825905]  ? var_wake_function+0x20/0x20                                                                                                                                       
[261235.825909]  do_get_write_access+0x274/0x3e0 [jbd2]                                                                                                                              
[261235.825913]  jbd2_journal_get_write_access+0x4f/0x80 [jbd2]                                                                
[261235.825923]  __ext4_journal_get_write_access+0x72/0x120 [ext4]                                                             
[261235.825936]  ext4_reserve_inode_write+0x7f/0xb0 [ext4]                                                                     
[261235.825947]  __ext4_mark_inode_dirty+0x52/0x220 [ext4]                                                                                                    
[261235.825956]  ? __ext4_journal_start_sb+0x9f/0x110 [ext4]                                                                                                  
[261235.825966]  ext4_dirty_inode+0x5f/0x80 [ext4]                                                                                                                                   
[261235.825968]  __mark_inode_dirty+0x1b5/0x360                                                                                                                                      
[261235.825970]  generic_update_time+0x71/0xd0                                                                                                                                       
[261235.825972]  file_update_time+0x123/0x140                                  
[261235.825981]  ext4_page_mkwrite+0x93/0x670 [ext4]                           
[261235.825983]  ? futex_wake+0x14d/0x180                                      
[261235.825985]  do_page_mkwrite+0x51/0xd0                                     
[261235.825987]  do_wp_page+0x240/0x2f0                                        
[261235.825989]  handle_mm_fault+0x120c/0x1a50                                 
[261235.825992]  do_user_addr_fault+0x1e6/0x420                                                                                                                                      
[261235.825995]  exc_page_fault+0x64/0x160                                                                                                                                           
[261235.825997]  ? asm_exc_page_fault+0x8/0x30                                                                                                                                       
[261235.825999]  asm_exc_page_fault+0x1e/0x30                                                                                                                                        
[261235.826001] RIP: 0033:0x7fe5871a8dee                                                                                                                                             
[261235.826002] RSP: 002b:00007fe55ee20bc0 EFLAGS: 00010246                                                                                                                          
[261235.826003] RAX: 0000000000000000 RBX: 0000000000003b95 RCX: 0000000000020000                                                                                                    
[261235.826004] RDX: 00000000000000b2 RSI: 00007fe55ee21640 RDI: 00007fe5084ed100                                                                                                                                                                                                                                                                                         
[261235.826005] RBP: 00007fe5084ed100 R08: 0000000000000010 R09: 0000000000000002                                                                                                                                                                                                                                                                                         
[261235.826006] R10: 00007fe555821ba0 R11: 0000000000000000 R12: 00007fe5084ed100                                                                                                    
[261235.826006] R13: 0000000000000000 R14: 00007fe5084ed110 R15: 00007fe55ee21640  

AskAlexSharov commented 2 years ago

@avocadochicken Erigon's db doesn't read/write to disk itself; it only uses the mmap/msync syscalls, and the Linux kernel does the actual reads/writes. So you most likely have a hardware failure or a misconfiguration.
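
To make the mmap/msync point concrete, here is a minimal Python sketch of writing to a file through a shared memory mapping. It is not Erigon's code (Erigon is written in Go); the names `write_via_mmap` and `demo` are hypothetical and exist only for this illustration. The process stores bytes into mapped memory and then asks the kernel to flush; the kernel and the filesystem perform the block I/O. That seems consistent with the `erigon` trace above, where the task is blocked inside a page fault (`ext4_page_mkwrite`) rather than inside a write syscall.

```python
import mmap
import os
import tempfile


def write_via_mmap(path: str, payload: bytes) -> None:
    """Modify a file through a shared mapping (hypothetical helper, illustration only)."""
    with open(path, "r+b") as f:
        # Length 0 maps the whole file; ACCESS_WRITE gives a shared, writable mapping.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_WRITE) as mm:
            # An ordinary memory store: no write() syscall is issued here.
            # The first store to a clean page triggers a page fault in the kernel.
            mm[: len(payload)] = payload
            # On POSIX systems flush() calls msync(2), asking the kernel
            # to write the dirty pages back to the underlying device.
            mm.flush()


def demo() -> bytes:
    """Create a one-page file, write to it via mmap, and read the bytes back."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "data.bin")
        with open(path, "wb") as f:
            f.write(b"\x00" * mmap.PAGESIZE)  # a mapping needs a non-empty file
        write_via_mmap(path, b"hello")
        with open(path, "rb") as f:
            return f.read(5)


if __name__ == "__main__":
    print(demo())
```

If the device stops responding, as in the `nvme0` timeouts above, a store or flush like this would simply block in the kernel (state `D`) until the I/O completes or fails.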