copy / v86

x86 PC emulator and x86-to-wasm JIT, running in the browser
https://copy.sh/v86/
BSD 2-Clause "Simplified" License

SIGSEGV when run a program in gdb #278

Open hackeris opened 5 years ago

hackeris commented 5 years ago

I want to debug my program with gdb on v86, but right after loading the program into gdb and starting it, I get "Program received signal SIGSEGV, Segmentation fault".

I also tried setting breakpoints. The program pauses and continues at the first few breakpoints, but I still get "Program received signal SIGSEGV, Segmentation fault" when I continue later.

The program I loaded runs correctly without gdb. Is there an unimplemented feature or a bug in the VM? Or is an option missing when building the image?

I'm debugging v86 in my browser to find the cause, but it is not easy. I'd be glad to contribute if this is caused by something unimplemented or by a bug.

Steps to reproduce:

  1. Start gdb with the target program loaded.

I use yasm for demonstration.

cd /usr/bin
gdb ./yasm
  2. Set a breakpoint at 0x0 and run, so that the program pauses at its first instruction.
(gdb) b *0                                                                      
Breakpoint 1 at 0x0                                                             
(gdb) r                                                                         
Starting program: /usr/bin/yasm                                                 
Warning:                                                                        
Cannot insert breakpoint 1.                                                     
Cannot access memory at address 0x0

(gdb) disas                                                                     
Dump of assembler code for function _start:                                     
=> 0xb7f53237 <+0>:     call   0xb7f52f50 <_dl_start>                           
   0xb7f5323c <+5>:     mov    %eax,%edi                                        
   0xb7f5323e <+7>:     call   0xb7f53243 <_start+12>                           
   0xb7f53243 <+12>:    pop    %ebx                                             
   0xb7f53244 <+13>:    add    $0x2db1,%ebx                                     
   0xb7f5324a <+19>:    mov    0x48(%ebx),%eax  
  3. Delete the breakpoint above and set new breakpoints.

The first instruction in _start is a call to _dl_start, so I set breakpoints at the first two instructions of _dl_start.

(gdb) i b                                                                       
Num     Type           Disp Enb Address    What                                 
1       breakpoint     keep y   0x00000000                                      
(gdb) delete 1                                                                  
(gdb) b *(_dl_start)                                                            
Breakpoint 2 at 0xb7fb6f50                                                      
(gdb) b *(_dl_start+1)                                                          
Breakpoint 3 at 0xb7fb6f51                                                      
(gdb) i b                                                                       
Num     Type           Disp Enb Address    What                                 
2       breakpoint     keep y   0xb7fb6f50 <_dl_start>                          
3       breakpoint     keep y   0xb7fb6f51 <_dl_start+1>
  4. Continue.

The program pauses at breakpoint 2 (_dl_start + 0), like this.

(gdb) c                                                                         
Continuing.                                                                     

Breakpoint 2, 0xb7f8af50 in _dl_start () from /lib/ld-uClibc.so.0
(gdb) disas
Dump of assembler code for function _dl_start:                                  
=> 0xb7f8af50 <+0>:     push   %ebp                                             
   0xb7f8af51 <+1>:     mov    %esp,%ebp                                        
   0xb7f8af53 <+3>:     push   %edi                                             
   0xb7f8af54 <+4>:     push   %esi                                             
   0xb7f8af55 <+5>:     push   %ebx                                             
   0xb7f8af56 <+6>:     sub    $0x1bc,%esp                                      
   0xb7f8af5c <+12>:    call   0xb7f8b26b <__x86.get_pc_thunk.bx>

There is also a breakpoint at (_dl_start + 1), so the program is expected to pause at breakpoint 3 (_dl_start + 1) when I continue again.

  5. Continue again.

It reports SIGSEGV, like this.

(gdb) c                                                                         
Continuing.                                                                     

Program received signal SIGSEGV, Segmentation fault.                            
0x42f8b26b in ?? ()                                                             
(gdb) i s                                                                       
#0  0x42f8b26b in ?? ()                                                         
#1  0x08049c95 in ?? ()                                                         
#2  0xb7f8b23c in _start () from /lib/ld-uClibc.so.0    
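
One detail in the transcripts above can be checked with simple arithmetic: the call at _dl_start+12 targets 0xb7f8b26b, while the SIGSEGV is reported at 0x42f8b26b. The Python sketch below only compares those two numbers, which are copied from the gdb output; the variable names are illustrative. It is an observation, not a diagnosis of the cause.

```python
# Both values are copied from the gdb output above.
call_target = 0xB7F8B26B  # target of the call at _dl_start+12
fault_pc = 0x42F8B26B     # pc reported together with the SIGSEGV

# XOR highlights exactly the bits in which the two addresses differ.
diff = call_target ^ fault_pc
low24_equal = (call_target & 0xFFFFFF) == (fault_pc & 0xFFFFFF)

print(hex(diff))      # 0xf5000000
print(low24_equal)    # True
```

The two addresses agree in their low 24 bits and differ only in the top byte (0xb7 versus 0x42).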

Image here:

https://pan.baidu.com/s/1Udjtb9_SAkIIW9zg09jHsw Password: heuf

kenorb commented 5 years ago

Why do you set a breakpoint at 0x0? It's not a valid memory address.

(gdb) b *0                                                                      
Breakpoint 1 at 0x0                                                             
(gdb) r                                                                         
Starting program: /usr/bin/yasm                                                 
Warning:                                                                        
Cannot insert breakpoint 1.                                                     
Cannot access memory at address 0x0

If you'd like to breakpoint at the entry point, see: How to stop debugger right after the execution?

hackeris commented 5 years ago

It seems that a breakpoint at the entry point does not work.

# readelf  -h ./yasm  | grep 'Entry point'                                      
  Entry point address:               0x8049c82                                  
# gdb -q ./yasm                                                                 
Reading symbols from ./yasm...(no debugging symbols found)...done.              
(gdb) set disable-randomization on                                              
(gdb) b *0x8049c82                                                              
Breakpoint 1 at 0x8049c82                                                       
(gdb) r                                                                         
Starting program: /usr/bin/yasm                                                 

Program received signal SIGSEGV, Segmentation fault.                            
0x67f5288e in ?? ()                                                             
(gdb) i s                                                                       
#0  0x67f5288e in ?? ()                                                         
#1  0x08048b20 in ?? ()                                                         
#2  0xb7f870ef in ?? () from /lib//libc.so.0                                    
#3  0xb7ffb5df in _dl_get_ready_to_run () from /lib/ld-uClibc.so.0              
#4  0xb7ffc162 in _dl_start () from /lib/ld-uClibc.so.0                         
#5  0xb7ffc23c in _start () from /lib/ld-uClibc.so.0                            
(gdb) i f                                                                       
Stack level 0, frame at 0xbffff9c4:                                             
 eip = 0x67f5288e; saved eip = 0x8048b20                                        
 called by frame at 0xbffffa30                                                  
 Arglist at 0xbffff9bc, args:                                                   
 Locals at 0xbffff9bc, Previous frame's sp is 0xbffff9c4                        
 Saved registers:                                                               
  eip at 0xbffff9c0                                                             
(gdb) x/5i $eip                                                                 
=> 0x67f5288e:  Cannot access memory at address 0x67f5288e  
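
The entry point that readelf prints in the session above comes from the e_entry field of the ELF header. As a minimal sketch, assuming a 32-bit little-endian ELF file, it can also be read with a few lines of Python. The function name elf32_entry_point is made up for this example, and the demo header is synthetic rather than taken from the real yasm binary.

```python
import struct


def elf32_entry_point(header):
    """Return e_entry from the first 52 bytes of a little-endian ELF32 file."""
    if header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if header[4] != 1 or header[5] != 1:
        raise ValueError("expected a 32-bit little-endian ELF header")
    # In the ELF32 header, e_entry is a 4-byte field at offset 24.
    (entry,) = struct.unpack_from("<I", header, 24)
    return entry


# Synthetic 52-byte header carrying the entry point from the transcript.
demo = bytearray(52)
demo[:6] = b"\x7fELF\x01\x01"
struct.pack_into("<I", demo, 24, 0x8049C82)
entry = elf32_entry_point(bytes(demo))
print(hex(entry))  # 0x8049c82
```

On a real file, the same function could be fed the first 52 bytes read from the binary opened in "rb" mode.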

ideal commented 5 years ago

Maybe that is because of _dl_get_ready_to_run() in uClibc?

copy commented 5 years ago

How does gdb implement breakpoints? v86 ignores the trap flag, so execution continues normally, but I'm not sure if that's the problem here. The single-byte trap instruction (INT3) is implemented in v86. Debug registers are not implemented.

I suspect gdb uses debug registers. Could you check if that's the case? If it is, maybe we can convince it to use a different mechanism by clearing certain bits in cpuid or faulting when these registers are accessed. Afaik, I implemented read/write support for these registers because Haiku uses them as extra data registers.
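
For context on the INT3 mechanism mentioned above, here is a toy Python model of how a debugger can implement a software breakpoint: it saves the original byte, overwrites it with the INT3 opcode (0xCC), and on a trap rewinds the program counter by one byte. This is a sketch under stated assumptions, not gdb's actual code; the class and method names are invented for illustration, and "memory" is just a bytearray.

```python
INT3 = 0xCC  # opcode of the single-byte x86 trap instruction


class SoftwareBreakpoints:
    """Toy model of INT3-style breakpoints over a bytearray 'memory'."""

    def __init__(self, memory):
        self.memory = memory
        self.saved = {}  # address -> original byte

    def insert(self, addr):
        # Remember the original opcode byte, then overwrite it with INT3.
        self.saved[addr] = self.memory[addr]
        self.memory[addr] = INT3

    def remove(self, addr):
        # Put the original byte back.
        self.memory[addr] = self.saved.pop(addr)

    def on_trap(self, eip_after_trap):
        # After INT3 executes, eip points one byte past it, so the
        # breakpoint address is eip - 1.
        addr = eip_after_trap - 1
        if addr not in self.saved:
            raise KeyError("trap at %#x is not a known breakpoint" % addr)
        return addr


# First bytes of a typical prologue: push %ebp; mov %esp,%ebp
memory = bytearray([0x55, 0x89, 0xE5])
bps = SoftwareBreakpoints(memory)
bps.insert(0)
patched = bytes(memory)
hit = bps.on_trap(1)
bps.remove(0)
restored = bytes(memory)
```

To resume past a breakpoint, a debugger typically restores the original byte, executes that one instruction, and then writes INT3 back, which is where single stepping (and therefore the trap flag) usually comes in.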

hackeris commented 5 years ago

Well. I'll check it.

hackeris commented 5 years ago

@copy Are the build environment and options for this image available? If not, could you please provide them? I want to build and test gdb with the same environment and options as yours. This may make it easier to find what causes the SIGSEGV in gdb.

I'm using browser-vm, but it seems to differ from your image in many ways, which causes "Unknown msr: 0x00000140" at boot. These differences may interfere with the investigation.

copy commented 5 years ago

@hackeris It's a relatively old buildroot image. I don't have the config file for it any more. I'm pretty sure buildroot still supports old kernels, so you could modify the browser-vm image to your needs. I don't think this particular msr makes a difference though.

hackeris commented 5 years ago

> How does gdb implement breakpoints? v86 ignores the trap flag, so execution continues normally, but I'm not sure if that's the problem here. The single-byte trap instruction (INT3) is implemented in v86. Debug registers are not implemented.
>
> I suspect gdb uses debug registers. Could you check if that's the case? If it is, maybe we can convince it to use a different mechanism by clearing certain bits in cpuid or faulting when these registers are accessed. Afaik, I implemented read/write support for these registers because Haiku uses them as extra data registers.

There is another emulator (jslinux) which states that single stepping is not supported, and I did not see any SIGSEGV while hitting breakpoints or continuing my program with gdb in it. Maybe the problem in v86 is not about the trap flag but about the debug registers?
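
To make the phrase "ignores the trap flag" concrete, the Python sketch below contrasts a toy CPU that honors TF (bit 8 of EFLAGS) with one that ignores it. The run function and its honor_tf parameter are hypothetical and model neither v86 nor jslinux; the instruction names are plain strings.

```python
TF = 1 << 8  # the trap flag is bit 8 of EFLAGS


def run(program, eflags, honor_tf):
    """Execute a list of instruction names in a toy CPU.

    Returns (executed, trapped): the instructions that ran, and whether a
    single-step debug trap stopped execution.
    """
    executed = []
    for insn in program:
        executed.append(insn)
        if honor_tf and (eflags & TF):
            # With TF set, an x86 CPU raises a debug exception after the
            # instruction completes, handing control back to the debugger.
            return executed, True
    return executed, False


program = ["push ebp", "mov ebp, esp", "push edi"]
stepped = run(program, eflags=TF, honor_tf=True)
ignored = run(program, eflags=TF, honor_tf=False)
```

With TF honored, execution stops after one instruction; with TF ignored, the whole program runs and no trap is delivered, so a debugger waiting for its single-step stop never gets one.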

hackeris commented 5 years ago

I've run yasm under Valgrind with gdb in the VM like this. Everything seems to work well, including breakpoints, continuing and single stepping. Valgrind simulates every single instruction of the target program, so there is probably some unimplemented feature or bug in v86's CPU emulation.

Also, when I debug a program written in assembly and assembled with yasm, it strangely triggers SIGSEGV at the instruction add eax, ebx. Here are my steps.

  1. Save code as test.asm
global _start                                                                   

_add:                                                                           
  push ebp                                                                      
  mov ebp, esp                                                                  
  mov eax, [ebp + 8]                                                            
  mov ebx, [ebp + 0xc]                                                          
  add eax, ebx                                                                  
  pop ebp                                                                       
  ret                                                                           

_start:                                                                         
  push 2                                                                        
  push 1                                                                        
  call _add                                                                     
  add esp, 8                                                                    

  mov ebx, eax                                                                  
  mov eax,1                                                                     
  int 0x80
  2. Build
yasm -g dwarf2 -f elf test.asm
ld test.o -o test
  3. Debug the test program with gdb
gdb ./test
(gdb) set disassembly-flavor intel
  4. Set a breakpoint and continue.
(gdb) b *(_add)                                                                 
Breakpoint 1 at 0x8048060: file test.asm, line 4.                                                                                                              
(gdb) r                                                                         
Starting program: /root/test                                                    

Breakpoint 1, _add () at test.asm:4                                             
4         push ebp
(gdb) c                                                                         
Continuing.                                                                     

Program received signal SIGSEGV, Segmentation fault.                            
0x0804806a in _add () at test.asm:8                                             
8         add eax, ebx  
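
The reported pc can be placed within _add with a little arithmetic. The Python sketch below lays out the function from the breakpoint address 0x8048060, assuming the usual 32-bit instruction lengths for each line of test.asm (an assumption, since the thread shows no disassembly of this binary). The helper names layout and locate are made up for this example.

```python
# (source line, instruction, encoded length in bytes) for _add, assuming the
# usual 32-bit encodings: 55 / 89 e5 / 8b 45 08 / 8b 5d 0c / 01 d8 / 5d / c3
ADD_FUNC = [
    (4, "push ebp", 1),
    (5, "mov ebp, esp", 2),
    (6, "mov eax, [ebp + 8]", 3),
    (7, "mov ebx, [ebp + 0xc]", 3),
    (8, "add eax, ebx", 2),
    (9, "pop ebp", 1),
    (10, "ret", 1),
]


def layout(base, insns):
    """Return (start address, source line, text, length) per instruction."""
    rows, addr = [], base
    for line, text, size in insns:
        rows.append((addr, line, text, size))
        addr += size
    return rows


def locate(pc, rows):
    """Return (text, offset into the instruction) for the row containing pc."""
    for addr, _line, text, size in rows:
        if addr <= pc < addr + size:
            return text, pc - addr
    return None


rows = layout(0x8048060, ADD_FUNC)
where = locate(0x0804806A, rows)
print(where)  # ('add eax, ebx', 1)
```

If those lengths hold, add eax, ebx starts at 0x8048069, so the reported pc 0x0804806a lies one byte inside that instruction rather than at its first byte. This does not identify the cause; it only narrows down what the transcript shows.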

The SIGSEGV problem in gdb was marked as a bug a few days ago. Are there any more details about this bug, or any other strange cases?

undefined-moe commented 2 years ago

I've also hit this problem with gdb 10.2 built against musl libc, even with no breakpoints enabled (image built just now with https://github.com/wokwi/browser-vm-gdb , with libc simply switched to musl).

Compile this code on Alpine Linux with g++ (g++ a.cpp -o a -g), then copy the binary to the VM:

#include<iostream>
using namespace std;
int main(){
  int a,b;
  cin>>a>>b;
  cout<<a+b<<endl;
  return 0;
}

Meanwhile, I've noticed that gdb running on wokwi.com works normally (see the description at https://github.com/wokwi/web-gdb ), and breakpoints work fine there. So I don't think it's a problem related to breakpoints.

Then I tried to debug the busybox binary in the system image itself, which also throws a SIGSEGV. So it is probably not a problem related to dynamically linked libraries.

I haven't confirmed whether it is a problem with a specific gdb version or with the browser-vm builder itself. I'll build some other images later.

undefined-moe commented 2 years ago

While trying to find which syscall crashed, I found something really interesting: it... runs successfully...

/ # gdb /bin/echo
GNU gdb (GDB) 10.2
Copyright (C) 2021 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "i686-buildroot-linux-musl".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /bin/echo...
(No debugging symbols found in /bin/echo)
(gdb) r 1
Starting program: /bin/echo 1

Program received signal SIGSEGV, Segmentation fault.
0xaff0761f in ?? ()
(gdb) b *0
Breakpoint 1 at 0x0
(gdb) r 1
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /bin/echo 1
Warning:
Cannot insert breakpoint 1.
Cannot access memory at address 0x0

(gdb) delete 0
warning: bad breakpoint number at or near '0'
(gdb) delete 1
(gdb) ni
0xb7f2526b in ?? () from /lib/ld-musl-i386.so.1
(gdb) ni
0xb7eed0f0 in ?? () from /lib/ld-musl-i386.so.1
(gdb) ni
0xb7ed2a59 in ?? ()
(gdb) ni
0xb7ed2a59 in ?? ()
(gdb) ni
0xb7ed2a59 in ?? ()
(gdb) ni
0xb7ed2a59 in ?? ()
(gdb) ni
0xb7ed2a59 in ?? ()
(gdb) ni
1
0xb7f25336 in ?? () from /lib/ld-musl-i386.so.1
(gdb) ni
[Inferior 1 (process 704) exited normally]
(gdb) ni
The program is not being run.

And the same happens with uClibc:

/ # gdb /bin/echo
GNU gdb (GDB) 10.1
Copyright (C) 2020 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "i686-buildroot-linux-uclibc".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /bin/echo...
(No debugging symbols found in /bin/echo)
(gdb) r 1
Starting program: /bin/echo 1

Program received signal SIGSEGV, Segmentation fault.
0xafec7b97 in ?? ()
(gdb) b *0
Breakpoint 1 at 0x0
(gdb) r 1
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /bin/echo 1
Warning:
Cannot insert breakpoint 1.
Cannot access memory at address 0x0

(gdb) delete 1
(gdb) ni
0xb7feba36 in ?? () from /lib/ld-uClibc.so.0
(gdb) ni
0xb7fea082 in __libc_i386_syscall6 () from /lib/ld-uClibc.so.0
(gdb) ni
0xb7febd79 in _dl_load_elf_shared_library () from /lib/ld-uClibc.so.0
(gdb) ni
0xb7febdbf in _dl_load_elf_shared_library () from /lib/ld-uClibc.so.0
(gdb) ni
0xb7fea082 in __libc_i386_syscall6 () from /lib/ld-uClibc.so.0
(gdb) ni
0xb7febf2b in _dl_load_elf_shared_library () from /lib/ld-uClibc.so.0
(gdb) ni
0xb7fea082 in __libc_i386_syscall6 () from /lib/ld-uClibc.so.0
(gdb) ni
0xb7fea082 in __libc_i386_syscall6 () from /lib/ld-uClibc.so.0
(gdb) ni
0xb7fea082 in __libc_i386_syscall6 () from /lib/ld-uClibc.so.0
(gdb) ni
0xb7fea082 in __libc_i386_syscall6 () from /lib/ld-uClibc.so.0
(gdb) ni
0xb7fea082 in __libc_i386_syscall6 () from /lib/ld-uClibc.so.0
(gdb) ni
0xb7fea082 in __libc_i386_syscall6 () from /lib/ld-uClibc.so.0
(gdb) ni
0xb7fec634 in _dl_load_elf_shared_library () from /lib/ld-uClibc.so.0
(gdb) ni
0xb7fec883 in _dl_load_elf_shared_library () from /lib/ld-uClibc.so.0
(gdb) ni
0xb7febb91 in ?? () from /lib/ld-uClibc.so.0
(gdb) ni
0xb7fea082 in __libc_i386_syscall6 () from /lib/ld-uClibc.so.0
(gdb) ni
0xb7feb50b in init_tls () from /lib/ld-uClibc.so.0
(gdb) ni
0xb7feb975 in _dl_protect_relro () from /lib/ld-uClibc.so.0
(gdb) ni
0xb7feb975 in _dl_protect_relro () from /lib/ld-uClibc.so.0
(gdb) ni
0xb7feb975 in _dl_protect_relro () from /lib/ld-uClibc.so.0
(gdb) ni
0xb7f9cedf in ?? ()
(gdb) ni
0xb7f9cf13 in ?? ()
(gdb) ni
0xb7f5b5cc in ?? ()
(gdb) ni
0xb7f5b5cc in ?? ()
(gdb) ni
0xb7f9cfd2 in ?? ()
(gdb) ni
0xb7f5d46e in ?? ()
(gdb) 
0xb7f59f2f in ?? ()
(gdb) ni
0xb7f5d4b7 in ?? ()
(gdb) ni
0xb7f5d4b7 in ?? ()
(gdb) ni
0xb7f5a5d3 in ?? ()
(gdb) ni
0xb7f59284 in ?? ()
(gdb) ni
0xb7f59284 in ?? ()
(gdb) ni
1
0xb7f5c119 in ?? ()
(gdb) ni
[Inferior 1 (process 717) exited normally]

I don't understand why.

undefined-moe commented 2 years ago

I guess it's because of the jit module logic? Is there a way to disable it and simply interpret all the instructions?

copy commented 2 years ago

> Meanwhile, I've noticed that gdb running on wokwi.com is functioning normally (see description at https://github.com/wokwi/web-gdb ), and breakpoints just work fine. So I don't think it's a problem related to breakpoints.

That's a remote debugger for Arduino, so it doesn't matter whether breakpoints are implemented in v86; the breakpoints are executed on the Arduino.

> I guess it's because of the jit module logic? Is there a way to disable it and simply interpret all the instructions?

You can change this flag: https://github.com/copy/v86/blob/master/src/rust/config.rs#L2

undefined-moe commented 2 years ago

OK, I've confirmed that it is not a problem related to the JIT. I also found that this statically linked build of gdb 7.10 just works. I guess newer versions of gdb use some special trick which is not implemented in v86? Or maybe every statically linked gdb binary will work? Meanwhile, @nickcao found that in the copy.sh Arch Linux demo VM, /lib/ld-linux.so.2 and hello.asm can be debugged with the built-in gdb. (Neither of them links to another dynamic library.)

undefined-moe commented 2 years ago

nixpkgs.pkgsCross.musl32.pkgsStatic.gdb (version 12.1) also throws SIGSEGV. Fine, I'm downgrading all DWARF to version 4.