cebix / macemu

Basilisk II and SheepShaver Macintosh emulators

GCC 4.6 and newer miscompile SheepShaver JIT #21

Open vasi opened 11 years ago

vasi commented 11 years ago

Dyngen requires that all op_* functions end in a 'ret' instruction on x86. But newer GCC versions sometimes omit the trailing ret and end the function with one or more jmp instructions instead.

We need to do one of the following:

  1. Come up with a different way to find the ends of the op_* functions; OR
  2. Find a way to force GCC to always place the 'ret' at the end of a function; OR
  3. Find an alternative to dyngen

asvitkine commented 11 years ago

Option 1 is probably the easiest. You'll need to disassemble what newer versions of GCC produce and see what dyngen could use to detect the ends of the functions.

I'm not sure option 2 is possible, but I suppose you could try playing around with GCC command-line flags to see if you can make it happen. If so, you'd need to modify the build files appropriately to pass those options to GCC.

One way to solve option 3 is to check in generated source files produced by running dyngen on code compiled with a working GCC. This way, compilers that are not supported by dyngen would use the pre-generated files instead of generating them themselves. We would need to check these files in to source control for different architectures and modify the build system to use them appropriately. The nice thing about this approach is that it would work with other compilers, such as clang.

Patches welcome!

-Alexei

vasi commented 11 years ago

Actually, option 1 is looking a lot harder than I thought. We don't actually want the ret instruction; we're only using it as a marker. Suppose the GCC-generated code looks like this:

cmp %eax, %ebx
je .MyGoto
ret
.MyGoto:
jmp *(%ecx)

Even if we detect the jmp, so we know where the function ends, we still have this embedded ret that we don't want. We would have to transform the function somehow, like turning the ret into a jmp .EndOfFunction. Ewww!
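
To make the three cases concrete, here is a minimal sketch that classifies one function's disassembly by where its ret instructions fall. The helper name `classify_op` and its input format (an array of AT&T-syntax instruction strings, labels already removed) are assumptions made for illustration; this is not how dyngen itself parses object files.

```ruby
# Hypothetical helper, not part of dyngen: classify one function's disassembly.
# `insns` is an array of AT&T-syntax instruction strings with labels removed.
def classify_op(insns)
  body = insns.map(&:strip).reject(&:empty?)
  # Drop trailing nop padding so it does not hide the real last instruction.
  body.pop while body.last&.start_with?("nop")
  return :empty if body.empty?

  # Indexes of every ret, retl, retq or retw in the function body.
  ret_indexes = body.each_index.select { |i| body[i] =~ /\Aret[lqw]?\b/ }

  if ret_indexes == [body.size - 1]
    :ret_at_end     # exactly one ret, and it is the final instruction
  elsif ret_indexes.empty?
    :no_ret         # for example, the function ends in a tail jmp
  else
    :embedded_ret   # a ret that is not the single final instruction
  end
end

# The example from the comment above, with the .MyGoto label line removed.
example = ["cmp %eax, %ebx", "je .MyGoto", "ret", "jmp *(%ecx)"]
puts classify_op(example)   # prints "embedded_ret"
```

Only the `:ret_at_end` shape matches what dyngen expects. The `:embedded_ret` case is the one that would need the function to be transformed rather than just re-delimited.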

msliczniak commented 11 years ago

Did you try -mpush-args?

vasi commented 8 years ago

I got dyngen working with GCC 5.3.0 from MacPorts on OS X 10.11, arch i386: https://github.com/vasi/macemu/tree/gcc5

I basically just provided some extra arguments to DYNGEN_CC, and also strengthened our dyngen_barrier(). This is just a patch-up, not a general solution to all our future dyngen problems.

I haven't yet tested with other compilers, versions of compilers, operating systems, or architectures. Any opinions?

vasi commented 8 years ago

I've been using the following Ruby script under OS X to check for problems with the *-dyngen-ops.o files:

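#!/usr/bin/env ruby
# Print every function in the given *-dyngen-ops.o files that has a problem:
# not exactly one ret as the last non-nop instruction, or any push instruction.

# True if the function contains exactly one ret and only nops follow it.
def check_ret_at_end(lines)
  return false if lines.count { |l| l =~ /\sret.?\b/ } != 1
  lines.reverse.each do |line|
    next if line =~ /\snop.?\b/   # skip trailing nop padding
    return line =~ /\sret.?\b/
  end
  false
end

def check_file(file)
  IO.popen(['otool', '-tv', file]) do |io|
    # Split the disassembly into chunks, each starting at a label line.
    io.slice_before(/^\.?\w+:$/).each do |func|
      next if func.first =~ /^\./             # local label, not a function
      next if func.first =~ /impl_op_invoke/  # excluded from the check
      next if func.last =~ /__TEXT/           # otool section header

      if !check_ret_at_end(func) || !func.grep(/\spush.?\b/).empty?
        puts func
      end
    end
  end
end

ARGV.each { |f| check_file(f) }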
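
For readers unfamiliar with `slice_before`, the following sketch shows how the script groups disassembly lines into per-function chunks and then looks at the last non-nop instruction. The sample text is made up to resemble the general shape of `otool -tv` output; the labels `_op_good` and `_op_bad` and the addresses are hypothetical, not taken from a real dyngen build.

```ruby
# Made-up sample in the general shape of `otool -tv` output (hypothetical labels).
SAMPLE = <<~ASM
  _op_good:
  0000 movl %eax, %ebx
  0002 ret
  0003 nop
  _op_bad:
  0004 cmpl %eax, %ebx
  0006 je 0x9
  0008 ret
  0009 jmp *(%ecx)
ASM

# Group lines into chunks, each starting at a label line such as "_op_good:".
chunks = SAMPLE.lines.slice_before(/^\.?\w+:$/).to_a

# Map each label to whether its last non-nop instruction is a ret.
ends_in_ret = chunks.to_h do |chunk|
  name = chunk.first.strip.chomp(":")
  last_insn = chunk.drop(1).reverse.find { |l| l !~ /\snop.?\b/ }
  [name, !!(last_insn =~ /\sret.?\b/)]
end

ends_in_ret.each do |name, ok|
  puts "#{name}: #{ok ? 'ends in ret' : 'does not end in ret'}"
end
```

The script above is then run with one or more object files as arguments, and prints the full disassembly of each function that fails the check.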

vasi commented 8 years ago

I just tried on Ubuntu Trusty, with the builtin GCC 4.8. Seems to work targeting i386, but not yet targeting amd64.

vasi commented 8 years ago

In a few minutes of randomly trying things, I haven't figured out what's wrong with amd64. It would probably be easier if I could get mon working, but I'm not sure how. I'll look into it later.

bill-mcgonigle commented 8 years ago

Seeing something similar on gcc-5.3.1-2.fc22.x86_64, with and without JIT, on x86_64. Got the following crash/trace. Sounds a little bit like this, though it's outside my realm of expertise:

https://sourceware.org/ml/libc-alpha/2014-07/msg00020.html

Program received signal SIGSEGV, Segmentation fault.
__memset_avx2 () at ../sysdeps/x86_64/multiarch/memset-avx2.S:135
135             vmovdqu %ymm0, (%rax)
Missing separate debuginfos, use: dnf debuginfo-install adwaita-gtk2-theme-3.16.2-1.fc22.x86_64 elfutils-libelf-0.163-4.fc22.x86_64 elfutils-libs-0.163-4.fc22.x86_64 libxshmfence-1.2-1.fc22.x86_64
(gdb) thread appl all bt full

Thread 3 (Thread 0x7fffed691700 (LWP 13031)):
#0  0x000000397c4f73c1 in __GI_ppoll (fds=0x7840fa50, nfds=2, timeout=<optimized out>, sigmask=sigmask@entry=0x0) at ../sysdeps/unix/sysv/linux/ppoll.c:56
        resultvar = 18446744073709551102
        oldtype = 2
        tval = {tv_sec = -1, tv_nsec = 211669895678}
        result = <optimized out>
#1  0x0000003148824e8d in pa_mainloop_poll (__ss=0x0, __timeout=<optimized out>, __nfds=<optimized out>, __fds=<optimized out>) at /usr/include/bits/poll2.h:77
        ts = {tv_sec = 2017952768, tv_nsec = 2016402256}
        __func__ = "pa_mainloop_poll"
        __PRETTY_FUNCTION__ = "pa_mainloop_poll"
#2  0x0000003148824e8d in pa_mainloop_poll (m=m@entry=0x78478400) at pulse/mainloop.c:852
        ts = {tv_sec = 2017952768, tv_nsec = 2016402256}
        __func__ = "pa_mainloop_poll"
        __PRETTY_FUNCTION__ = "pa_mainloop_poll"
#3  0x000000314882547e in pa_mainloop_iterate (m=0x78478400, block=<optimized out>, retval=0x0) at pulse/mainloop.c:926
        r = 0
#4  0x0000003995835d4d in PULSE_WaitAudio (this=0x782fdb50) at src/audio/pulse/SDL_pulseaudio.c:310
        size = <optimized out>
#5  0x0000003995808950 in SDL_RunAudio (audiop=audiop@entry=0x782fdb50) at src/audio/SDL_audio.c:222
        audio = 0x782fdb50
        stream = 0x784c6730 ""
        stream_len = 8192
        udata = 0x0
        fill = 0x780830e0 <stream_func(void*, uint8*, int)>
        silence = 0
#6  0x00000039958111e8 in SDL_RunThread (data=0x782bc310) at src/thread/SDL_thread.c:204
        args = 0x782bc310
        userfunc = 0x3995808870 <SDL_RunAudio>
        userdata = 0x782fdb50
        statusloc = 0x784cf250
#7  0x00000039958546d9 in RunThread (data=<optimized out>) at src/thread/pthread/SDL_systhread.c:47
#8  0x000000397cc07555 in start_thread (arg=0x7fffed691700) at pthread_create.c:333
        __res = <optimized out>
        pd = 0x7fffed691700
        now = <optimized out>
        unwind_buf = 
              {cancel_jmp_buf = {{jmp_buf = {140737176475392, -765224196865786964, 140737488345327, 140737176475392, 8388608, 8388608, 765254009858030508, -787117124081550420}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = <optimized out>
        pagesize_m1 = <optimized out>
        sp = <optimized out>
        freesize = <optimized out>
#9  0x000000397c502b9d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 2 (Thread 0x7ffff012b700 (LWP 13032)):
#0  0x000000397c510eba in __clock_nanosleep (clock_id=clock_id@entry=0, flags=flags@entry=1, req=req@entry=0x782937f0, rem=rem@entry=0x0) at ../sysdeps/unix/sysv/linux/clock_nanosleep.c:48
        oldstate = 0
        r = <optimized out>
        rem = 0x0
        req = 0x782937f0
        flags = 1
        clock_id = <optimized out>
#1  0x000000007805ddbe in timer_func(void*) (arg=<optimized out>) at ../timer.cpp:585
        system_time = {tv_sec = 0, tv_nsec = 0}
#2  0x000000397cc07555 in start_thread (arg=0x7ffff012b700) at pthread_create.c:333
        __res = <optimized out>
        pd = 0x7ffff012b700
        now = <optimized out>
        unwind_buf = 
              {cancel_jmp_buf = {{jmp_buf = {140737221146368, -765224196865786964, 140737488345647, 140737221146368, 8388608, 0, 765190277912066988, -787117124081550420}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = <optimized out>
        pagesize_m1 = <optimized out>
        sp = <optimized out>
        freesize = <optimized out>
#3  0x000000397c502b9d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 1 (Thread 0x7ffff7f96a00 (LWP 12819)):
#0  0x000000397c56d40e in __memset_avx2 () at ../sysdeps/x86_64/multiarch/memset-avx2.S:135
#1  0x000000007807d06d in open_display() (__len=<optimized out>, __ch=0, __dest=<optimized out>) at /usr/include/bits/string3.h:90
        num = <optimized out>
        display_open = true
#2  0x000000007807d06d in open_display() () at video_x.cpp:1029
        num = <optimized out>
        display_open = true
#3  0x000000007807ee4c in VideoInit() () at video_x.cpp:1651
        vm_event_base = 0
        vm_error_base = 155
        has_fbdev_dga = <optimized out>
        default_mode = 133
        mode_str = <optimized out>
        default_width = 1024
        default_height = 768
        window_modes = <optimized out>
        screen_modes = <optimized out>
        p = 0x78298f2c
        __PRETTY_FUNCTION__ = "bool VideoInit()"
#4  0x0000000078050d84 in InitAll(char const*) (vmdir=vmdir@entry=0x0) at ../main.cpp:157
        i16 = <optimized out>
#5  0x0000000078052755 in main(int, char**) (argc=1, argv=0x7fffffffdd88) at main_unix.cpp:991
        str = "\b\000\000\000\000\000\000\000\210\335\377\377\377\177\000\000\b\000\000\000\000\000\000\000p\001e\201\071", '\000' <repeats 11 times>, "n4H|9\000\000\000[\000\000\000n", '\000' <repeats 19 times>, "\330IH|9\000\000\000`#,x", '\000' <repeats 20 times>, "\004MH|9\000\000\000\360\333\377\377\377\177\000\000 \253{|9\000\000\000 ", '\000' <repeats 15 times>, "\b\000\000\000\000\000\000\000\210\335\377\377\377\177\000\000\230\335\377\377\377\177\000\000p\001e\201\071", '\000' <repeats 11 times>, "\300\305\vx\000\000\000\000h6\016x\000\000\000\000"...
        memory_mapped_from_zero = <optimized out>
        ram_rom_areas_contiguous = <optimized out>
        vmdir = <optimized out>