opentk / opentk

The Open Toolkit library is a fast, low-level C# wrapper for OpenGL, OpenAL & OpenCL. It also includes windowing, mouse, keyboard and joystick input and a robust and fast math library, giving you everything you need to write your own renderer or game engine. OpenTK can be used standalone or inside a GUI on Windows, Linux, Mac.
https://opentk.net

OpenTK 3.3.2 fatal crash on OS X 10.11.6 #1329

Open Barbosik opened 3 years ago

Barbosik commented 3 years ago

Description

Previously I used OpenTK 1.1.4272, and it works OK on WinXP and OS X. Now I am trying to update to OpenTK 3.3.2 (it looks like this is the last version that supports .NET Framework 4.0).

OpenTK 3.3.2 runs faster on Win7, but it cannot start on OS X and fails with a fatal application crash.

Repro steps

  1. Create a simple OpenGL app that references OpenTK 3.3.2. Parameters for the GameWindow ctor:

    partial class App : GameWindow
    {
        public App(CommandService commandService, ConsoleService consoleService)
            : base(1024, 768, new OpenTK.Graphics.GraphicsMode(
                new ColorFormat(8,8,8,8), 24, 8, 16), "OpenGL")
        {
  2. Run it on Win7

  3. Run it on WinXP

  4. Run it on OS X

Expected behavior

The application runs OK on all three OSes.

Actual behavior

Application runs on WinXP and Win7 with no issue, but crashes at startup on OS X.

mono OpenGL.exe
Stacktrace:

  at <unknown> <0xffffffff>
  at OpenTK.Platform.MacOS.NSApplication..cctor () <0x00327>
  at (wrapper runtime-invoke) object.runtime_invoke_void (object,intptr,intptr,intptr) <0xffffffff>
  at <unknown> <0xffffffff>
  at OpenTK.Platform.MacOS.MacOSFactory..ctor () <0x0001b>
  at OpenTK.Platform.Factory..ctor () <0x0009b>
  at OpenTK.Toolkit.Init (OpenTK.ToolkitOptions) <0x001cf>
  at OpenTK.Toolkit.Init () <0x00013>
  at OpenTK.Platform.Factory..cctor () <0x0000b>
  at (wrapper runtime-invoke) object.runtime_invoke_void (object,intptr,intptr,intptr) <0xffffffff>
  at <unknown> <0xffffffff>
  at OpenTK.DisplayDevice..cctor () <0x0002b>
  at (wrapper runtime-invoke) object.runtime_invoke_void (object,intptr,intptr,intptr) <0xffffffff>
  at <unknown> <0xffffffff>
  at OpenTK.GameWindow..ctor (int,int,OpenTK.Graphics.GraphicsMode,string) <0x0000b>
  at OpenGL.App..ctor (Common.Services.Commands.CommandService,Common.Services.Consoles.ConsoleService) <0x003bb>
  at OpenGL.Program/<>c__DisplayClassc.<MainSafe>b__7 () <0x0004f>
  at Common.ThreadHelper.ExecuteSafe (System.Action,System.Action`1<bool>) <0x00075>
  at Common.ThreadHelper.ExecuteSafe (System.Action) <0x0001b>
  at OpenGL.Program.MainSafe (string[]) <0x00567>
  at OpenGL.Program.Main (string[]) <0x00077>
  at (wrapper runtime-invoke) <Module>.runtime_invoke_void_object (object,intptr,intptr,intptr) <0xffffffff>

Native stacktrace:

Debug info from gdb:

(lldb) command source -s 0 '/tmp/mono-gdb-commands.Dinjkx'
Executing commands in '/tmp/mono-gdb-commands.Dinjkx'.
(lldb) process attach --pid 517
warning: (i386) /Library/Frameworks/Mono.framework/Versions/4.2.1/lib/mono/4.5/mscorlib.dll.dylib empty dSYM file detected, dSYM was created with an executable with no debug info.
Process 517 stopped
* thread #1: tid = 0x2479, 0x9060dcee libsystem_kernel.dylib`__wait4 + 10, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
    frame #0: 0x9060dcee libsystem_kernel.dylib`__wait4 + 10
libsystem_kernel.dylib`__wait4:
->  0x9060dcee <+10>: jae    0x9060dcfe                ; <+26>
    0x9060dcf0 <+12>: calll  0x9060dcf5                ; <+17>
    0x9060dcf5 <+17>: popl   %edx
    0x9060dcf6 <+18>: movl   0x11f0932f(%edx), %edx

Executable module set to "/usr/local/bin/mono".
Architecture set to: i386-apple-macosx.
(lldb) thread list
Process 517 stopped
* thread #1: tid = 0x2479, 0x9060dcee libsystem_kernel.dylib`__wait4 + 10, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
  thread #2: tid = 0x247c, 0x9060d3ea libsystem_kernel.dylib`__psynch_cvwait + 10
  thread #3: tid = 0x247d, 0x906064d6 libsystem_kernel.dylib`semaphore_wait_trap + 10
  thread #4: tid = 0x247f, 0x9060dd5e libsystem_kernel.dylib`__workq_kernreturn + 10
  thread #5: tid = 0x2480, 0x9060e7fa libsystem_kernel.dylib`kevent_qos + 10, queue = 'com.apple.libdispatch-manager'
  thread #6: tid = 0x2481, 0x9060dd5e libsystem_kernel.dylib`__workq_kernreturn + 10
  thread #7: tid = 0x2488, 0x9060dd5e libsystem_kernel.dylib`__workq_kernreturn + 10
  thread #8: tid = 0x2489, 0x9060d3ea libsystem_kernel.dylib`__psynch_cvwait + 10
  thread #9: tid = 0x2497, 0x9060dd5e libsystem_kernel.dylib`__workq_kernreturn + 10
(lldb) thread backtrace all
* thread #1: tid = 0x2479, 0x9060dcee libsystem_kernel.dylib`__wait4 + 10, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
  * frame #0: 0x9060dcee libsystem_kernel.dylib`__wait4 + 10
    frame #1: 0x9c18b7e0 libsystem_c.dylib`waitpid$UNIX2003 + 48
    frame #2: 0x0015928d mono`mono_handle_native_sigsegv(signal=11, ctx=0x007b3fe0, info=0x007b3fa0) + 541 at mini-exceptions.c:2193 [opt]
    frame #3: 0x001a5742 mono`mono_arch_handle_altstack_exception(sigctx=<unavailable>, siginfo=<unavailable>, fault_addr=<unavailable>, stack_ovf=0) + 162 at exceptions-x86.c:1097 [opt]
    frame #4: 0x000a6f2e mono`mono_sigsegv_signal_handler(_dummy=<unavailable>, _info=<unavailable>, context=<unavailable>) + 446 at mini-runtime.c:2461 [opt]
    frame #5: 0x9597879b libsystem_platform.dylib`_sigtramp + 43
    frame #6: 0x0012047b mono`mono_local_cprop(cfg=<unavailable>) + 331 at local-propagation.c:85 [opt]
    frame #7: 0x000a1689 mono`mini_method_compile(method=<unavailable>, opts=<unavailable>, domain=<unavailable>, flags=<unavailable>, parts=2062621392, aot_method_index=<unavailable>) + 4057 at mini.c:3539 [opt]
    frame #8: 0x000a39bf mono`mono_jit_compile_method_inner(method=0x7b52e130, target_domain=<unavailable>, opt=378628607, jit_ex=<unavailable>) + 655 at mini.c:4063 [opt]
    frame #9: 0x000a68b2 mono`mono_jit_compile_method_with_opt(method=<unavailable>, opt=<unavailable>, ex=<unavailable>) + 738 at mini-runtime.c:1894 [opt]
    frame #10: 0x000a6579 mono`mono_jit_compile_method(method=0x7b52e130) + 57 at mini-runtime.c:1931 [opt]
    frame #11: 0x0015ad3e mono`common_call_trampoline [inlined] common_call_trampoline_inner(m=<unavailable>, vt=0x004af000, vtable_slot=0x00000000) + 1096 at mini-trampolines.c:570 [opt]
    frame #12: 0x0015a8f6 mono`common_call_trampoline(regs=<unavailable>, code=<unavailable>, m=<unavailable>, vt=0x004af000, vtable_slot=0x00000000) + 70 at mini-trampolines.c:684 [opt]
    frame #13: 0x0015a8a4 mono`mono_magic_trampoline(regs=0xbff652f0, code="\x89Eԋ\x05T??z\x89E??\x04$`\x03y", arg=0x7b52e130, tramp=0x00000000) + 52 at mini-trampolines.c:699 [opt]
    frame #14: 0x006ce088

  thread #2: tid = 0x247c, 0x9060d3ea libsystem_kernel.dylib`__psynch_cvwait + 10
    frame #0: 0x9060d3ea libsystem_kernel.dylib`__psynch_cvwait + 10
    frame #1: 0x926e2538 libsystem_pthread.dylib`_pthread_cond_wait + 757
    frame #2: 0x926e4276 libsystem_pthread.dylib`pthread_cond_wait$UNIX2003 + 71
    frame #3: 0x002cc852 mono`thread_func(thread_data=0x00000000) + 466 at sgen-thread-pool.c:118 [opt]
    frame #4: 0x926e1780 libsystem_pthread.dylib`_pthread_body + 138
    frame #5: 0x926e16f6 libsystem_pthread.dylib`_pthread_start + 155
    frame #6: 0x926def7a libsystem_pthread.dylib`thread_start + 34

  thread #3: tid = 0x247d, 0x906064d6 libsystem_kernel.dylib`semaphore_wait_trap + 10
    frame #0: 0x906064d6 libsystem_kernel.dylib`semaphore_wait_trap + 10
    frame #1: 0x002ef34a mono`mono_sem_wait(sem=0x003cf08c, alertable=1) + 26 at mono-semaphore.c:109 [opt]
    frame #2: 0x0026c44e mono`finalizer_thread(unused=0x00000000) + 158 at gc.c:1096 [opt]
    frame #3: 0x0024666c mono`start_wrapper [inlined] start_wrapper_internal + 463 at threads.c:723 [opt]
    frame #4: 0x0024649d mono`start_wrapper(data=<unavailable>) + 29 at threads.c:770 [opt]
    frame #5: 0x002f6ee0 mono`inner_start_thread(arg=<unavailable>) + 240 at mono-threads-posix.c:97 [opt]
    frame #6: 0x926e1780 libsystem_pthread.dylib`_pthread_body + 138
    frame #7: 0x926e16f6 libsystem_pthread.dylib`_pthread_start + 155
    frame #8: 0x926def7a libsystem_pthread.dylib`thread_start + 34

  thread #4: tid = 0x247f, 0x9060dd5e libsystem_kernel.dylib`__workq_kernreturn + 10
    frame #0: 0x9060dd5e libsystem_kernel.dylib`__workq_kernreturn + 10
    frame #1: 0x926e134b libsystem_pthread.dylib`_pthread_wqthread + 1289
    frame #2: 0x926def56 libsystem_pthread.dylib`start_wqthread + 34

  thread #5: tid = 0x2480, 0x9060e7fa libsystem_kernel.dylib`kevent_qos + 10, queue = 'com.apple.libdispatch-manager'
    frame #0: 0x9060e7fa libsystem_kernel.dylib`kevent_qos + 10
    frame #1: 0x915217ea libdispatch.dylib`_dispatch_mgr_invoke + 234
    frame #2: 0x915213be libdispatch.dylib`_dispatch_mgr_thread + 52

  thread #6: tid = 0x2481, 0x9060dd5e libsystem_kernel.dylib`__workq_kernreturn + 10
    frame #0: 0x9060dd5e libsystem_kernel.dylib`__workq_kernreturn + 10
    frame #1: 0x926e134b libsystem_pthread.dylib`_pthread_wqthread + 1289
    frame #2: 0x926def56 libsystem_pthread.dylib`start_wqthread + 34

  thread #7: tid = 0x2488, 0x9060dd5e libsystem_kernel.dylib`__workq_kernreturn + 10
    frame #0: 0x9060dd5e libsystem_kernel.dylib`__workq_kernreturn + 10
    frame #1: 0x926e134b libsystem_pthread.dylib`_pthread_wqthread + 1289
    frame #2: 0x926def56 libsystem_pthread.dylib`start_wqthread + 34

  thread #8: tid = 0x2489, 0x9060d3ea libsystem_kernel.dylib`__psynch_cvwait + 10
    frame #0: 0x9060d3ea libsystem_kernel.dylib`__psynch_cvwait + 10
    frame #1: 0x926e2538 libsystem_pthread.dylib`_pthread_cond_wait + 757
    frame #2: 0x926e4276 libsystem_pthread.dylib`pthread_cond_wait$UNIX2003 + 71
    frame #3: 0x002d2245 mono`_wapi_handle_timedwait_signal_handle(handle=<unavailable>, timeout=<unavailable>, alertable=<unavailable>, poll=<unavailable>) + 485 at handles.c:1609 [opt]
    frame #4: 0x002d22f8 mono`_wapi_handle_wait_signal_handle(handle=0x00000108, alertable=1) + 40 at handles.c:1554 [opt]
    frame #5: 0x002e1d4d mono`wapi_WaitForSingleObjectEx(handle=<unavailable>, timeout=<unavailable>, alertable=2067997228) + 493 at wait.c:194 [opt]
    frame #6: 0x00242622 mono`ves_icall_System_Threading_WaitHandle_WaitOne_internal [inlined] mono_wait_uninterrupted(alertable=1) + 34 at threads.c:1447 [opt]
    frame #7: 0x00242600 mono`ves_icall_System_Threading_WaitHandle_WaitOne_internal(this=0x00806fe8, handle=<unavailable>, ms=-1, exitContext=0) + 80 at threads.c:1581 [opt]
    frame #8: 0x007249d8
    frame #9: 0x01b0a028 mscorlib.dll.dylib`System_Threading_WaitHandle_WaitOne + 104
    frame #10: 0x0071f3e2
    frame #11: 0x0194a19d mscorlib.dll.dylib`System_Threading_ThreadHelper_ThreadStart_Context_object + 189
    frame #12: 0x01948955 mscorlib.dll.dylib`System_Threading_ExecutionContext_RunInternal_System_Threading_ExecutionContext_System_Threading_ContextCallback_object_bool + 421
    frame #13: 0x019487a4 mscorlib.dll.dylib`System_Threading_ExecutionContext_Run_System_Threading_ExecutionContext_System_Threading_ContextCallback_object_bool + 52
    frame #14: 0x0194871b mscorlib.dll.dylib`System_Threading_ExecutionContext_Run_System_Threading_ExecutionContext_System_Threading_ContextCallback_object + 91
    frame #15: 0x0194a20d mscorlib.dll.dylib`System_Threading_ThreadHelper_ThreadStart_object + 77
    frame #16: 0x00717e5f
    frame #17: 0x000a9d9a mono`mono_jit_runtime_invoke(method=<unavailable>, obj=<unavailable>, params=<unavailable>, exc=<unavailable>) + 714 at mini-runtime.c:2334 [opt]
    frame #18: 0x0026e54f mono`mono_runtime_invoke(method=0x7c44e258, obj=0x008110b8, params=<unavailable>, exc=<unavailable>) + 127 at object.c:2783 [opt]
    frame #19: 0x00273a4c mono`mono_runtime_delegate_invoke(delegate=0x008110b8, params=<unavailable>, exc=<unavailable>) + 92 at object.c:3494 [opt]
    frame #20: 0x002466e5 mono`start_wrapper [inlined] start_wrapper_internal + 584 at threads.c:729 [opt]
    frame #21: 0x0024649d mono`start_wrapper(data=<unavailable>) + 29 at threads.c:770 [opt]
    frame #22: 0x002f6ee0 mono`inner_start_thread(arg=<unavailable>) + 240 at mono-threads-posix.c:97 [opt]
    frame #23: 0x926e1780 libsystem_pthread.dylib`_pthread_body + 138
    frame #24: 0x926e16f6 libsystem_pthread.dylib`_pthread_start + 155
    frame #25: 0x926def7a libsystem_pthread.dylib`thread_start + 34

  thread #9: tid = 0x2497, 0x9060dd5e libsystem_kernel.dylib`__workq_kernreturn + 10
    frame #0: 0x9060dd5e libsystem_kernel.dylib`__workq_kernreturn + 10
    frame #1: 0x926e134b libsystem_pthread.dylib`_pthread_wqthread + 1289
    frame #2: 0x926def56 libsystem_pthread.dylib`start_wqthread + 34
(lldb) detach

=================================================================
Got a SIGSEGV while executing native code. This usually indicates
a fatal error in the mono runtime or one of the native libraries 
used by your application.
=================================================================

Process 517 detached
(lldb) Abort trap: 6

Related information

Any idea on how to fix it?

Barbosik commented 3 years ago

Minimal test app source code to reproduce the issue:

using System;
using System.Drawing;
using OpenTK;
using OpenTK.Graphics;
using OpenTK.Graphics.OpenGL;

namespace TEST
{
    class Program
    {
        static void Main(string[] args)
        {
            using (var app = new App())
                app.Run();
        }
    }

    class App : GameWindow
    {
        public App()
            : base(1024, 768, new OpenTK.Graphics.GraphicsMode(
                new ColorFormat(8, 8, 8, 8), 24, 8, 16), "OpenGL")
        {
        }

        protected override void OnRenderFrame(FrameEventArgs e)
        {
            GL.ClearColor(Color.Green);
            GL.Clear(ClearBufferMask.ColorBufferBit);
            base.SwapBuffers();
        }
    }
}

NogginBops commented 3 years ago

This sounds like it might be a regression with 3.3.2, probably related to this PR https://github.com/opentk/opentk/pull/1309

I can't tell from the version number alone but maybe the api used in that PR doesn't exist on your OS X version?

Could you try 3.3.1 and see if you have better luck with that version?

Barbosik commented 3 years ago

Tried with the different OpenTK versions available on NuGet.

OpenTK 2.0.0 works OK on OS X
OpenTK 3.0.0 crashes on OS X
OpenTK 3.1.0 crashes on OS X
OpenTK 3.2.0 crashes on OS X
OpenTK 3.3.1 crashes on OS X
OpenTK 3.3.2 crashes on OS X

I don't see any versions between 2.0.0 and 3.0.0. Something changed between these versions, and it leads to the crash...

NogginBops commented 3 years ago

Thanks a lot for testing the different versions, that is valuable information to have.

Where I would look next is to see if it's possible to find any more detailed information on exactly where it's crashing here at OpenTK.Platform.MacOS.NSApplication..cctor () <0x00327>

Maybe setting some kind of breakpoint through a debugger or stepping through the assembly could be a way of doing that (if there is no better/easier way)

Barbosik commented 3 years ago

I can't run a debugger on OS X. I tried to compare the code and tested some suspicious points (Cocoa.SetFloat, NS.LoadLibrary, NSApplication.cctor: Cocoa.SendVoid(Handle, Selector.Get("discardEventsMatchingMask:beforeEvent:"))), but it looks like these points are OK.

It looks very strange, because at a glance the code is almost the same for 2.0.0 and 3.3.2, but for some unknown reason version 3.3.2 crashes...

Update: added detailed logging and found that the crash happens within this if block (NSApplication.cctor):

if (Cocoa.SendIntPtr(Handle, Selector.Get("mainMenu")) == IntPtr.Zero)
{
}

I added more logging to catch exactly where it happens. More details will be available soon (compilation takes a lot of time).

Barbosik commented 3 years ago

Found it. The crash happens at this line:

var quitMenuItem = Cocoa.SendIntPtr(Cocoa.SendIntPtr(Class.Get("NSMenuItem"), Selector.Alloc),
    Selector.Get("initWithTitle:action:keyEquivalent:"), Cocoa.ToNSString("Quit"), selQuit, Cocoa.ToNSString("q"));

Exactly the same code works OK in v2.0.0, but crashes in v3.3.2...

I tried commenting out this block of code:

/*
var quitMenuItem = Cocoa.SendIntPtr(Cocoa.SendIntPtr(Class.Get("NSMenuItem"), Selector.Alloc),
    Selector.Get("initWithTitle:action:keyEquivalent:"), Cocoa.ToNSString("Quit"), selQuit, Cocoa.ToNSString("q"));

    Cocoa.SendIntPtr(appMenu, Selector.Get("addItem:"), quitMenuItem);
    Cocoa.SendIntPtr(menuItem, Selector.Get("setSubmenu:"), appMenu);
*/

After that, it crashes on this line:

// Initialize and register the settings dictionary
settings =
    Cocoa.SendIntPtr(settings, Selector.Get("initWithObjectsAndKeys:"),
    //momentum_scrolling, Cocoa.ToNSString("AppleMomentumScrollSupported"),
    press_and_hold, Cocoa.ToNSString("ApplePressAndHoldEnabled"),
    IntPtr.Zero);

So I commented out all code related to the settings variable, but after that it crashes here:

Factory.ctor exit
Toolkit.Init exit
Factory.cctor exit
Stacktrace:

  at <unknown> <0xffffffff>
  at OpenTK.Platform.MacOS.CocoaNativeWindow..cctor () <0x00553>
  at (wrapper runtime-invoke) object.runtime_invoke_void (object,intptr,intptr,intptr) <0xffffffff>
  at <unknown> <0xffffffff>

I also tried copying the Platform/MacOS folder into the old OpenTK 1.1 source. I commented out some code related to the joystick, and it works on OS X!

It looks like something is going wrong in the initialization sequence. Any idea how to fix it?

leezer3 commented 3 years ago

Your commenting isn't actually too helpful :) It's crashing as Cocoa has been passed an invalid pointer at some stage, and isn't liking the fact. This is just the first place it surfaces.

The more interesting question is where the invalid pointer is coming from in the first place.

Some quick skating through the OS-X commits appears to show that the following changed the OS-X backend between 2.0 and 3.0:

https://github.com/opentk/opentk/commit/8ec577b9ca25117af326a7131a51a11a46ffdf99#diff-a94ba05d16c7a8aa059e72c763bdfb1452983d89509d3b32689d3c9684a9d230
https://github.com/opentk/opentk/commit/91b03ddcd5c127fc2a4db7cd62e5b6220ae11148#diff-a94ba05d16c7a8aa059e72c763bdfb1452983d89509d3b32689d3c9684a9d230
https://github.com/opentk/opentk/commit/2d31165baa865c44cba30ba1d217e9f814b1a44e#diff-a94ba05d16c7a8aa059e72c763bdfb1452983d89509d3b32689d3c9684a9d230
https://github.com/opentk/opentk/commit/e598ab28174f7cd7d1134542e6a1581703d2a85a#diff-a94ba05d16c7a8aa059e72c763bdfb1452983d89509d3b32689d3c9684a9d230
https://github.com/opentk/opentk/commit/286119ea68e19f473c29783f88ecf6decbc07d68#diff-a94ba05d16c7a8aa059e72c763bdfb1452983d89509d3b32689d3c9684a9d230
https://github.com/opentk/opentk/commit/ea3dd481a57badec89fe5da001e58af0b7649199#diff-a94ba05d16c7a8aa059e72c763bdfb1452983d89509d3b32689d3c9684a9d230
https://github.com/opentk/opentk/commit/44293245ec3fa273631ad93d6c910857e2fade43#diff-a94ba05d16c7a8aa059e72c763bdfb1452983d89509d3b32689d3c9684a9d230
https://github.com/opentk/opentk/commit/6e67734cc1d322d2384778b4e305190a55efdd9f#diff-a94ba05d16c7a8aa059e72c763bdfb1452983d89509d3b32689d3c9684a9d230

The only one which seems remotely suspicious to me is this: https://github.com/opentk/opentk/commit/8ec577b9ca25117af326a7131a51a11a46ffdf99#diff-a94ba05d16c7a8aa059e72c763bdfb1452983d89509d3b32689d3c9684a9d230

The ppy branch has had troubles with this patch, although not crashing: https://github.com/ppy/osuTK/pull/63/files

Barbosik commented 3 years ago

I tested these changes as a first step. I tried adding these changes https://github.com/opentk/opentk/commit/8ec577b9ca25117af326a7131a51a11a46ffdf99#diff-a94ba05d16c7a8aa059e72c763bdfb1452983d89509d3b32689d3c9684a9d230

to the old version v1.1, and it works OK on OS X with them. I also tried copying the entire Platform/MacOS source folder from version 3.3.2 into 1.1. That requires commenting out some code related to the joystick, because the old interfaces don't have that functionality, but after that the new Platform/MacOS code from version 3.3.2 works OK on OS X with the old OpenTK.

leezer3 commented 3 years ago

You'd be better off checking out commits from 2.0 onwards (go up say 30 odd at a time and go back if it's broken) and building that way.

Applying to 1.0 will probably create other messes :)

Barbosik commented 3 years ago

You're right, but each solution takes time to build and space on the disk. Still, it looks like updating from 2.0 to 3.0 step by step, testing as I go, is a more effective way to find the problematic point than testing separate changes on 1.1 or 3.3.

leezer3 commented 3 years ago

It shouldn't take any more disk space if you're doing it right.

That's one of the beauties of Git: everything ever done is stored in the repo and can be checked out with a single command. https://www.git-tower.com/learn/git/faq/git-checkout-commits/

If you let us know what client, we can probably do better instructions :)

Barbosik commented 3 years ago

The v2.0.0 build is OK, but the v3.0.0 build fails. I start build.cmd, but it fails with errors related to newer C# syntax (property initializers). I tried rewriting the code for Generator.Bind and Generator.Rewrite to use the old syntax, but now it shows the same errors for OpenTK.GLControl, and fixing that would take too much time.

Is there a way to enable the newer C# syntax for build.cmd? I tried adding LangVersion 9, but it doesn't work with build.cmd. I tried adding framework: net48 to paket.dependencies, but that doesn't help either. I'm using VS2019 with the .NET 4.8 SDK installed. It looks like the problem is that build.cmd from v3.0.0 uses the csc command line from .NET 4.0, which doesn't support the newer C# sugar.

NogginBops commented 3 years ago

This declaration and use is highly suspicious.

public extern static void SendVoid(IntPtr receiver, IntPtr selector, uint uint1, IntPtr intPtr1);

Cocoa.SendVoid(Handle, Selector.Get("discardEventsMatchingMask:beforeEvent:"), uint.MaxValue, IntPtr.Zero);

The Apple documentation says that the discardEventsMatchingMask:beforeEvent: method takes an NSEventMask, which is defined as an unsigned long long, which means it's a 64-bit value. But the C# declaration uses uint instead of ulong.

I can't test this hypothesis as I don't have a mac, but if you could try changing it to be:

public extern static void SendVoid(IntPtr receiver, IntPtr selector, ulong ulong1, IntPtr intPtr1);

Cocoa.SendVoid(Handle, Selector.Get("discardEventsMatchingMask:beforeEvent:"), ulong.MaxValue, IntPtr.Zero);

and see if it runs that would be amazing.
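
To make the width difference concrete, here is a minimal sketch in plain C# with no OpenTK or Cocoa calls. MaskWidthDemo and Widen are hypothetical names used only for illustration. It shows that uint is 4 bytes and ulong is 8, and that a 32-bit all-ones mask is not the same value as a 64-bit all-ones mask once widened. Whether this mismatch is what actually crashes on OS X 10.11 remains a hypothesis.

```csharp
using System;

// Hypothetical helper for illustration; not part of OpenTK.
static class MaskWidthDemo
{
    // Implicit uint -> ulong conversion zero-extends: the upper 32 bits become 0.
    public static ulong Widen(uint mask)
    {
        return mask;
    }

    static void Main()
    {
        Console.WriteLine("sizeof(uint)  = " + sizeof(uint));   // 4
        Console.WriteLine("sizeof(ulong) = " + sizeof(ulong));  // 8
        Console.WriteLine("IntPtr.Size   = " + IntPtr.Size);    // 4 in a 32-bit process, 8 in a 64-bit one

        ulong widened = Widen(uint.MaxValue);
        Console.WriteLine(widened.ToString("X16"));             // 00000000FFFFFFFF
        Console.WriteLine(widened == ulong.MaxValue);           // False
    }
}
```

So even if the call itself survives, uint.MaxValue and ulong.MaxValue describe different masks when the callee reads 64 bits.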

leezer3 commented 3 years ago

I can build/ test for regression, but that'd be on a High Sierra hackintosh.

No comment on how much effort to spend on 10.11 though, that's well out of support.

NogginBops commented 3 years ago

What I'm surprised by is the fact that this doesn't crash on more modern macOS versions. It's clearly calling the function with the wrong parameter types, but I'm guessing the C ABI on x86 is saving us here?

Or it's still broken in some configurations, like 32-bit or 64-bit.

leezer3 commented 3 years ago

I've tested on both x32 and x64 with no ill effects (although not the repro above)

Suggests to me it's going to be thread dependent somehow as to whether it crashes or not. Possibly the minimal repro is calling fewer setup functions on something?

NogginBops commented 3 years ago

Suggests to me it's going to be thread dependent somehow as to whether it crashes or not. Possibly the minimal repro is calling fewer setup functions on something?

Why would it be thread dependent?

leezer3 commented 3 years ago

Because the broken function causes the OS to purge the pending event queue. Speculation here, but it looks to me as if it's creating a mangled pointer to 'somewhere' in the event queue as a result of the mistyping. A minimal repro would have few or no events pending, whereas something like a full game will probably have various things spinning before the window init and this call.

NogginBops commented 3 years ago

Ah OK, I see now. I thought you meant that the function would fail to be called properly depending on some thread thing. I was thinking that it was this function call that caused the crash, but yes, I agree that it's likely the function call has some unintended behavior that only shows up later in the program and causes the crash.

Barbosik commented 2 years ago

I caught the issue on 32-bit OS X, and it can be reproduced reliably with a minimal single-threaded app. Here is the source code of a test that reproduces the issue:

using System;
using System.Drawing;
using OpenTK;
using OpenTK.Graphics;
using OpenTK.Graphics.OpenGL;

namespace TEST
{
    class Program
    {
        static void Main(string[] args)
        {
            using (var app = new App())
                app.Run();
        }
    }

    class App : GameWindow
    {
        public App()
            : base(1024, 768, new OpenTK.Graphics.GraphicsMode(
                new ColorFormat(8, 8, 8, 8), 24, 8, 16), "OpenGL")
        {
        }

        protected override void OnRenderFrame(FrameEventArgs e)
        {
            GL.ClearColor(Color.Green);
            GL.Clear(ClearBufferMask.ColorBufferBit);
            base.SwapBuffers();
        }
    }
}

I tried to remove this call:

Cocoa.SendVoid(Handle, Selector.Get("discardEventsMatchingMask:beforeEvent:"), uint.MaxValue, IntPtr.Zero);

It doesn't affect the issue. I also tried adding this call to the old v1.1, and it works OK with the call as well.

Unfortunately I got sick and temporarily cannot continue investigating. I got stuck trying to build v3.0.0, because the build fails on newer C# syntax. For some unknown reason build.cmd uses the csc compiler from .NET Framework 4.0, which doesn't support the newer C# syntax, and I couldn't find a way to fix that.

leezer3 commented 2 years ago

Build.cmd works here (Windows 10, current VS2019). Build.sh also seems to work OK on Debian.

What system are you running?

You might also want to try checking out 3.x into a completely clean folder to rule out something from 1.x getting in the way.

leezer3 commented 2 years ago

Another smell: https://github.com/opentk/opentk/blob/3.x/src/OpenTK/Platform/MacOS/Quartz/DisplayServices.cs#L132-L157

I can't see an immediate reason why the NSPoint should be float on 32-bit, double on 64-bit. The NSPoint can be of int, float or double type & this function doesn't seem to require any of them specifically.


Some further questions for @Barbosik

Mistik commented 2 years ago

@Barbosik what is your email? I think I have a solution to the issue and wish to discuss with you

NogginBops commented 2 years ago

@Mistik is there any reason why you can't suggest this solution here in this issue?

NogginBops commented 2 years ago

As we don't have a solution for this ATM, I'm going to postpone it to 3.3.4 (if we are fixing this issue, that is).

Mistik commented 1 year ago

@Barbosik did you manage to resolve the issue?

leezer3 commented 1 year ago

See my linked comment? If your menu bar is hidden, this may well be causing the crash.

Barbosik commented 1 year ago

Hi, I'm working on my own OpenGL binding and found that there is a discrepancy in struct alignment between the x86 and x64 Linux/MacOS platforms. I haven't tested it in depth on MacOS yet, but at a glance it looks like it may be the reason for the crash on the x86 platform.

The issue here is that Linux-like OSes use 4-byte struct alignment on x86 and 8-byte struct alignment on x64. That is a nasty problem, because there is no way in C# to set a StructLayout Pack size that depends on the x86/x64 platform. So a proper binding needs two sets of structs and two interop definitions, one for x86 and one for x64.
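
The dependence of layout on pointer size can be observed from managed code. The following is a minimal sketch, assuming a hypothetical Sample struct (not an OpenTK or Xlib type) and the default packing, i.e. no explicit Pack value. It prints the marshalled size and the offset of the pointer field, both of which differ between a 32-bit and a 64-bit process.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical struct for illustration only; not an OpenTK or Xlib type.
// No Pack value is given, so each field is aligned to its natural size.
[StructLayout(LayoutKind.Sequential)]
struct Sample
{
    public int a;     // 4 bytes
    public IntPtr p;  // 4 bytes in a 32-bit process, 8 bytes in a 64-bit one
    public int b;     // 4 bytes
}

// Hypothetical helper class for illustration.
static class LayoutDemo
{
    // Unmanaged (marshalled) size of Sample in the current process.
    public static int SizeOfSample()
    {
        return Marshal.SizeOf(typeof(Sample));
    }

    // Byte offset of the pointer field inside the unmanaged layout.
    public static int OffsetOfP()
    {
        return Marshal.OffsetOf(typeof(Sample), "p").ToInt32();
    }

    static void Main()
    {
        // 32-bit process: size 12, offset 4.
        // 64-bit process: size 24, offset 8 (4 bytes of padding after 'a' and after 'b').
        Console.WriteLine("IntPtr.Size    = " + IntPtr.Size);
        Console.WriteLine("sizeof(Sample) = " + SizeOfSample());
        Console.WriteLine("offset of p    = " + OffsetOfP());
    }
}
```

Running the same check with Marshal.SizeOf and Marshal.OffsetOf on a real interop struct, in both a 32-bit and a 64-bit process, is one way to compare a managed declaration against the native sizeof values.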

But I found a workaround hack that allows the same struct definition to be used, at least in some cases. I use StructLayout Pack=2 or Pack=4 (depending on the exact struct) and change some field types so that the struct layout matches the original native layout on both x86 and x64. Luckily it works for the most commonly used structs. Here is an example for Linux:

    //original
    // sizeof(x86)=92:  [StructLayout(LayoutKind.Sequential, Pack=4)]
    // sizeof(x64)=136: [StructLayout(LayoutKind.Sequential, Pack = 8)]
    internal struct XWindowAttributes
    {
        public int x, y;
        public int width, height;
        public int border_width;
        public int depth;
        public IntPtr visual;
        public IntPtr root; 
        public int c_class;
        public int bit_gravity;
        public int win_gravity;
        public int backing_store;
        public UIntPtr backing_planes;
        public UIntPtr backing_pixel;
        [MarshalAs(UnmanagedType.Bool)]
        public bool save_under;
        public IntPtr colormap;
        [MarshalAs(UnmanagedType.Bool)]
        public bool map_installed;
        public int map_state;
        public IntPtr all_event_masks;
        public IntPtr your_event_mask;
        public IntPtr do_not_propagate_mask;
        [MarshalAs(UnmanagedType.Bool)]
        public bool override_redirect;
        public IntPtr screen;
    }

I replaced with the following universal struct layout:

    // Note: alignment hack, sizeof and fieldoffset are ok for x86 & x64
    // use private fields through union
    [StructLayout(LayoutKind.Sequential, Pack = 2)]
    internal struct XWindowAttributes
    {
        public int x, y;
        public int width, height;
        public int border_width;
        public int depth;
        public IntPtr visual;
        public IntPtr root;
        public int c_class;
        public int bit_gravity;
        public int win_gravity;
        public int backing_store;
        public UIntPtr backing_planes;
        public UIntPtr backing_pixel;
        //[MarshalAs(UnmanagedType.Bool)]
        //public bool save_under;
        private IntPtr save_under;
        public IntPtr colormap;
        [MarshalAs(UnmanagedType.Bool)]
        public bool map_installed;
        public int map_state;
        public IntPtr all_event_masks;
        public IntPtr your_event_mask;
        public IntPtr do_not_propagate_mask;
        //[MarshalAs(UnmanagedType.Bool)]
        //public bool override_redirect;  // Bool: sizeof(x86)=4, sizeof(x64)=4
        private IntPtr override_redirect;
        public IntPtr screen;
        private byte _hackSize;
    }

As you can see, I replaced some bool fields with IntPtr and added a byte field, and in this way I got a struct layout that works OK on both Linux platforms, x86 and x64.

I believe this crash on x86 MacOS is related to some struct layout discrepancy... All struct layouts need to be checked on the different platforms.

Mistik commented 1 year ago

Hi Barbosik

Got your email. Could you also email me your telegram so I can contact you?

Mistik commented 5 months ago

Hi Barbosik, please send me your telegram so we can test the fix