golang / go

runtime: support dlclose with -buildmode=c-shared #11100

Open mattn opened 9 years ago

mattn commented 9 years ago
package main

import (
    "C"
    "fmt"
)

var (
    c chan string
)

func init() {
    c = make(chan string)
    go func() {
        n := 1
        for {
            switch {
            case n%15 == 0:
                c <- "FizzBuzz"
            case n%3 == 0:
                c <- "Fizz"
            case n%5 == 0:
                c <- "Buzz"
            default:
                c <- fmt.Sprint(n)
            }
            n++
        }
    }()
}

//export fizzbuzz
func fizzbuzz() *C.char {
    return C.CString(<-c)
}

func main() {
}

build this with

$ go build -buildmode=c-shared -o libfizzbuzz.so libfizzbuzz.go

then run this Python 2 script:

from ctypes import *
import _ctypes
lib = CDLL("./libfizzbuzz.so")
lib.fizzbuzz.restype = c_char_p
print lib.fizzbuzz()
print lib.fizzbuzz()
print lib.fizzbuzz()
print lib.fizzbuzz()
print lib.fizzbuzz()
print lib.fizzbuzz()
_ctypes.dlclose(lib._handle)

Output:

1
2
Fizz
4
Buzz
Fizz
Segmentation fault

minux commented 9 years ago

I don't understand. What do you expect otherwise? The code is still running, and you've unmapped its pages.

In general, it's impossible to dlclose a Go shared library (or a plugin).

mattn commented 9 years ago

Even if I close(c) and wait for the goroutine to exit, it still reproduces.

minux commented 9 years ago

You can't stop the runtime, which uses its own OS threads.

ianlancetaylor commented 9 years ago

I think this is a legitimate feature request. Although we currently do not support calling dlclose on a Go shared library, and it would be difficult to make it work, it is not fundamentally impossible.

minux commented 9 years ago

It's possible if the user can be certain it doesn't hold onto any Go objects and all resources allocated by Go code have been freed (esp. no background goroutines).

However, as there is no way to kill a goroutine, I think any sufficiently sophisticated Go shared library will not be unloadable.

For example, the os/signal package contains a background goroutine to check for newly arrived signals. Any use of os/signal.Notify will leave the Go shared library un-unloadable. I'm sure there are other cases.

What's more, the runtime can't reliably detect whether it's safe to unload the Go shared library, so rather than let dlclose potentially trigger a segmentation fault later, I'd rather make dlclose always fail and document that.

mattn commented 9 years ago

This also reproduces the crash.

package main

import (
        "C"
        "fmt"
)

var (
        c chan string
        q chan struct{}
)

func init() {
        c = make(chan string)
        q = make(chan struct{})
        go func() {
                defer func() {
                        recover()
                        q <- struct{}{}
                }()
                n := 1
                for {
                        switch {
                        case n%15 == 0:
                                c <- "FizzBuzz"
                        case n%3 == 0:
                                c <- "Fizz"
                        case n%5 == 0:
                                c <- "Buzz"
                        default:
                                c <- fmt.Sprint(n)
                                println("stop")
                        }
                        n++
                }
        }()
}

//export fizzbuzz
func fizzbuzz() *C.char {
        return C.CString(<-c)
}

//export finish
func finish() {
        close(c) // makes the goroutine above panic with "send on closed channel"
        <-q      // wait for the goroutine to exit
}

func main() {
}

Python 2 script:

from ctypes import *
import _ctypes
lib = CDLL("./libfizzbuzz.so")
lib.fizzbuzz.restype = c_char_p
print lib.fizzbuzz()
print lib.fizzbuzz()
print lib.fizzbuzz()
print lib.fizzbuzz()
print lib.fizzbuzz()
print lib.fizzbuzz()
lib.finish()
_ctypes.dlclose(lib._handle)

mattn commented 9 years ago

I also added runtime.LockOSThread() at the top of init(), but it did not help.
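
For comparison, the generator goroutine can be stopped cooperatively instead of by provoking a send on a closed channel. This is a minimal sketch, assuming a hypothetical `start` helper and `done` channel that are not part of the code above. It only shows the goroutine exiting cleanly; it does not make dlclose safe, because the runtime's own threads keep running.

```go
package main

import "fmt"

// label returns the FizzBuzz text for n.
func label(n int) string {
	switch {
	case n%15 == 0:
		return "FizzBuzz"
	case n%3 == 0:
		return "Fizz"
	case n%5 == 0:
		return "Buzz"
	default:
		return fmt.Sprint(n)
	}
}

// start launches the generator goroutine. The returned stop function
// (a hypothetical helper) asks the goroutine to exit and blocks until it
// has returned. stop must be called at most once.
func start() (out <-chan string, stop func()) {
	c := make(chan string)
	done := make(chan struct{})
	exited := make(chan struct{})
	go func() {
		defer close(exited)
		for n := 1; ; n++ {
			select {
			case c <- label(n): // a receiver took the value
			case <-done: // shutdown requested while blocked on the send
				return
			}
		}
	}()
	return c, func() {
		close(done)
		<-exited
	}
}

func main() {
	out, stop := start()
	for i := 0; i < 6; i++ {
		fmt.Println(<-out)
	}
	stop()
}
```

The `select` lets the goroutine leave even while it is blocked on the send, which a plain `c <- value` cannot do.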

gcatlin commented 9 years ago

I have a need for this functionality too. My goal is to be able to make changes to a running Go game engine without needing to completely reload the game.

The implementation idea is to have 2 layers, the platform layer and the game layer.

The platform layer is GOOS/GOARCH-specific and is written in C. It is responsible for the game loop wherein it gathers controller input and manages image and sound buffers that it provides to the game layer for writing. It uses OS-provided functionality to output graphics and sound.

The game layer is platform agnostic, contains game logic only, and is written in Go. It is called by the platform layer once per iteration of the game loop. The game layer takes the controller input and buffers provided by the platform layer, updates the game state, and writes to the buffers.

The game layer is a shared library, built using -buildmode=c-shared, that can be edited and recompiled at any time. When the platform layer detects that the shared library was modified, it unloads the previous version of the library (dlclose) then loads the new version of the library (dlopen).

See https://gist.github.com/gcatlin/e09359f6e53f37e74a82
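
The change-detection step can be sketched on its own. The platform layer described above is written in C; the sketch below uses Go purely for illustration and assumes a hypothetical `watcher` type and library name `libgame.so`. It only detects that the file changed and leaves the unload and reload step as a comment.

```go
package main

import (
	"fmt"
	"os"
	"time"
)

// watcher remembers the last modification time seen for one file.
type watcher struct {
	path string
	last time.Time
}

// changed reports whether the file's modification time has advanced
// since the previous call. The first successful call reports true.
func (w *watcher) changed() (bool, error) {
	info, err := os.Stat(w.path)
	if err != nil {
		return false, err
	}
	if info.ModTime().After(w.last) {
		w.last = info.ModTime()
		return true, nil
	}
	return false, nil
}

func main() {
	// Hypothetical library name. A real host would unload and reload the
	// library where this sketch only prints.
	w := &watcher{path: "libgame.so"}
	for i := 0; i < 3; i++ {
		switch ok, err := w.changed(); {
		case err != nil:
			fmt.Println("stat failed:", err)
		case ok:
			fmt.Println("library changed: reload here")
		}
		time.Sleep(100 * time.Millisecond)
	}
}
```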

I'm trying this on darwin/amd64 and getting the following error when calling dlopen on the shared library for the second time (i.e. after detecting that the shared library has changed):

runtime/cgo: could not obtain pthread_keys
    tried 0x101 0x102 0x103 0x104 0x105 0x106 0x107 0x108 0x109 0x10a 0x10b 0x10c 0x10d 0x10e 0x10f 0x110 0x111 0x112 0x113 0x115 0x116 0x117 0x118 0x119 0x11a 0x11b 0x11c 0x11d 0x11e 0x11f 0x120 0x121 0x122 0x123 0x124 0x125 0x126 0x127 0x128 0x129 0x12a 0x12b 0x12c 0x12d 0x12e 0x12f 0x130 0x131 0x132 0x133 0x134 0x135 0x136 0x137 0x138 0x139 0x13a 0x13b 0x13c 0x13d 0x13e 0x13f 0x140 0x141 0x142 0x143 0x144 0x145 0x146 0x147 0x148 0x149 0x14a 0x14b 0x14c 0x14d 0x14e 0x14f 0x150 0x151 0x152 0x153 0x154 0x155 0x156 0x157 0x158 0x159 0x15a 0x15b 0x15c 0x15d 0x15e 0x15f 0x160 0x161 0x162 0x163 0x164 0x165 0x166 0x167 0x168 0x169 0x16a 0x16b 0x16c 0x16d 0x16e 0x16f 0x170 0x171 0x172 0x173 0x174 0x175 0x176 0x177 0x178 0x179 0x17a 0x17b 0x17c 0x17d 0x17e 0x17f 0x180 0x181
fatal error: cgo callback before cgo call

Is there a different way to achieve this with Go?

z505 commented 7 years ago

Minux said:

It's possible if the user can be certain it doesn't hold onto any Go objects and all resources allocated by Go code have been freed (esp. no background goroutines).

However, as there is no way to kill a goroutine, I think any sufficiently sophisticated Go shared library will not be unloadable.

Hi, so how does Go know it is safe for the exe/ELF to close if a user kills the application or hits the close button on a Win32 GUI window? There must be some way of finding out whether it is safe. Or does a Go program just exit badly, leaving data hanging around?
Or how do you halt/exit a Go app safely (an exe, not a DLL)? Would exiting a Go app (exe) safely be the same as unloading a DLL safely?

Does the Go runtime not keep a reference count of the goroutines currently running? Keeping track of N goroutines should not be a performance hit: each goroutine started increments a ref count by 1. Maybe something like this is already implemented and could be tapped into, so that DLLs can tell when all goroutines have finished.

Minux says:

For example, the os/signal package contains a background goroutine to check for newly arrived signals.

Can you increment a reference counter and unload the library later, once all signals are finished (goroutine ref count hits zero)? How does the exe/ELF normally exit safely and kill the program if goroutines could still be running? Simply make the DLL unload function act the way a normal Go exe safely exits? Easier said than done, likely :-)

glycerine commented 7 years ago

Minux wrote,

You can't stop the runtime, which uses its own OS threads.

This is the root of the problem, @z505. I don't believe there is any graceful shutdown of the runtime threads under normal process exit/termination, so they may still hold references to DLL memory. Process termination currently has no safety problem: an exit() call simply and unceremoniously stops the process, and doesn't need to do any graceful cleanup.

or go program just exits badly with hanging data around?

Yep.

brutestack commented 6 years ago

I'm using a Go c-shared library in my Unity3D game. Unity itself does not support unloading libraries, but when you try to exit the Unity Editor or a Unity-based game, it waits until all libraries complete their work (that is, until all threads stop).

If the Go c-shared library creates at least one goroutine, Unity waits forever and cannot exit. This does not happen if the library does not use goroutines. So Go creates a thread for my goroutine and never terminates it, even when there is nothing left to do on that thread.

Here is sample library code to reproduce:

package main

import "C"

import (
    "fmt" // needed for the fmt.Println calls below
    "time"
)

//export TestLib
func TestLib() {
    go func() {
        for i := 0; i < 10; i++ {
            fmt.Println("testlib.dll is here")
            time.Sleep(3 * time.Second)
        }
        fmt.Println("testlib.dll thread Done")
    }()
}

func main() {
}

I'm compiling the library on Linux like this:

env GOOS=windows GOARCH=amd64 CGO_ENABLED=1 CXX=x86_64-w64-mingw32-g++ CC=x86_64-w64-mingw32-gcc go build -o $(go env GOPATH)/lib/win64/testlib.dll -buildmode c-shared unity3d.com/testlib

Here is the C# class that uses this library:

using System;
using System.Runtime.InteropServices;
using UnityEngine;

public class TestLibrary : IDisposable {
    public TestLibrary()
    {
        ServerProcess();
    }

    private void ServerProcess()
    {
        TestLib();
    }

    public void Dispose()
    {
    }

    [DllImport ("testlib")]
    private extern static void TestLib ();

}

Unity is based on Mono, hence the C#.

There is a very, very dirty trick that fixes my problem and lets my game exit without waiting forever: I implemented a function in my Go library that panics, so the library crashes when I need to stop everything and exit the game. Nobody should actually do that, but it works. Additional code in the Go library:

//export Panic
func Panic() {
    var panicChan chan bool
    close(panicChan)
}

The worst of it is that I don't know what exactly crashes after calling Panic() from C#: Unity (Mono) or the Go library, but I suspect Unity. Unfortunately, a crash is the only way to exit a Unity game that uses a Go c-shared library with goroutines...
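
The trick relies on `close` panicking when given a nil channel. The sketch below isolates that behavior and recovers the panic so the message can be inspected; `closeNil` is a hypothetical name. In the workaround above nothing recovers the panic, and an unrecovered Go panic normally terminates the whole process.

```go
package main

import "fmt"

// closeNil closes a nil channel and returns the resulting panic message.
// recover works here only because the panic happens on this goroutine.
func closeNil() (msg string) {
	defer func() {
		if r := recover(); r != nil {
			msg = fmt.Sprint(r)
		}
	}()
	var ch chan bool // nil: declared but never created with make
	close(ch)        // panics at run time
	return "no panic"
}

func main() {
	fmt.Println(closeNil()) // prints "close of nil channel"
}
```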

Another workaround is to compile the Go code as an executable and run it in a separate process, but this is not suitable for the iOS version of my game, because iOS does not allow executing separate processes (even ones included in the same application bundle) without using private APIs.

glycerine commented 5 years ago

I'll attempt this. Background from https://groups.google.com/forum/#!topic/golang-nuts/L-tby34r5Gs

Ian wrote

Jason wrote> Where does the source for the runtime scheduler and garbage collector live these days?

The central locations are runtime/proc.go and runtime/mgc.go.

Jason wrote> Wherefore, I need to locate all runtime background threads and add in a means to shut them down upon request.

That's not currently supported, but it may be possible to modify the scheduler to do it. There is no simple way.

Ian

glycerine commented 5 years ago

After some thinking, I realized that the main use case is really during development: to unload a Go DLL, and then re-load a modified version of that DLL; this being done during coding and evolving the DLL. We don't really want to stop the runtime, because we'll just then need to restart it upon the re-load of the newer version of the DLL.

So I propose the following, more general, approach to supporting dlclose with -buildmode=c-shared:

a) On Windows, additionally during build of Go from source, build the runtime as a distinct DLL. Once the runtime DLL is loaded, it will never be unloaded, even if client Go DLLs that depend on the runtime are unloaded. This solves the tricky part of trying to halt the runtime cleanly, because we don't need to do that after all.

b) On Windows, build Go DLLs as libraries that are clients of the Go runtime DLL. Each client Go DLL will dynamically load the Go runtime DLL if it is not already loaded, taking care to prevent a race on load by some means when there are two or more Go DLLs loaded during process start (I expect this to be a common race; perhaps the first to claim a pre-agreed upon localhost port wins the race and gets to load the runtime DLL from some well known location in GOROOT). Each Go DLL will increment the reference count on the runtime twice, so that Windows never unloads the runtime DLL, even if the client Go DLL is unloaded by a dlclose() call. A distinct new buildmode may be indicated, to distinguish it from c-shared which currently bundles another copy of the runtime into every DLL, and probably needs to continue to do so for backwards compatibility. Suggestion: buildmode=common-runtime-dll.

c) the extra benefit: now multiple DLLs when loaded all share the same runtime, and so the possibility of communicating via channels between DLLs becomes viable.
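
The double-increment argument in (b) can be checked with a toy model of per-module reference counts. This is not the Windows loader API: the `loader` type and the module name `go-runtime.dll` are assumptions made for illustration.

```go
package main

import "fmt"

// loader is a toy model of an OS loader's per-module reference counts.
type loader struct {
	refs map[string]int
}

func newLoader() *loader { return &loader{refs: make(map[string]int)} }

// load increments the module's count, as a library-load call would.
func (l *loader) load(name string) { l.refs[name]++ }

// free decrements the count and reports whether the module was unmapped.
func (l *loader) free(name string) bool {
	if l.refs[name] == 0 {
		return false // not loaded
	}
	l.refs[name]--
	if l.refs[name] == 0 {
		delete(l.refs, name)
		return true
	}
	return false
}

// mapped reports whether the module is still loaded.
func (l *loader) mapped(name string) bool { return l.refs[name] > 0 }

func main() {
	l := newLoader()
	// A client DLL takes two references on the runtime when it loads...
	l.load("go-runtime.dll")
	l.load("go-runtime.dll")
	// ...and gives back only one when the client itself is unloaded.
	l.free("go-runtime.dll")
	fmt.Println("runtime still mapped:", l.mapped("go-runtime.dll")) // true
}
```

Because every client leaves one reference behind, the count never returns to zero, which is the whole point of the scheme and also why the runtime can never be reclaimed.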

@ianlancetaylor @alexbrainman @minux and anyone else with wisdom to contribute: Feedback welcome.

dchest commented 5 years ago

@glycerine that may be the main use case for some, but others are writing shared libraries in Go to plug into other programs (e.g. see @brutestack's case above, or from my personal experience and @mattn's description for this issue, a library that is used from other programming languages) and run them in production, where they don't control how the library is used. I think your proposal, while useful for some cases, doesn't solve the issue.

glycerine commented 5 years ago

@dchest: This issue is about dlclose() support. The plan is perfectly compatible with plugging into non Go host programs. If you don’t understand why, ask about the specific point where you are confused, so that we can clarify. Also divide and conquer wise, we can’t do everything at once, so I’m okay with no runtime shutdown just prior to program termination. Especially as a workaround is already posted above.

ianlancetaylor commented 5 years ago

@glycerine What you sketch out above seems like what you should get from

go install -buildmode=shared runtime
go build -buildmode=shared -linkshared PKG

(except for the double incrementing of the runtime package).

It might be interesting to see how well that works.

alexbrainman commented 5 years ago

with wisdom to contribute

I don't have anything wise to contribute. Sorry.

Alex

glycerine commented 5 years ago

Thanks Ian, thanks Alex.

Good lead! We've started a little repo to track what we've got working (https://github.com/glycerine/guestdll). We managed to build the runtime and guest DLLs. Currently stumbling on a missing main.init symbol.

go install -buildmode=shared runtime sync/atomic  ## creates /usr/local/go/pkg/linux_amd64_dynlink/libruntime,sync-atomic.so
# python to try the load:
from ctypes import *
import _ctypes
runtime = '/usr/local/go/pkg/linux_amd64_dynlink/libruntime,sync-atomic.so'
CDLL(runtime)
## OSError: /usr/local/go/pkg/linux_amd64_dynlink/libruntime,sync-atomic.so: undefined symbol: main.init

I suspect we will need to tell the runtime not to invoke main.init and main.main somehow.

Update: see the readme, https://github.com/glycerine/guestdll/blob/master/README.md , for the latest progress. We aren't sure if we are initializing the runtime correctly. We assumed, perhaps incorrectly, that we could start it just as a regular exe started it. But we might need to start somewhere else. In other words, instead of at _rt0_amd64_linux, the executable entry point pointed to by the ELF header.

brutestack commented 5 years ago

@glycerine, your approach is good during the development process, but it doesn't solve problems in production, where DLLs are included in third-party software (for example a game engine) and you don't control dlclose at all, as @dchest mentioned. In my case Unity3D calls dlclose on application exit for both DLLs (runtime and package) and hangs forever waiting until the runtime threads stop, which will never happen.

glycerine commented 5 years ago

@brutestack There's never any expectation of "controlling" dlclose, and we always expect to be hosted by non-Go code. Waiting around after dlclose() returns is Unity's bug, for which you posted a reasonable workaround. Ideally we won't need to stop the runtime. Nevertheless, I note it may be quite doable if it becomes unavoidable:

During sweep termination and mark termination, the garbage collector already has a mechanism in use to "stop-the-world". So this would seem like an ideal time to also check if a dlclose() is in progress. I assume though that stopping the world also means stopping the runtime background threads. It may not include them, perhaps if they are not users of garbage collection. Nonetheless, this would seem like a natural place to kill threads when a dlclose is indicated.

ianlancetaylor commented 5 years ago

Stopping the world does not stop the sysmon thread. The sysmon thread is written to work even if the garbage collector has stopped the world.

glycerine commented 5 years ago

@zdjones For full shutdown, in addition to GC stop-the-world, the path that the panic activity itself takes may be a useful example to study, as per the workaround above it does seem to stop most everything. Specifically, I'm looking at the variables panicking and runPanicDefers in runtime/proc.go.

zdjones commented 5 years ago

@brutestack I don't have any experience with it, but have you looked into declaring a routine in your DLL specifying __attribute__((destructor))? It sounds like it may fit your use case.

References: the dlclose man page and the GCC function attributes documentation.

brutestack commented 5 years ago

@zdjones, what should I do inside the destructor to stop all Go threads legally? The Unity3D engine hangs before exit because it waits for all child threads (even those in libraries) to stop, but the threads serving goroutines never stop, even when all goroutines are done and even after dlclose. @glycerine may be right that Unity3D should not wait and that this is a bug. But I'm sure Go must have some legal way to stop all thread pools and do other clean up things legally (without panic).

glycerine commented 5 years ago

But I'm sure Go must have some legal way to stop all thread pools and do other clean up things legally (without panic).

@brutestack There is not, presently. Panic is the closest we've got. The Go runtime is just not very used to being inside a library.

So I suspect that we'll have to add a method to indicate whether full shutdown is needed, or if the runtime should keep going, upon dlclose(). But one step at a time. We're currently trying to figure out how to get the runtime initialized after it has been dynamically loaded at runtime.

If manual halt of threads is required, then we actually have to figure out how to do that, which as you can probably guess from the above, will take some surgery. Linux C libraries use pthread_cancel, pthread_exit, or have the thread return from its starting routine. I'm not sure what the equivalent raw system calls are that don't use pthreads. Probably have to read the pthreads source. On Windows it is more obvious, as the OS provides a TerminateThread() API.

Note to self: we may well also need to add a mechanism to have the runtime not squash the host's signal handlers. I had to deal with this before when embedding a Go .so library inside R. R, as a C host, expects to receive SIGINT which the Go runtime, at least by default, overwrites. (c.f. https://github.com/glycerine/rmq/blob/master/src/cpp/interface.cpp#L41 through L80)

ianlancetaylor commented 5 years ago

On Unix systems you can kill a thread (not, of course, a goroutine) by sending it a SIGKILL signal, via whatever system call is invoked by pthread_kill (on GNU/Linux this is tgkill). Or you can kill it less aggressively by sending it some other signal.

The guidelines for signal handlers can be seen at https://golang.org/pkg/os/signal/#hdr-Non_Go_programs_that_call_Go_code . It should work already, and it's unlikely that adding a knob will help.
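
For readers unfamiliar with the os/signal API those guidelines refer to, here is a minimal Unix-only sketch of subscribing to a signal with `signal.Notify` and ending delivery with `signal.Stop`. The helper name `receiveUSR1` and the choice of SIGUSR1 are assumptions; the sketch runs as an ordinary Go program and does not reproduce the c-shared handler behavior itself.

```go
package main

import (
	"fmt"
	"os"
	"os/signal"
	"syscall"
	"time"
)

// receiveUSR1 subscribes to SIGUSR1, sends that signal to the current
// process, and reports whether it arrived on the channel. Unix-only.
func receiveUSR1() bool {
	ch := make(chan os.Signal, 1) // buffered so delivery never blocks
	signal.Notify(ch, syscall.SIGUSR1)
	defer signal.Stop(ch) // stop relaying signals to ch

	if err := syscall.Kill(syscall.Getpid(), syscall.SIGUSR1); err != nil {
		return false
	}
	select {
	case <-ch:
		return true
	case <-time.After(2 * time.Second):
		return false
	}
}

func main() {
	fmt.Println("received SIGUSR1:", receiveUSR1())
}
```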

glycerine commented 5 years ago

Thanks Ian!

The guidelines for signal handlers can be seen at https://golang.org/pkg/os/signal/#hdr-Non_Go_programs_that_call_Go_code .

"Go code built with -buildmode=c-archive or -buildmode=c-shared will not install any other signal handlers by default."

Since this doesn't specifically mention -buildmode=shared, I assume that we will just need to make shared work like c-shared? I see a lot of places in runtime/proc.go where c-shared and c-archive are special cased, but little-to-no mention of shared handling.

ianlancetaylor commented 5 years ago

For -shared I think you have to ensure that all the signal handlers are installed by a runtime that will never be closed.

glycerine commented 5 years ago

For -shared I think you have to ensure that all the signal handlers are installed by a runtime that will never be closed.

Ian, would you mind elaborating on this? Why would signal handlers prevent a -shared runtime from being shut down? Could the signal handlers not be uninstalled first?

ianlancetaylor commented 5 years ago

What I mean is: a -buildmode=shared runtime is intended to be run by a Go program. The main Go program should be handling the signal handlers. The -buildmode=shared shared library should not be handling the signal handlers. (Unless the -buildmode=shared library is itself the instance of the runtime package used by the main program, in which case shutting it down is tantamount to exiting the program.)

glycerine commented 5 years ago

a -buildmode=shared runtime is intended to be run by a Go program.

Ah. There's where I was thinking differently. We used -buildmode=shared to build separate go-runtime.dll and go-client.dll, so I was thinking that these would both be loaded by the host C program. I suppose that's not viable? Is it possible to have separate go-runtime.dll and goclient.dll that are buildmode=c-shared ?

Windows is truly bizarre. Apparently it is indeed possible to lock a DLL permanently in memory, even after the host program exits... The author of this blog uses my double-reference count suggestion to keep his DLL in memory... and then says there is even a stronger approach -- one immune to double FreeLibrary calls -- by pinning the DLL to thread 0 so it stays in memory while the current desktop is alive... just wow.

https://blogs.msmvps.com/vandooren/2006/10/09/preventing-a-dll-from-being-unloaded-by-the-app-that-uses-it/

"The right way that I described is indeed the right way on pre-XP systems. On XP or later you can use GetModuleHandleEx with the GET_MODULE_HANDLE_EX_FLAG_PIN flag to prevent unloading of the DLL. The advantage is that the calling app cannot unload the DLL by calling FreeLibrary twice. The disadvantage is that you cannot unload the DLL anymore, even if you should want to."

ianlancetaylor commented 5 years ago

It sounds like you are thinking about using -buildmode=shared to build a shared library that is then shared by other Go packages built using -buildmode=c-shared -linkshared. That might work to some extent but I agree that at present signal handlers will not work well.

There is no way at present to separate the Go runtime and Go client into separate shared libraries built with -buildmode=c-shared. The intent with c-shared is to provide a self-contained library that a C program can use.

I guess I'm not sure what real advantage you get by splitting out the Go runtime into a separate shared library. It's theoretically interesting but I don't know who would want to do that in practice.

glycerine commented 5 years ago

I guess I'm not sure what real advantage you get by splitting out the Go runtime into a separate shared library. It's theoretically interesting but I don't know who would want to do that in practice.

The main advantage I was after in building the runtime as a separate library was that two or more Go DLLs (both clients of the go-runtime.dll) loaded into one process could communicate (say over channels), because they would share the same runtime.

ianlancetaylor commented 5 years ago

But since they are c-shared, they can only provide a C API. Returning a channel from a Go function exported by a c-shared library is not permitted by the pointer-passing rules, so the only way you could get a channel from one library to the other would be through shenanigans.
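
The usual way to stay within the pointer-passing rules is to hand C an opaque integer that stands for a Go value; the standard library offers `runtime/cgo.Handle` for this since Go 1.17. The sketch below hand-rolls the same idea with a hypothetical `registry` type so that it runs without cgo. Such a handle is only meaningful inside the runtime that issued it, so it does not let two separate c-shared libraries, each with its own runtime, share a channel.

```go
package main

import (
	"fmt"
	"sync"
)

// registry maps opaque integer handles to Go channels so that only the
// integer, never a Go pointer, has to cross a C boundary.
type registry struct {
	mu    sync.Mutex
	next  uintptr
	chans map[uintptr]chan string
}

func newRegistry() *registry {
	return &registry{chans: make(map[uintptr]chan string)}
}

// put stores ch and returns a handle that identifies it.
func (r *registry) put(ch chan string) uintptr {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.next++
	r.chans[r.next] = ch
	return r.next
}

// get looks up a handle; ok is false for unknown or released handles.
func (r *registry) get(h uintptr) (ch chan string, ok bool) {
	r.mu.Lock()
	defer r.mu.Unlock()
	ch, ok = r.chans[h]
	return ch, ok
}

// release forgets the handle so the channel can be garbage collected.
func (r *registry) release(h uintptr) {
	r.mu.Lock()
	defer r.mu.Unlock()
	delete(r.chans, h)
}

func main() {
	r := newRegistry()
	h := r.put(make(chan string, 1))

	// Code holding only the integer handle can still reach the channel.
	if ch, ok := r.get(h); ok {
		ch <- "hello"
		fmt.Println(<-ch)
	}
	r.release(h)
	_, ok := r.get(h)
	fmt.Println("still registered:", ok) // false
}
```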

glycerine commented 5 years ago

I was thinking of shared libraries that would provide Go API to Go callers in the same address space.

The other big reason for thinking along these lines -- of having the Go runtime library separate from the user-written guest library -- is dlclose(); the idea being that if the guest/client Go DLLs can be closed separately, but have the runtime persist, that might solve the crashes (presumably due to the runtime having threads that are going when suddenly their code pages disappear). I think this would work as long as the guest library can shutdown its own goroutines. For example, when DllMain() is called with DLL_PROCESS_DETACH the guest/user library shuts down all its goroutines before returning. So long as the runtime still has its code pages mapped, its code would not crash presumably.

Addendum: and then if the guest reloads (expected often), the guest could just plug into the same runtime again, presumably starting up faster.
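
The "guest library shuts down its own goroutines" step could look roughly like the sketch below. It assumes every goroutine is started through a hypothetical `workers` type and honors context cancellation; how `shutdown` would be wired to DLL_PROCESS_DETACH or a destructor is left open.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

// workers owns every goroutine the library starts (hypothetical type).
type workers struct {
	ctx    context.Context
	cancel context.CancelFunc
	wg     sync.WaitGroup
}

func newWorkers() *workers {
	ctx, cancel := context.WithCancel(context.Background())
	return &workers{ctx: ctx, cancel: cancel}
}

// spawn starts f; f must return promptly once its context is cancelled.
func (w *workers) spawn(f func(ctx context.Context)) {
	w.wg.Add(1)
	go func() {
		defer w.wg.Done()
		f(w.ctx)
	}()
}

// shutdown cancels all workers and waits up to timeout for them to
// return. It reports false if some goroutine ignored the cancellation.
func (w *workers) shutdown(timeout time.Duration) bool {
	w.cancel()
	done := make(chan struct{})
	go func() {
		w.wg.Wait()
		close(done)
	}()
	select {
	case <-done:
		return true
	case <-time.After(timeout):
		return false
	}
}

// tick is an example worker that does periodic work until cancelled.
func tick(ctx context.Context) {
	ticker := time.NewTicker(10 * time.Millisecond)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			// periodic work would go here
		}
	}
}

func main() {
	w := newWorkers()
	w.spawn(tick)
	fmt.Println("clean shutdown:", w.shutdown(time.Second))
}
```

The timeout matters because nothing can force a goroutine to stop: a worker that never checks its context makes `shutdown` report false, and unloading would still be unsafe.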

ianlancetaylor commented 5 years ago

If there is a single shared runtime library, how can we decide which goroutines are associated with which library? Say we send a function across a channel and start a goroutine that runs a loop calling that function. In general a single goroutine might cross between code in different libraries. Once we start permitting that, I don't see how we can ever release a library completely.
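
The scenario is easy to reproduce inside one program. In this sketch `runner` stands in for code from one library and `fromB` for code from another; both names are hypothetical. The goroutine is started by the first but ends up executing the second, so neither library could be unloaded while it runs.

```go
package main

import "fmt"

// runner stands in for code living in "library A": it executes whatever
// functions arrive on the channel and reports each result.
func runner(jobs <-chan func() string, results chan<- string) {
	for job := range jobs {
		results <- job()
	}
	close(results)
}

// fromB stands in for a function whose code lives in "library B".
func fromB() string { return "code from library B" }

func main() {
	jobs := make(chan func() string)
	results := make(chan string)
	go runner(jobs, results)

	jobs <- fromB // the goroutine started by A now runs B's code
	fmt.Println(<-results)

	close(jobs)
	<-results // unblocks once runner has closed results
}
```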

glycerine commented 5 years ago

That's hard to argue with.

So at the moment the solution/workaround for "supporting" dlclose() seems to be to allow it but to not really close. On Windows, at any rate, we can do this by pinning the shared library into memory using one or more of the technique(s) cited above from https://blogs.msmvps.com/vandooren/2006/10/09/preventing-a-dll-from-being-unloaded-by-the-app-that-uses-it/ and then, on actual full process shutdown, if need be to address Unity bugs, use the panic() approach, and hope we are last to unload since nobody else will get to unload after us (the process is gone after a panic).

Since this approach addresses my needs, I'm not going to attempt anything more elaborate.

typeless commented 5 years ago

go install -buildmode=shared runtime
go build -buildmode=shared -linkshared PKG

I tested with go install -buildmode=shared runtime sync/atomic and go build -buildmode=shared -linkshared log and it built successfully. What surprised me is the size of liblog.so:

-rw-rw-r-- 1 mura mura 8.7M May  6 10:45 liblog.so

Dids commented 4 years ago

Any updates on this?

In my specific case, I was hoping to use the plugin system for a kind of "hot module replacement" system, with the idea being that you can edit any "module", which will automatically trigger a recompilation of the plugin, as well as "reloading" the plugin after it has successfully compiled. All without having to restart the main Go application. I know I could achieve the same with various scripting languages (or even RPC), but I'd much rather do everything with Go.

So far the only workaround I've found is to version the plugins, even just incrementing them works. This of course has the downside of previous plugin versions not being unloaded (as far as I know?), but considering these "modules" would be very constrained and small in my case, I'm wondering if this would still be acceptable?
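
The versioning workaround needs a fresh file name per rebuild, because `plugin.Open` returns the existing plugin when a path has already been opened. A minimal sketch of the naming step follows; `versionedPath` and the path `plugins/mod.so` are hypothetical, and the build and `plugin.Open` calls are left out.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// versionedPath inserts a version number before the extension, so each
// rebuild gets a file name that has not been loaded before.
// For example, ("plugins/mod.so", 3) becomes "plugins/mod.v3.so".
func versionedPath(path string, version int) string {
	ext := filepath.Ext(path)
	base := strings.TrimSuffix(path, ext)
	return fmt.Sprintf("%s.v%d%s", base, version, ext)
}

func main() {
	// Hypothetical module path. A reloader following the workaround would
	// build each new version to this path and then open it.
	for v := 1; v <= 3; v++ {
		fmt.Println(versionedPath("plugins/mod.so", v))
	}
}
```

Every earlier version stays mapped, so memory use grows with each reload, as noted above.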

ianlancetaylor commented 4 years ago

There are no updates. Any updates will appear on this issue.

jtarchie commented 3 years ago

Is this a platform specific issue? I'm on Mac Intel and have not been able to reproduce this at all.

I've tried the original implementation (using go 1.17) and using it as a shared library in Ruby, LuaJIT, and Python. It works fine, no Segfault.

package main

import (
    "C"
    "fmt"
)

var (
    c chan string
)

func init() {
    c = make(chan string)
    go func() {
        n := 1
        for {
            switch {
            case n%15 == 0:
                c <- "FizzBuzz"
            case n%3 == 0:
                c <- "Fizz"
            case n%5 == 0:
                c <- "Buzz"
            default:
                c <- fmt.Sprint(n)
            }
            n++
        }
    }()
}

//export fizzbuzz
func fizzbuzz() *C.char {
    return C.CString(<-c)
}

func main() {}

Compile into shared Library.

go build -buildmode=c-shared -o libfizzbuzz.so libfizzbuzz.go

Then run in Ruby 3.0.0

require 'fiddle'

libfb = Fiddle.dlopen("libfizzbuzz.so")
fb    = Fiddle::Function.new(
  libfb['fizzbuzz'],
  [],
  Fiddle::TYPE_VOIDP
)

puts fb.call
puts fb.call
puts fb.call
puts fb.call
puts fb.call

libfb.close

nightlark commented 3 years ago

Is this a platform specific issue? I'm on Mac Intel and have not been able to reproduce this at all.

I tried it on WSL2, and it didn't segfault -- though it doesn't seem to be closing properly.

Here's a simple C shared library I compiled to compare against the behavior of the Go fizzbuzz program:

int c = 0;
int notfizzbuzz() {
        c = c + 1;
        return c;
}

With this C shared library, I'm able to dlclose the library in Python3 and then reopen it, and the numbers start counting from 0 again. If I dlclose the library and then try to call the function imported from the library, I get a segfault (as expected).

With the Go shared library, after calling dlclose if I open the library again, the numbers returned from fizzbuzz don't restart, they just keep counting from where it left off previously. In addition to that, if I dlclose the shared library then I am still able to call lib.fizzbuzz() and it just keeps counting as if the library had not been closed.

jtarchie commented 3 years ago

This might be Windows specific.

nightlark commented 3 years ago

@jtarchie it doesn't appear to be Windows specific, I tried the same test on Linux -- on both platforms, calling dlclose is not actually closing the handle to the shared library written in Go, but does close the handle for the library written in C.

I don't use Ruby, but I think the equivalent for Ruby of what I tried in Python would be:

require 'fiddle'

libfb = Fiddle.dlopen("libfizzbuzz.so")
fb    = Fiddle::Function.new(
  libfb['fizzbuzz'],
  [],
  Fiddle::TYPE_VOIDP
)

puts fb.call
puts fb.call
puts fb.call
puts fb.call
puts fb.call

libfb.close

puts fb.call

What I did in Python was:

from ctypes import *
import _ctypes
lib = CDLL("./libfizzbuzz.so")
lib.fizzbuzz.restype = c_char_p
print(lib.fizzbuzz())
print(lib.fizzbuzz())
print(lib.fizzbuzz())
print(lib.fizzbuzz())
_ctypes.dlclose(lib._handle)

print(lib.fizzbuzz())

lib = CDLL("./libfizzbuzz.so")
lib.fizzbuzz.restype = c_char_p
print(lib.fizzbuzz())

And the result was:

1
2
Fizz
4
Buzz
Fizz

The C version segfaults as expected after calling dlclose.

tmm1 commented 2 years ago

Looks like dlclose is currently ignored:

https://github.com/golang/go/blob/70b1a45425a5e456b4e347e96fc94f94b04ce121/src/cmd/link/internal/ld/lib.go#L1346-L1348

It sounds like from this thread, an approximate solution would be to:

Does that sound right?

ianlancetaylor commented 2 years ago

@tmm1 Not really. Consider https://github.com/golang/go/issues/11100#issuecomment-488364955. Consider what should happen for a goroutine that is currently sitting in C code; if we remove the Go shared library then when the C code returns it will crash or (if some other shared library has been opened) behave unpredictably. Note that there is no separate GC thread; in the current Go runtime GC is handled by ordinary goroutines.

tmm1 commented 2 years ago

I understand there are a lot of complexities and edge cases.

In my case I am working in a plugin environment where I have a lot of control and can ensure that no goroutines will be busy calling into C code or other shared libraries. I have the ability to make sure all my code is done before calling dlclose, so I'm wondering what's required/possible in that more limited scenario.

I suppose parts of the runtime internally could be calling into libc even if my user code is not. What I observe currently is that many of the OS threads related to the runtime start busy looping after I dlclose. I'll try to make a list of them and kill them to see if that helps.

tmm1 commented 2 years ago

After looking more into this today, it seems there is no way to kill a single thread. Even if you use pthread_kill (i.e., tgkill), the signal will be delivered to that particular thread, but the process as a whole will be affected by it if it's a stop/terminate/kill signal.

What I observe currently is that many of the OS threads related to the runtime start busy looping after I dlclose.

I observed this behavior on macOS, where ps -M showed several threads at 100% CPU after dlclose. Using sample on the pid shows unknown backtraces, and lsof -nPp confirmed that the golang dylib was no longer mapped, which left those runtime threads very confused.

It turns out -Wl,-z,nodelete does not work on macOS.

I found instead that using dlopen with RTLD_NODELETE has the same effect, and it fixed the CPU usage issues I was having.
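
From Python, RTLD_NODELETE can be passed through the mode argument of ctypes.CDLL. This is only a minimal sketch, assuming a POSIX platform where the os module exposes RTLD_NODELETE; the helper names pinned_dlopen_mode and load_pinned are hypothetical.

```python
import ctypes
import os

def pinned_dlopen_mode():
    """Return dlopen flags asking the loader to keep the library resident."""
    # os.RTLD_NODELETE only exists where the platform defines it. Falling
    # back to 0 is an assumption: no pinning happens in that case.
    nodelete = getattr(os, "RTLD_NODELETE", 0)
    # ctypes.RTLD_LOCAL is defined on every platform ctypes supports.
    return ctypes.RTLD_LOCAL | nodelete

def load_pinned(path):
    """Load a shared library so that a later dlclose leaves it mapped."""
    return ctypes.CDLL(path, mode=pinned_dlopen_mode())

# Example (assumes libfizzbuzz.so from this issue is in the current directory):
#   lib = load_pinned("./libfizzbuzz.so")
```

With the library loaded this way, a later dlclose on its handle should leave the Go runtime's pages mapped, so its threads keep running valid code.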

On Windows, a similar GetModuleHandleEx(GET_MODULE_HANDLE_EX_FLAG_PIN, ...) is required for the same effect.

My solution for now is to simply allow old copies of the runtime and my code to stay resident in memory. When my plugin updates, I load a new dylib/so with a fresh copy of the runtime and my code. These copies of the golang runtime seem to be happy residing side by side. The old runtimes are basically doing nothing anyway.

Note for anyone else attempting something similar: on macOS, be sure to pass RTLD_LOCAL (which is the default on Linux but not macOS). Without this flag, the handle returned by dlopen will be stale after renaming a dylib and moving a new one in its place. Similarly on Windows, LoadLibraryEx will return an old handle for a given full path to a dll, even if the dll has changed on disk. You must use a unique filename per version of the dll to be able to load multiple copies at the same time.
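
One way to get a unique filename per version is to copy each build to a name derived from its contents before loading it. A minimal Python sketch follows; versioned_copy and its naming scheme are illustrative assumptions, not something taken from this thread.

```python
import hashlib
import shutil
from pathlib import Path

def versioned_copy(lib_path, cache_dir):
    """Copy lib_path into cache_dir under a name derived from its contents.

    Two builds with different bytes get different filenames, so the loader
    treats them as distinct libraries.
    """
    src = Path(lib_path)
    digest = hashlib.sha256(src.read_bytes()).hexdigest()[:16]
    dest = Path(cache_dir) / f"{src.stem}-{digest}{src.suffix}"
    if not dest.exists():
        Path(cache_dir).mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dest)
    return dest
```

The returned path can then be handed to dlopen or LoadLibraryEx (or ctypes.CDLL), so each build of the library is loaded from its own file.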

iuridiniz commented 2 years ago

This issue is about making modules written in Go unloadable, for applications that support loadable modules.

During development it's good to have this, and in production it could be useful if someone requires hot upgrades. But since the Go runtime (its threads) does not shut down on dlclose, we don't have this.

I'm writing a hello-world example module for FreeSWITCH and ran into this issue. The only thing I can do for now (golang <1.17) is to mark the module as one that cannot be unloaded.

Module: https://github.com/iuridiniz/freeswitch_module_golang_sample

danieldonoghue commented 1 year ago

@tmm1 Not really. Consider #11100 (comment). Consider what should happen for a goroutine that is currently sitting in C code; if we remove the Go shared library then when the C code returns it will crash or (if some other shared library has been opened) behave unpredictably. Note that there is no separate GC thread; in the current Go runtime GC is handled by ordinary goroutines.

@ianlancetaylor Surely that's a problem for the C developer? It would be incumbent on them to stop the goroutine before closing the library; failing to do so would leave their application in an indeterminate state, just as it would with any other resource they tried to use after closing it.