golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License
123.98k stars 17.67k forks source link

runtime: Windows service lifecycle events behave incorrectly when called within a golang environment #40167

Closed JohanKnutzen closed 4 years ago

JohanKnutzen commented 4 years ago

What version of Go are you using (go version)?

$ go version
go version go1.14.3 windows/amd64

Does this issue reproduce with the latest release?

Yes.

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
set GO111MODULE=
set GOARCH=amd64
set GOBIN=
set GOCACHE=C:\Users\knutz\AppData\Local\go-build
set GOENV=C:\Users\knutz\AppData\Roaming\go\env
set GOEXE=.exe
set GOFLAGS=
set GOHOSTARCH=amd64
set GOHOSTOS=windows
set GOINSECURE=
set GONOPROXY=
set GONOSUMDB=
set GOOS=windows
set GOPATH=C:\Users\knutz\go
set GOPRIVATE=
set GOPROXY=https://proxy.golang.org,direct
set GOROOT=c:\go
set GOSUMDB=sum.golang.org
set GOTMPDIR=
set GOTOOLDIR=c:\go\pkg\tool\windows_amd64
set GCCGO=gccgo
set AR=ar
set CC=gcc
set CXX=g++
set CGO_ENABLED=1
set GOMOD=C:\tmp\test2\svcshutdownbug\go.mod
set CGO_CFLAGS=-g -O2
set CGO_CPPFLAGS=
set CGO_CXXFLAGS=-g -O2
set CGO_FFLAGS=-g -O2
set CGO_LDFLAGS=-g -O2
set PKG_CONFIG=pkg-config
set GOGCCFLAGS=-m64 -mthreads -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=C:\Users\knutz\AppData\Local\Temp\go-build509631438=/tmp/go-build -gno-record-gcc-switches

What did you do?

Upon further narrowing of a minimal example that reproduces 40157, it seems that the golang runtime is preventing emitting of SERVICE_CONTROL_SHUTDOWN events.

I've written a program in c as well as in golang that logs this event. The order of syscalls are identical.

Source code

main.go ```golang package main import ( "fmt" "os" "syscall" "unsafe" ) const () const ( SERVICE_WIN32_OWN_PROCESS = uint32(0x00000010) SERVICE_CONTROL_STOP = uint32(0x00000001) SERVICE_CONTROL_SHUTDOWN = uint32(0x00000005) SERVICE_ACCEPT_STOP = uint32(0x00000001) SERVICE_ACCEPT_SHUTDOWN = uint32(0x00000004) SERVICE_STOPPED = uint32(0x00000001) SERVICE_START_PENDING = uint32(0x00000002) SERVICE_STOP_PENDING = uint32(0x00000003) SERVICE_RUNNING = uint32(0x00000004) ) type SERVICE_TABLE_ENTRY struct { name *uint16 proc uintptr } type SERVICE_STATUS struct { dwServiceType uint32 dwCurrentState uint32 dwControlsAccepted uint32 dwWin32ExitCode uint32 dwServiceSpecificExitCode uint32 dwCheckPoint uint32 dwWaitHint uint32 } var ( advapi32 = syscall.NewLazyDLL("Advapi32.dll") procStartServiceCtrlDispatcher = advapi32.NewProc("StartServiceCtrlDispatcherW") procRegisterServiceCtrlHandlerEx = advapi32.NewProc("RegisterServiceCtrlHandlerExW") procSetServiceStatus = advapi32.NewProc("SetServiceStatus") ) const svcName = "svctest" var stopped = make(chan bool) func log(text string) { filename := os.TempDir() + "\\" + "svctest.log" f, err := os.OpenFile(filename, os.O_APPEND|os.O_CREATE|os.O_WRONLY, 0600) if err != nil { panic(err) } defer f.Close() if _, err = f.WriteString(text); err != nil { panic(err) } } func ctrlHandlerEx(ctrlCode uint32, eventType uint32, eventData uintptr, context uintptr) uintptr { switch ctrlCode { case SERVICE_CONTROL_STOP: log(fmt.Sprintf("SERVICE_CONTROL_STOP\n")) stopped <- true case SERVICE_CONTROL_SHUTDOWN: log(fmt.Sprintf("SERVICE_CONTROL_SHUTDOWN\n")) stopped <- true } return 0 } func serviceMain(argc uint32, argv **uint16) uintptr { statusHandle, _, err := procRegisterServiceCtrlHandlerEx.Call(uintptr(unsafe.Pointer(syscall.StringToUTF16Ptr(svcName))), syscall.NewCallback(ctrlHandlerEx), uintptr(0)) if 0 == statusHandle { log(fmt.Sprintf("Error calling StartServiceCtrlDispatcherW: %v\n", err)) return 0 } status := SERVICE_STATUS{ dwServiceType: SERVICE_WIN32_OWN_PROCESS, dwControlsAccepted: (SERVICE_ACCEPT_STOP | SERVICE_ACCEPT_SHUTDOWN), dwCurrentState: SERVICE_START_PENDING, } ret, _, err := procSetServiceStatus.Call(statusHandle, uintptr(unsafe.Pointer(&status))) if ret == 0 { log(fmt.Sprintf("Error calling SetServiceStatus: %v\n", err)) } status.dwCurrentState = SERVICE_RUNNING ret, _, err = procSetServiceStatus.Call(statusHandle, uintptr(unsafe.Pointer(&status))) if ret == 0 { log(fmt.Sprintf("Error calling SetServiceStatus: %v\n", err)) } <-stopped status.dwCurrentState = SERVICE_STOPPED ret, _, err = procSetServiceStatus.Call(statusHandle, uintptr(unsafe.Pointer(&status))) if ret == 0 { log(fmt.Sprintf("Error calling SetServiceStatus: %v\n", err)) } return 0 } func main() { t := []SERVICE_TABLE_ENTRY{ { name: syscall.StringToUTF16Ptr(svcName), proc: syscall.NewCallback(serviceMain), }, {name: nil, proc: 0}, } ret, _, err := procStartServiceCtrlDispatcher.Call(uintptr(unsafe.Pointer(&t[0]))) if ret == 0 { log(fmt.Sprintf("Error calling StartServiceCtrlDispatcherW: %v\n", err)) } } ```
main.c ```golang #include #include #include SERVICE_STATUS g_ServiceStatus = { 0 }; SERVICE_STATUS_HANDLE g_StatusHandle = NULL; HANDLE g_ServiceStopEvent = INVALID_HANDLE_VALUE; VOID WINAPI ServiceMain(DWORD argc, LPTSTR *argv); DWORD WINAPI ServiceWorkerThread(LPVOID lpParam); DWORD WINAPI ctrlHandlerEx(DWORD CtrlCode, DWORD eventType, LPVOID eventData, LPVOID context); #define SERVICE_NAME _T("svctest") VOID WINAPI ServiceMain(DWORD argc, LPTSTR *argv) { g_StatusHandle = RegisterServiceCtrlHandlerEx(SERVICE_NAME, ctrlHandlerEx, NULL); if (g_StatusHandle == NULL) { goto EXIT; } // Tell the service controller we are starting ZeroMemory( & g_ServiceStatus, sizeof(g_ServiceStatus)); g_ServiceStatus.dwServiceType = SERVICE_WIN32_OWN_PROCESS; g_ServiceStatus.dwControlsAccepted = SERVICE_ACCEPT_STOP | SERVICE_ACCEPT_POWEREVENT | SERVICE_ACCEPT_SESSIONCHANGE | SERVICE_ACCEPT_SHUTDOWN; g_ServiceStatus.dwCurrentState = SERVICE_START_PENDING; g_ServiceStatus.dwWin32ExitCode = 0; g_ServiceStatus.dwServiceSpecificExitCode = 0; g_ServiceStatus.dwCheckPoint = 0; if (SetServiceStatus(g_StatusHandle, &g_ServiceStatus) == FALSE) { //... } g_ServiceStopEvent = CreateEvent(NULL, TRUE, FALSE, NULL); if (g_ServiceStopEvent == NULL) { //... goto EXIT; } g_ServiceStatus.dwCurrentState = SERVICE_RUNNING; g_ServiceStatus.dwWin32ExitCode = 0; g_ServiceStatus.dwCheckPoint = 0; if (SetServiceStatus(g_StatusHandle, &g_ServiceStatus) == FALSE) { //... } HANDLE hThread = CreateThread(NULL, 0, ServiceWorkerThread, NULL, 0, NULL); WaitForSingleObject(hThread, INFINITE); CloseHandle(g_ServiceStopEvent); g_ServiceStatus.dwCurrentState = SERVICE_STOPPED; g_ServiceStatus.dwWin32ExitCode = 0; g_ServiceStatus.dwCheckPoint = 3; if (SetServiceStatus(g_StatusHandle, &g_ServiceStatus) == FALSE) { } EXIT: return; } void log(const char* str) { FILE *file = fopen("C:\\Windows\\Temp\\svctest.log", "a+"); if(0 == file) { return; } fprintf(file, str); fclose(file); } DWORD WINAPI ctrlHandlerEx(DWORD CtrlCode, DWORD eventType, LPVOID eventData, LPVOID context) { switch (CtrlCode) { case SERVICE_CONTROL_STOP: log("SERVICE_CONTROL_STOP\n\r"); SetEvent(g_ServiceStopEvent); break; case SERVICE_CONTROL_SHUTDOWN: log("SERVICE_CONTROL_SHUTDOWN\n\r"); SetEvent(g_ServiceStopEvent); break; default: break; } return 0; } DWORD WINAPI ServiceWorkerThread(LPVOID lpParam) { while (WaitForSingleObject(g_ServiceStopEvent, 0) != WAIT_OBJECT_0) { Sleep(3000); } return 0; } DWORD RunService() { SERVICE_TABLE_ENTRY serviceTable[] = { { SERVICE_NAME, ServiceMain }, { 0, 0 } }; if (StartServiceCtrlDispatcher(serviceTable)) { return 0; } else { DWORD erro = GetLastError(); return 1; } } int _tmain(int argc, TCHAR *argv[]) { RunService(); return 0; } ```

Building golang: go build C: cl main.c /link Advapi32.lib.

Running

  1. Run sc create svctest binpath= _path_ in an administrator cmd.exe, path should be an absolute path pointing to either the c executable or the golang executable.
  2. Run sc start svctest to start the service
  3. Run shutdown /s. This will shut down the computer. Do not shutdown using the start-menu.

Cleanup Run sc delete svctest to remove the service entry in Windows

What did you expect to see?

Both executables log SERVICE_CONTROL_STOP and SERVICE_CONTROL_SHUTDOWN to C:\Windows\Temp\svctest.log. The above steps should log a SERVICE_CONTROL_SHUTDOWN during shutdown of the computer.

What did you see instead?

Only the C program appropriately logs SERVICE_CONTROL_SHUTDOWN. You can test that both programs log SERVICE_CONTROL_STOP by right clicking on the service in the Task Manager and selecting Restart.

JohanKnutzen commented 4 years ago

@zx2c4 Do you have any ideas? Although unrelated, I remember that you had some insight regarding 31685

@alexbrainman I remember that you also had insight into the runtime

alexbrainman commented 4 years ago

@yohan1234

Thank you for creating this issue.

Is your issue dup of #40074 ?

Alex

JohanKnutzen commented 4 years ago

@alexbrainman

Thanks for the response.

It's not really a dupe since my repro doesn't use x/sys/windows/svc but makes direct syscalls. I would assume that the fix to this would solve both tho!

Johan

networkimprov commented 4 years ago

cc @aclements

JohanKnutzen commented 4 years ago

@alexbrainman

I've found a workaround.

It seems that the runtime hooks SetConsoleCtrl in windows which leads to the termination of the program when the CTRL_SHUTDOWN_EVENT is received. The runtime does this here:

os_windows.go

By overriding the console ctrl handler and the CTRL_SHUTDOWN_EVENT event, you can disable the golang handler for this particular event. Here is a modifed example that does that:

main.go ```golang package main import ( "fmt" "os" "syscall" "unsafe" ) const () const ( SERVICE_WIN32_OWN_PROCESS = uint32(0x00000010) SERVICE_CONTROL_STOP = uint32(0x00000001) SERVICE_CONTROL_SHUTDOWN = uint32(0x00000005) SERVICE_ACCEPT_STOP = uint32(0x00000001) SERVICE_ACCEPT_SHUTDOWN = uint32(0x00000004) SERVICE_STOPPED = uint32(0x00000001) SERVICE_START_PENDING = uint32(0x00000002) SERVICE_STOP_PENDING = uint32(0x00000003) SERVICE_RUNNING = uint32(0x00000004) ) type SERVICE_TABLE_ENTRY struct { name *uint16 proc uintptr } type SERVICE_STATUS struct { dwServiceType uint32 dwCurrentState uint32 dwControlsAccepted uint32 dwWin32ExitCode uint32 dwServiceSpecificExitCode uint32 dwCheckPoint uint32 dwWaitHint uint32 } var ( advapi32 = syscall.NewLazyDLL("Advapi32.dll") procStartServiceCtrlDispatcher = advapi32.NewProc("StartServiceCtrlDispatcherW") procRegisterServiceCtrlHandlerEx = advapi32.NewProc("RegisterServiceCtrlHandlerExW") procSetServiceStatus = advapi32.NewProc("SetServiceStatus") ) const ( CTRL_SHUTDOWN_EVENT = uint32(6) ) var ( kernel32 = syscall.NewLazyDLL("kernel32.dll") procSetConsoleCtrlHandler = kernel32.NewProc("SetConsoleCtrlHandler") ) func overrideCtrlShutdownEvent() error { n, _, err := procSetConsoleCtrlHandler.Call( syscall.NewCallback(func(controlType uint32) uint { if controlType == CTRL_SHUTDOWN_EVENT { log("CTRL_SHUTDOWN_EVENT\n") return 1 } return 0 }), 1) if n == 0 { return err } return nil } const svcName = "svctest" var stopped = make(chan bool) var ctrlShutdownEvent = make(chan bool) func log(text string) { filename := os.TempDir() + "\\" + "svctest.log" f, err := os.OpenFile(filename, os.O_APPEND|os.O_CREATE|os.O_WRONLY, 0600) if err != nil { panic(err) } defer f.Close() if _, err = f.WriteString(text); err != nil { panic(err) } } func ctrlHandlerEx(ctrlCode uint32, eventType uint32, eventData uintptr, context uintptr) uintptr { switch ctrlCode { case SERVICE_CONTROL_STOP: log(fmt.Sprintf("SERVICE_CONTROL_STOP\n")) stopped <- true case SERVICE_CONTROL_SHUTDOWN: log(fmt.Sprintf("SERVICE_CONTROL_SHUTDOWN\n")) stopped <- true } return 0 } func serviceMain(argc uint32, argv **uint16) uintptr { statusHandle, _, err := procRegisterServiceCtrlHandlerEx.Call(uintptr(unsafe.Pointer(syscall.StringToUTF16Ptr(svcName))), syscall.NewCallback(ctrlHandlerEx), uintptr(0)) if 0 == statusHandle { log(fmt.Sprintf("Error calling StartServiceCtrlDispatcherW: %v\n", err)) return 0 } status := SERVICE_STATUS{ dwServiceType: SERVICE_WIN32_OWN_PROCESS, dwControlsAccepted: (SERVICE_ACCEPT_STOP | SERVICE_ACCEPT_SHUTDOWN), dwCurrentState: SERVICE_START_PENDING, } ret, _, err := procSetServiceStatus.Call(statusHandle, uintptr(unsafe.Pointer(&status))) if ret == 0 { log(fmt.Sprintf("Error calling SetServiceStatus: %v\n", err)) } status.dwCurrentState = SERVICE_RUNNING ret, _, err = procSetServiceStatus.Call(statusHandle, uintptr(unsafe.Pointer(&status))) if ret == 0 { log(fmt.Sprintf("Error calling SetServiceStatus: %v\n", err)) } overrideCtrlShutdownEvent() <-stopped status.dwCurrentState = SERVICE_STOPPED ret, _, err = procSetServiceStatus.Call(statusHandle, uintptr(unsafe.Pointer(&status))) if ret == 0 { log(fmt.Sprintf("Error calling SetServiceStatus: %v\n", err)) } return 0 } func main() { t := []SERVICE_TABLE_ENTRY{ { name: syscall.StringToUTF16Ptr(svcName), proc: syscall.NewCallback(serviceMain), }, {name: nil, proc: 0}, } ret, _, err := procStartServiceCtrlDispatcher.Call(uintptr(unsafe.Pointer(&t[0]))) if ret == 0 { log(fmt.Sprintf("Error calling StartServiceCtrlDispatcherW: %v\n", err)) } } ```

Specifically this bit:

const (
    CTRL_SHUTDOWN_EVENT = uint32(6)
)

var (
    kernel32                  = syscall.NewLazyDLL("kernel32.dll")
    procSetConsoleCtrlHandler = kernel32.NewProc("SetConsoleCtrlHandler")
)

func overrideCtrlShutdownEvent() error {
    n, _, err := procSetConsoleCtrlHandler.Call(
        syscall.NewCallback(func(controlType uint32) uint {
            if controlType == CTRL_SHUTDOWN_EVENT {
                log("CTRL_SHUTDOWN_EVENT\n")
                return 1
            }
            return 0
        }),
        1)

    if n == 0 {
        return err
    }
    return nil
}

If you run the above code using the instructions in the original post, the output should be the following: CTRL_SHUTDOWN_EVENT SERVICE_CONTROL_SHUTDOWN

I would suggest that the golang runtime is modified to not implicitly call SetConsoleCtrlHandler. This is because the documented "default handling" of these kinds of events in windows is unique depending on the application. You have a different behavior if you've supplied a ServiceMain, use gdi32/user32, running different versions of windows and of course if you're a console program at all. Using this function should unfortunately probably be an explicit action when writing go programs for Windows.

SetConsoleCtrlHandler

zx2c4 commented 4 years ago

Or the service code can just call SetConsoleCtrlHandler(NULL, FALSE), to reset to the default handler:

image

gopherbot commented 4 years ago

Change https://golang.org/cl/242378 mentions this issue: windows/svc: restore default console shutdown notifier when starting service

zx2c4 commented 4 years ago

I just posted a patch to maybe fix this, but I haven't tested.

@yohan1234 - can you test and let me know whether or not this works?

JohanKnutzen commented 4 years ago

@zx2c4 - I've tried that because I thought that the docs were incorrect on msdn. Unfortunately, it only affects "Ctrl+C input" and not CTRL_SHUTDOWN_EVENT.

Here's a some modified code that tests that:

main.go ```golang package main import ( "fmt" "os" "syscall" "unsafe" ) const () const ( SERVICE_WIN32_OWN_PROCESS = uint32(0x00000010) SERVICE_CONTROL_STOP = uint32(0x00000001) SERVICE_CONTROL_SHUTDOWN = uint32(0x00000005) SERVICE_ACCEPT_STOP = uint32(0x00000001) SERVICE_ACCEPT_SHUTDOWN = uint32(0x00000004) SERVICE_STOPPED = uint32(0x00000001) SERVICE_START_PENDING = uint32(0x00000002) SERVICE_STOP_PENDING = uint32(0x00000003) SERVICE_RUNNING = uint32(0x00000004) ) type SERVICE_TABLE_ENTRY struct { name *uint16 proc uintptr } type SERVICE_STATUS struct { dwServiceType uint32 dwCurrentState uint32 dwControlsAccepted uint32 dwWin32ExitCode uint32 dwServiceSpecificExitCode uint32 dwCheckPoint uint32 dwWaitHint uint32 } var ( advapi32 = syscall.NewLazyDLL("Advapi32.dll") procStartServiceCtrlDispatcher = advapi32.NewProc("StartServiceCtrlDispatcherW") procRegisterServiceCtrlHandlerEx = advapi32.NewProc("RegisterServiceCtrlHandlerExW") procSetServiceStatus = advapi32.NewProc("SetServiceStatus") ) const ( CTRL_SHUTDOWN_EVENT = uint32(6) ) var ( kernel32 = syscall.NewLazyDLL("kernel32.dll") procSetConsoleCtrlHandler = kernel32.NewProc("SetConsoleCtrlHandler") ) func overrideCtrlShutdownEvent() error { n, _, err := procSetConsoleCtrlHandler.Call( 0, /* NULL */ 0 /* FALSE */) if n == 0 { return err } return nil } const svcName = "svctest" var stopped = make(chan bool) var ctrlShutdownEvent = make(chan bool) func log(text string) { filename := os.TempDir() + "\\" + "svctest.log" f, err := os.OpenFile(filename, os.O_APPEND|os.O_CREATE|os.O_WRONLY, 0600) if err != nil { panic(err) } defer f.Close() if _, err = f.WriteString(text); err != nil { panic(err) } } func ctrlHandlerEx(ctrlCode uint32, eventType uint32, eventData uintptr, context uintptr) uintptr { switch ctrlCode { case SERVICE_CONTROL_STOP: log(fmt.Sprintf("SERVICE_CONTROL_STOP\n")) stopped <- true case SERVICE_CONTROL_SHUTDOWN: log(fmt.Sprintf("SERVICE_CONTROL_SHUTDOWN\n")) stopped <- true } return 0 } func serviceMain(argc uint32, argv **uint16) uintptr { statusHandle, _, err := procRegisterServiceCtrlHandlerEx.Call(uintptr(unsafe.Pointer(syscall.StringToUTF16Ptr(svcName))), syscall.NewCallback(ctrlHandlerEx), uintptr(0)) if 0 == statusHandle { log(fmt.Sprintf("Error calling StartServiceCtrlDispatcherW: %v\n", err)) return 0 } status := SERVICE_STATUS{ dwServiceType: SERVICE_WIN32_OWN_PROCESS, dwControlsAccepted: (SERVICE_ACCEPT_STOP | SERVICE_ACCEPT_SHUTDOWN), dwCurrentState: SERVICE_START_PENDING, } ret, _, err := procSetServiceStatus.Call(statusHandle, uintptr(unsafe.Pointer(&status))) if ret == 0 { log(fmt.Sprintf("Error calling SetServiceStatus: %v\n", err)) } status.dwCurrentState = SERVICE_RUNNING ret, _, err = procSetServiceStatus.Call(statusHandle, uintptr(unsafe.Pointer(&status))) if ret == 0 { log(fmt.Sprintf("Error calling SetServiceStatus: %v\n", err)) } overrideCtrlShutdownEvent() <-stopped status.dwCurrentState = SERVICE_STOPPED ret, _, err = procSetServiceStatus.Call(statusHandle, uintptr(unsafe.Pointer(&status))) if ret == 0 { log(fmt.Sprintf("Error calling SetServiceStatus: %v\n", err)) } return 0 } func main() { t := []SERVICE_TABLE_ENTRY{ { name: syscall.StringToUTF16Ptr(svcName), proc: syscall.NewCallback(serviceMain), }, {name: nil, proc: 0}, } ret, _, err := procStartServiceCtrlDispatcher.Call(uintptr(unsafe.Pointer(&t[0]))) if ret == 0 { log(fmt.Sprintf("Error calling StartServiceCtrlDispatcherW: %v\n", err)) } } ```

It fails to log any SERVICE_CONTROL_SHUTDOWN

The only way to properly nuke a handler is to call SetConsoleCtrlHandler with the original pointer used as the first argument and then FALSE as the second argument.

I don't really know what a good solution is here. You could save the original callback pointer and then have it accessible to allow some users to remove it. This would include people that use user32/gdi or writes user services. It seems unreasonable to expect people to know this though. I would suggest that the runtime avoids touching SetConsoleCtrlHandler at all. There's no general way of mapping Windows voodoo to unix signals.

zx2c4 commented 4 years ago

I would suggest that the runtime avoids touching SetConsoleCtrlHandler at all. There's no general way of mapping Windows voodoo to unix signals.

I would have preferred that too, but perhaps that boat has already sailed, since we'd break API if we removed it.

However maybe there's a way to only call SetConsoleCtrlHandler lazily when an API that references it is used. I'll look into that. If not, then I'll see if I can play some games to recover that pointer.

zx2c4 commented 4 years ago

However maybe there's a way to only call SetConsoleCtrlHandler lazily when an API that references it is used.

Nevermind, it's always used, as sigsend is generally useful within the runtime. However, there's this tidbit:


        if sigsend(s) {
                return 1
        }
        if !islibrary && !isarchive {
                // Only exit the program if we don't have a DLL.
                // See https://golang.org/issues/35965.
                exit(2) // SIGINT, SIGTERM, etc
        }
        return 0

Is your code hitting that exit(2) there? Or the return 1? If the exit(2), maybe we just need to add an additional condition.

alexbrainman commented 4 years ago

I was thinking about adjusting runtime itself not to call SetConsoleCtrlHandler, if the process is running as service. What do you think?

Alex

alexbrainman commented 4 years ago

This broke when we added CTRL_CLOSE_EVENT, CTRL_LOGOFF_EVENT, CTRL_SHUTDOWN_EVENT to Go SetConsoleCtrlHandler handler. But maybe we should not be calling SetConsoleCtrlHandler from the service at all?

Alex

zx2c4 commented 4 years ago

I was thinking about adjusting runtime itself not to call SetConsoleCtrlHandler, if the process is running as service. What do you think?

Alex

That's what I initially tried doing. But this will break other things still where the signal handling is actually useful (crashes). I have another solution I'm working on that avoids bailing when in a service. I'm writing the code now to detect if we're running as a service. Tricky business.

alexbrainman commented 4 years ago

I'm writing the code now to detect if we're running as a service.

Are you looking for golang.org/x/sys/windows/svc.IsAnInteractiveSession ?

Alex

zx2c4 commented 4 years ago

Are you looking for golang.org/x/sys/windows/svc.IsAnInteractiveSession ?

Nope, not what we want, I don't think. The solution I'm writing is uglier, unfortunately. We'll see if it works. Halfway done now.

gopherbot commented 4 years ago

Change https://golang.org/cl/242280 mentions this issue: runtime: detect services in signal handler

zx2c4 commented 4 years ago

@yohan1234 Alright, try that latest CL. I haven't tested it for your case yet, but maybe it's correct.

JohanKnutzen commented 4 years ago

Thanks a lot for the help.

Is your code hitting that exit(2) there? Or the return 1? If the exit(2), maybe we just need to add an additional condition.

I believe that it is hitting the exit(2). I don't understand why the exit(2) exists though. That's not the proper way of using that function. If the handler does not handle the signal it should return 0. Even if you wish the program to terminate, it's preferred to exit gracefully via the entry point. Windows takes care of killing of the process automatically if required. This happens after a grace period of 10 seconds after the function returns. This is dependent upon the event.

To test your patch, do I need to rebuild go or can I just pull in a local runtime package with that patch somehow? Sorry I'm not sure what the steps are.

Also, if anyone is using gdi/user32 with hwnds, I think that the program still crashes needlessly rather than unwinding via hwnds. You would have to write a similar snippet to ignore events in case hwnds have been created. Just something to consider for the future.

I would say that lazily loading it when the signals package is pulled in would be a nice solution. Pulling in the signals package would impose some kind of lifecycle management anyways.

networkimprov commented 4 years ago

You can patch your local install in .../go/src/runtime/..., and simply build your app. The runtime and stdlib get linked into the executable.

Create a file with the patch in your installed .../go, and run git apply <patch>.

EDIT: you can download the patch file from the CL link above.

zx2c4 commented 4 years ago

To test your patch, do I need to rebuild go or can I just pull in a local runtime package with that patch somehow? Sorry I'm not sure what the steps are.

You have to rebuild, but I can build you a .zip or .tar or something if you tell me what GOOS/GOARCH you want. But you're probably better off doing this yourself. Clone the repo and run ./make.bash or make.bat.

Also, if anyone is using gdi/user32 with hwnds, I think that the program still crashes needlessly rather than unwinding via hwnds. You would have to write a similar snippet to ignore events in case hwnds have been created. Just something to consider for the future.

I'm not sure I grok what you mean here. I maintain a large gdi/user32 codebase and haven't encountered problems. What weird behavior do you suspect I should be seeing?

networkimprov commented 4 years ago

@zx2c4 the CL immediately above is in .../go/src/runtime/. That doesn't require a rebuild of Go.

@yohan1234, see my instructions above.

zx2c4 commented 4 years ago

@zx2c4 the CL immediately above is in .../go/src/runtime/. That doesn't require a rebuild of Go.

True.

JohanKnutzen commented 4 years ago

Thanks for the info @networkimprov Thanks for the patch @zx2c4

The patch wasn't compatible with 1.14.5 or 1.14.4 so I rebuilt golang with the patch applied from the golang repo using make.bat.

Unfortunately it did not work. Either isWindowsService() segfaults, returns the wrong answer or the program crashes somewhere else. I rebuilt the main.go program in my original post which logs the service event if it works.

I'm not sure I grok what you mean here. I maintain a large gdi/user32 codebase and haven't encountered problems. What weird behavior do you suspect I should be seeing?

As stated in the docs:

_If a console application loads the gdi32.dll or user32.dll library, the HandlerRoutine function that you specify when you call SetConsoleCtrlHandler does not get called for the CTRL_LOGOFF_EVENT and CTRL_SHUTDOWNEVENT events. The operating system recognizes processes that load gdi32.dll or user32.dll as Windows applications rather than console applications.

If you depend on the signals package to process SIGTERM via SetConsoleCtrlHandler, that's just gonna stop working if you load gdi32/user32

If you don't depend on those events, you wouldn't have any problems. Just saying that it's probably not possible to walk this line and implement SIGTERM on windows via the signals package. It's just going to function properly if you write console programs.

alexbrainman commented 4 years ago

Unfortunately it did not work.

I am not convinced your example program does not have bugs. For example, you should only call SetServiceStatus from your main service thread. You Go program is not doing that. Perhaps there are other issues with your program.

Alex

JohanKnutzen commented 4 years ago

@alexbrainman No I don't think so. The docs state otherwise: image

I've got a separate implementation using windows/svc that has the same problem: 40157

JohanKnutzen commented 4 years ago

@zx2c4 I can't rule out that I made mistakes when building go. I did this:

  1. cloned master
  2. applied your patch
  3. set CGO_ENABLED=0
  4. make.bat
  5. bin/go build main.go

I'm concerned that by disabling CGO I messed things up.

My arch is Windows 10 x64. If you send it to me via a link with a properly built golang repo, I can test building the example with it.

zx2c4 commented 4 years ago

@yohan1234 Sorry, there was a bug in the patchset I posted before. The updated one fixes the problem:

jason a. donenfeld@WINDOWSVM C:\Users\Jason A. Donenfeld>sc start svctest                

SERVICE_NAME: svctest
        TYPE               : 10  WIN32_OWN_PROCESS
        STATE              : 4  RUNNING
                                (STOPPABLE, NOT_PAUSABLE, ACCEPTS_SHUTDOWN)
        WIN32_EXIT_CODE    : 0  (0x0)
        SERVICE_EXIT_CODE  : 0  (0x0)
        CHECKPOINT         : 0x0
        WAIT_HINT          : 0x0
        PID                : 9796
        FLAGS              :

jason a. donenfeld@WINDOWSVM C:\Users\Jason A. Donenfeld>shutdown /s

[...]

jason a. donenfeld@WINDOWSVM C:\Users\Jason A. Donenfeld>type "\windows\Temp\svctest.log"
SERVICE_CONTROL_SHUTDOWN

This is using your exact code above. Therefore, I think https://go-review.googlesource.com/c/go/+/242280 solves the issue successfully.

JohanKnutzen commented 4 years ago

@zx2c4 Excellent! That solves my use case. Thank you!

JohanKnutzen commented 4 years ago

@zx2c4 - FYI regarding usage of user32/gdi32 within a golang exe: In another application I'm loading a dll that in turn loads user32 in order to use OpenClipboard(NULL). This application has no hwnds. If that golang application was dependent on the signals package for SIGTERM, it would never receive it upon shutdown. In order to fix this, golang would have to create a hidden window and wait for WM_QUERYENDSESSION/WM_ENDSESSION. I am currently implementing this for my application.

Here's the msdn bit regarding that: image

alexbrainman commented 4 years ago

@alexbrainman No I don't think so. The docs state otherwise

@yohan1234 you are correct. I am wrong.

Alex

gopherbot commented 4 years ago

Change https://golang.org/cl/243597 mentions this issue: runtime: do not explicitly exit on ctrl handler

zx2c4 commented 4 years ago

There are now two approaches to consider:

The first one is validated to be working. The second one I haven't looked into yet.

JohanKnutzen commented 4 years ago

To me, the second one makes more sense. I don't see the reason for exiting like that. It's better to under-specify behavior when dealing with Windows.

zx2c4 commented 4 years ago

To me, the second one makes more sense. I don't see the reason for exiting like that. It's better to under-specify behavior when dealing with Windows.

Right, in theory I agree with you. In practice I'm a bit more hesitant to remove the exit(2) because I don't have documentation from the original author detailing considerations, and I haven't worked through enough edge cases to reunderstand the entire problem space myself. But indeed on inspection it looks like the second patch is much much better.

What I suspect we should do is merge patch (1) now, for 1.15 and 1.14, because it fixes the bug. Then, we can merge (2) at the very beginning of the 1.16 cycle and see if it breaks anything.

@alexbrainman What do you think of that plan?

networkimprov commented 4 years ago

I don't think the runtime folks can accept a patch on the eve of a release, and since there's a workaround, this probably isn't a backport candidate.

alexbrainman commented 4 years ago

Right, in theory I agree with you. In practice I'm a bit more hesitant to remove the exit(2) because I don't have documentation from the original author detailing considerations, and I haven't worked through enough edge cases to reunderstand the entire problem space myself. But indeed on inspection it looks like the second patch is much much better.

I agree. I prefer to remove exit(2) too.

What I suspect we should do is merge patch (1) now, for 1.15 and 1.14, because it fixes the bug. Then, we can merge (2) at the very beginning of the 1.16 cycle and see if it breaks anything.

I agree. This is the safest thing to do. We can revert patch (1) before applying patch (2), so it is nice and clean change.

We just need agreement from someone from Go Team. Maybe @ianlancetaylor or @aclements

Thank you.

Alex

ianlancetaylor commented 4 years ago

I think that https://golang.org/cl/242280 would be OK for 1.15 if this seems important enough. The CL seems safe.

I'm not sure why we would backport this change. As far as I know this case has been broken since the beginning.

networkimprov commented 4 years ago

@ianlancetaylor I don't think we should change the Windows runtime this late. There is a workaround. https://github.com/golang/go/issues/40167#issuecomment-657808287

alexbrainman commented 4 years ago

I think that https://golang.org/cl/242280 would be OK for 1.15 if this seems important enough. The CL seems safe.

Sounds good. Then, once 1.15 is released we will revert it, and apply https://golang.org/cl/243597 instead.

Should we submit CL 242280 now? Or should we wait for 1.15 to be released, and then backport CL 242280 to 1.15 release branch?

This makes shutdown event working again for every single Go service. The service executes exit code in there (like saving data to disk and similar). So none of that code run. I think it is important enough. But others might disagree.

As far as I know this case has been broken since the beginning.

I don't think so. It was working fine in 1.13, See https://github.com/golang/go/issues/40074#issuecomment-659952395 for details.

There is a workaround. #40167 (comment)

I did not investigate it properly. Maybe it works, maybe it does not.

Alex

zx2c4 commented 4 years ago

I think that https://golang.org/cl/242280 would be OK for 1.15 if this seems important enough. The CL seems safe.

Okay, sounds good. Please +2 the CL and then I'll submit it now?

I'm not sure why we would backport this change. As far as I know this case has been broken since the beginning.

Services have been broken since 1.13, actually. My own code has been subtly broken, in fact, relying on some related WTS_SESSION_LOGOFF behavior elsewhere to mostly mask the problem almost, but not always. So this is definitely something we should consider backporting into 1.14.

networkimprov commented 4 years ago

The CL seems safe.

If the barely-tested isWindowsService() yields a false positive, won't that prevent keyboard Ctrl-C from exiting the program? That would cause widespread breakage, including my app.

zx2c4 commented 4 years ago

The CL seems safe.

If the barely-tested isWindowsService() yields a false positive, won't that prevent keyboard Ctrl-C from exiting the program? That would cause widespread breakage, including my app.

If by "barely-tested" you mean "the technique that is used by the .NET framework", then maybe? See https://github.com/dotnet/extensions/blob/f4066026ca06984b07e90e61a6390ac38152ba93/src/Hosting/WindowsServices/src/WindowsServiceHelpers.cs#L26-L31 for details. I doubt this will cause any form of "widespread breakage"; you'll need to substantiate your hyperbole a little bit.

Anyway, see if you can construct a false positive. Have fun injecting into windows sessionId==0. :)

networkimprov commented 4 years ago

The technique may be sound, but isn't this a reimplementation of it?

I guess you answered my Q about potentially preventing exit on Ctrl-C in the affirmative.

zx2c4 commented 4 years ago

The technique may be sound, but isn't this a reimplementation of it?

Okay so your objection is, "if your code has bugs, then it has bugs." I guess in that case one must agree with you.

I guess you answered my Q about potentially preventing exit on Ctrl-C in the affirmative.

I'm not sure I follow what you've concluded or your logic. Maybe you can clarify.

zx2c4 commented 4 years ago

I guess you answered my Q about potentially preventing exit on Ctrl-C in the affirmative.

I'm not sure I follow what you've concluded or your logic. Maybe you can clarify.

Oh, I re-read the whole thread and I understand what you're wondering.

If there's a false positive, this will fall back to the return 0 case, which will then call the default win32 handler, which will wind up exiting the program. So there shouldn't be any effect there. Actually, for 1.16, we're planning on removing the whole clause of special cases (library, archive, service), and always using the default handler which should, in theory, do the right thing. So regression potential here seems pretty low.

Plus: the onus is still on you - show me a real program that can generate a false positive. SessionId==0 might be a bit trixy.

networkimprov commented 4 years ago

That a false positive resorts to the default win32 handler is helpful. However that might alter the exit code.

Re exit(2) what value does the default win32 handler use on exit? I believe Go also uses 2 in panic scenarios; that might be the rationale.

Erm, typically the onus is on the patch author to provide tests in the CL that exercise the new code :-)

zx2c4 commented 4 years ago

Erm, typically the onus is on the patch author to provide tests in the CL that exercise the new code :-)

We're talking about false positives, not about tests. At least that's the concern you brought up. In that case, the "test" for false positives would be to exhaust every possible variety of process states available on NT, including undocumented ones, and paths and conditions going to that and... tackling something very likely turing complete? Not gonna happen in a go test case. In other words, if you're concerned about the mere possibility of false positives (despite Microsoft deploying the same technique in .NET), explicit test cases are not what you want; those generally help with verifying known cases.

However, this issue has a pretty good test case, from @yohan1234, which we've used to verify that the CL fixes the issue. If you want to contribute additional tests for your use cases, be my guest. If you have a real technical objection, I'm happy to discuss it.

It sounds like you're trying to pull various willy nilly punches in your usual way to block this fix for whatever weird reason. If you think there's a problem in the code, or that sessionId==0 is not a sufficient condition, please pipe up and contribute something technical. Otherwise I'm going to move on and wait to hear from Go code contributors and Go leadership on whether any further changes are required.

networkimprov commented 4 years ago

I'm not opposed to the patch; I didn't say I was. I'm opposed to dropping code into the runtime on the eve of a release. What is the hurry? How does attacking me make this an urgent fix? (And please stop being abusive when you hear disagreement; this is not LKML.)

A previous late-cycle Windows runtime patch produced #35447, and others.

The above is a policy matter, not a technical one. Re the latter, I offered some insight on the reason for exit(2), which you haven't addressed.

networkimprov commented 4 years ago

And re tests, I meant that the CL currently has none. It would be nice to know that Win-service apps work in subsequent releases.

Finding false positives obviously requires trials in a wide variety of Windows configurations, which we can't get for 1.15.