golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License

runtime: using AVX-512 instruction without supporting CPUID flag(s) on MacOS hangs the Go runtime #42649

Open vsivsi opened 4 years ago

vsivsi commented 4 years ago

What version of Go are you using (go version)?

$ go version
go version go1.15.5 darwin/amd64

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

MacOS 10.15.7

go env Output
$ go env

GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/Users/vsi/Library/Caches/go-build"
GOENV="/Users/vsi/Library/Application Support/go/env"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="darwin"
GOINSECURE=""
GOMODCACHE="/Users/vsi/go/pkg/mod"
GONOPROXY="github.com/vsivsi"
GONOSUMDB="github.com/vsivsi"
GOOS="darwin"
GOPATH="/Users/vsi/go"
GOPRIVATE="github.com/vsivsi"
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/local/Cellar/go/1.15.5/libexec"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/local/Cellar/go/1.15.5/libexec/pkg/tool/darwin_amd64"
GCCGO="gccgo"
AR="ar"
CC="clang"
CXX="clang++"
CGO_ENABLED="1"
GOMOD=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/kp/kjdr0ytx5z9djnq4ysl15x0h0000gn/T/go-build367056703=/tmp/go-build -gno-record-gcc-switches -fno-common"

What did you do?

Attempted to use the Intel AVX-512 VPOPCNT family of instructions in Go assembler.

What did you expect to see?

Assembly code using these instructions should run properly on processors supporting them, and should generate a #UD fault (SIGILL) and terminate when invoked on a CPU without support.

What did you see instead?

The Go runtime hangs forever with 100% CPU utilization upon executing a VPOPCNT(B/W/D/Q) instruction on hardware that doesn't support it. Tested on a Mac Pro (2019) with a 2.7 GHz 24-core Intel Xeon W CPU (Xeon W-3265M).

$ sysctl machdep.cpu.leaf7_features
machdep.cpu.leaf7_features: RDWRFSGS TSC_THREAD_OFFSET BMI1 AVX2 FDPEO SMEP BMI2 ERMS INVPCID PQM FPU_CSDS MPX PQE AVX512F AVX512DQ RDSEED ADX SMAP CLFSOPT CLWB IPT AVX512CD AVX512BW AVX512VL PKU AVX512VNNI MDCLEAR IBRS STIBP L1DF ACAPMSR SSBD

Note, this processor does not include AVX512_BITALG or AVX512_VPOPCNTDQ, which are required for VPOPCNT(B/W) and VPOPCNT(D/Q) respectively. For a summary of the VPOPCNT support matrix, see: https://github.com/HJLebbink/asm-dude/wiki/VPOPCNT
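As a rough illustration of the check described above, here is a minimal Go sketch that looks for a feature name in the text printed by sysctl machdep.cpu.leaf7_features. The helper name hasLeaf7Feature is hypothetical, and the exact spelling macOS uses for the two missing flags is an assumption (written here as AVX512BITALG and AVX512VPOPCNTDQ); only the tokens that actually appear in the output above are known.

```go
package main

import (
	"fmt"
	"strings"
)

// hasLeaf7Feature (hypothetical helper) reports whether name appears as a
// whole, space-separated token in a `sysctl machdep.cpu.leaf7_features` line.
func hasLeaf7Feature(sysctlLine, name string) bool {
	// Drop the "machdep.cpu.leaf7_features:" key if it is present.
	if i := strings.Index(sysctlLine, ":"); i >= 0 {
		sysctlLine = sysctlLine[i+1:]
	}
	for _, tok := range strings.Fields(sysctlLine) {
		if tok == name {
			return true
		}
	}
	return false
}

func main() {
	// Feature list copied from the Xeon W-3265M output above.
	line := "machdep.cpu.leaf7_features: RDWRFSGS TSC_THREAD_OFFSET BMI1 AVX2 FDPEO SMEP BMI2 ERMS INVPCID PQM FPU_CSDS MPX PQE AVX512F AVX512DQ RDSEED ADX SMAP CLFSOPT CLWB IPT AVX512CD AVX512BW AVX512VL PKU AVX512VNNI MDCLEAR IBRS STIBP L1DF ACAPMSR SSBD"

	// The two names below are assumed spellings, not confirmed sysctl tokens.
	for _, name := range []string{"AVX512F", "AVX512BITALG", "AVX512VPOPCNTDQ"} {
		fmt.Printf("%s present: %v\n", name, hasLeaf7Feature(line, name))
	}
}
```

Matching whole tokens rather than substrings matters here, because a substring search for AVX512 would succeed on this machine even though the VPOPCNT extensions are absent.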

The Intel processor documentation says that attempting to run such AVX512 instructions when the supporting CPUID feature flags are not set should raise a #UD exception. As expected, directly executing the amd64 UD2 instruction causes the Go runtime to abort with SIGILL: illegal instruction. But when unsupported, these AVX512 instructions instead cause the runtime to hang in what appears to be a tight loop, which does not seem to be consistent or correct behavior.

Here is a dump from a process sample of the hung go runtime process resulting from the repro below.

/usr/bin/sample Output
Sampling process 40325 for 3 seconds with 1 millisecond of run time between samples
Sampling completed, processing symbols...
Analysis of sampling vpopcntw (pid 40325) every 1 millisecond
Process:         vpopcntw [40325]
Path:            /Users/USER/*/vpopcntw
Load Address:    0x1000000
Identifier:      vpopcntw
Version:         ???
Code Type:       X86-64
Parent Process:  zsh [37987]

Date/Time:       2020-11-16 15:50:45.036 -0800
Launch Time:     2020-11-16 15:50:29.269 -0800
OS Version:      Mac OS X 10.15.7 (19H15)
Report Version:  7
Analysis Tool:   /usr/bin/sample

Physical footprint:         1732K
Physical footprint (peak):  1732K
----

Call graph:
    2819 Thread_355890   DispatchQueue_1: com.apple.main-thread  (serial)
    + 2816 ???  (in )  [0xc00009e7d0]
    + ! 2816 main.popcnt  (in vpopcntw) + 0  [0x105c8e0]
    + 3 runtime.main  (in vpopcntw) + 521  [0x102ee69]
    +   3 0x0
    +     2 _sigtramp  (in libsystem_platform.dylib) + 0  [0x7fff685985e0]
    +     1 _sigtramp  (in libsystem_platform.dylib) + 29  [0x7fff685985fd]
    +       1 runtime.sigtramp  (in vpopcntw) + 51  [0x105aeb3]
    +         1 ???  (in )  [0xc000000480]
    +           1 runtime.setg  (in vpopcntw) + 5  [0x1059405]
    2819 Thread_355892
    + 2819 thread_start  (in libsystem_pthread.dylib) + 15  [0x7fff6859fb8b]
    +   2819 runtime.mstart_stub  (in vpopcntw) + 46  [0x105b14e]
    +     2819 runtime.mstart  (in vpopcntw) + 102  [0x10316a6]
    +       2819 runtime.mstart1  (in vpopcntw) + 200  [0x1031788]
    +         2818 runtime.sysmon  (in vpopcntw) + 173  [0x10399ed]
    +         ! 2818 runtime.usleep  (in vpopcntw) + 49  [0x1047dd1]
    +         !   2818 runtime.asmcgocall  (in vpopcntw) + 173  [0x10593ed]
    +         !     2818 runtime.usleep_trampoline  (in vpopcntw) + 11  [0x105b02b]
    +         !       2818 usleep  (in libsystem_c.dylib) + 53  [0x7fff68466de4]
    +         !         2818 nanosleep  (in libsystem_c.dylib) + 196  [0x7fff68466eea]
    +         !           2816 __semwait_signal  (in libsystem_kernel.dylib) + 10  [0x7fff684e3756]
    +         !           1 cerror  (in libsystem_kernel.dylib) + 20  [0x7fff684e2241]
    +         !           : 1 cerror_nocancel  (in libsystem_kernel.dylib) + 0  [0x7fff684e1629]
    +         !           1 cerror  (in libsystem_kernel.dylib) + 0  [0x7fff684e222d]
    +         1 runtime.sysmon  (in vpopcntw) + 433  [0x1039af1]
    +           1 runtime.retake  (in vpopcntw) + 518  [0x103a0e6]
    +             1 runtime.preemptone  (in vpopcntw) + 165  [0x103a2a5]
    +               1 runtime.preemptM  (in vpopcntw) + 135  [0x103f307]
    +                 1 runtime.pthread_kill  (in vpopcntw) + 49  [0x1047b11]
    +                   1 runtime.asmcgocall  (in vpopcntw) + 173  [0x10593ed]
    +                     1 runtime.pthread_kill_trampoline  (in vpopcntw) + 16  [0x105b330]
    +                       1 pthread_kill  (in libsystem_pthread.dylib) + 179  [0x7fff685a3d65]
    2819 Thread_355893
    + 2819 runtime.mcall  (in vpopcntw) + 91  [0x105799b]
    +   2819 runtime.park_m  (in vpopcntw) + 157  [0x103571d]
    +     2819 runtime.schedule  (in vpopcntw) + 110  [0x1034f0e]
    +       2819 runtime.startlockedm  (in vpopcntw) + 133  [0x1033785]
    +         2819 runtime.stopm  (in vpopcntw) + 197  [0x1032d25]
    +           2819 runtime.notesleep  (in vpopcntw) + 231  [0x1009387]
    +             2819 runtime.semasleep  (in vpopcntw) + 141  [0x102972d]
    +               2819 runtime.pthread_cond_wait  (in vpopcntw) + 57  [0x1048459]
    +                 2819 runtime.asmcgocall  (in vpopcntw) + 173  [0x10593ed]
    +                   2819 runtime.pthread_cond_wait_trampoline  (in vpopcntw) + 16  [0x105b2b0]
    +                     2819 _pthread_cond_wait  (in libsystem_pthread.dylib) + 698  [0x7fff685a4425]
    +                       2819 __psynch_cvwait  (in libsystem_kernel.dylib) + 10  [0x7fff684e3882]
    2819 Thread_355894
    + 2819 runtime.mcall  (in vpopcntw) + 91  [0x105799b]
    +   2819 runtime.park_m  (in vpopcntw) + 157  [0x103571d]
    +     2819 runtime.schedule  (in vpopcntw) + 110  [0x1034f0e]
    +       2819 runtime.startlockedm  (in vpopcntw) + 133  [0x1033785]
    +         2819 runtime.stopm  (in vpopcntw) + 197  [0x1032d25]
    +           2819 runtime.notesleep  (in vpopcntw) + 231  [0x1009387]
    +             2819 runtime.semasleep  (in vpopcntw) + 141  [0x102972d]
    +               2819 runtime.pthread_cond_wait  (in vpopcntw) + 57  [0x1048459]
    +                 2819 runtime.asmcgocall  (in vpopcntw) + 173  [0x10593ed]
    +                   2819 runtime.pthread_cond_wait_trampoline  (in vpopcntw) + 16  [0x105b2b0]
    +                     2819 _pthread_cond_wait  (in libsystem_pthread.dylib) + 698  [0x7fff685a4425]
    +                       2819 __psynch_cvwait  (in libsystem_kernel.dylib) + 10  [0x7fff684e3882]
    2819 Thread_355895
      2819 thread_start  (in libsystem_pthread.dylib) + 15  [0x7fff6859fb8b]
        2819 runtime.mstart_stub  (in vpopcntw) + 46  [0x105b14e]
          2819 runtime.mstart  (in vpopcntw) + 102  [0x10316a6]
            2819 runtime.mstart1  (in vpopcntw) + 147  [0x1031753]
              2819 runtime.schedule  (in vpopcntw) + 727  [0x1035177]
                2819 runtime.findrunnable  (in vpopcntw) + 2687  [0x10344ff]
                  2819 runtime.stopm  (in vpopcntw) + 197  [0x1032d25]
                    2819 runtime.notesleep  (in vpopcntw) + 231  [0x1009387]
                      2819 runtime.semasleep  (in vpopcntw) + 141  [0x102972d]
                        2819 runtime.pthread_cond_wait  (in vpopcntw) + 57  [0x1048459]
                          2819 runtime.asmcgocall  (in vpopcntw) + 173  [0x10593ed]
                            2819 runtime.pthread_cond_wait_trampoline  (in vpopcntw) + 16  [0x105b2b0]
                              2819 _pthread_cond_wait  (in libsystem_pthread.dylib) + 698  [0x7fff685a4425]
                                2819 __psynch_cvwait  (in libsystem_kernel.dylib) + 10  [0x7fff684e3882]

Total number in stack (recursive counted multiple, when >=5):
        5       runtime.asmcgocall  (in vpopcntw) + 173  [0x10593ed]

Sort by top of stack, same collapsed (when >= 5):
        __psynch_cvwait  (in libsystem_kernel.dylib)        8457
        __semwait_signal  (in libsystem_kernel.dylib)        2816
        main.popcnt  (in vpopcntw)        2816

Binary Images:
         0x1000000 -          0x10c61ee +vpopcntw (???) /Users/*/vpopcntw
         0xd52f000 -          0xd5c0f47  dyld (750.6) <1D318D60-C9B0-3511-BE9C-82AFD2EF930D> /usr/lib/dyld
    0x7fff2a0cc000 -     0x7fff2a0ccfff  com.apple.Accelerate (1.11 - Accelerate 1.11) <4F9977AE-DBDB-3A16-A536-AC1F9938DCDD> /System/Library/Frameworks/Accelerate.framework/Versions/A/Accelerate
    0x7fff2a0e4000 -     0x7fff2a73afff  com.apple.vImage (8.1 - 524.2.1)  /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vImage.framework/Versions/A/vImage
    0x7fff2a73b000 -     0x7fff2a9a2ff7  libBLAS.dylib (1303.60.1)  /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
    0x7fff2a9a3000 -     0x7fff2ae76fef  libBNNS.dylib (144.100.2) <99C61C48-B14C-3DA6-8C31-6BF72DA0A3A9> /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBNNS.dylib
    0x7fff2ae77000 -     0x7fff2b212fff  libLAPACK.dylib (1303.60.1) <5E3E3867-50C3-3E6A-9A2E-007CE77A4641> /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libLAPACK.dylib
    0x7fff2b213000 -     0x7fff2b228fec  libLinearAlgebra.dylib (1303.60.1) <3D433800-0099-33E0-8C81-15F83247B2C9> /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libLinearAlgebra.dylib
    0x7fff2b229000 -     0x7fff2b22eff3  libQuadrature.dylib (7) <371F36A7-B12F-363E-8955-F24F7C2048F6> /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libQuadrature.dylib
    0x7fff2b22f000 -     0x7fff2b29ffff  libSparse.dylib (103)  /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libSparse.dylib
    0x7fff2b2a0000 -     0x7fff2b2b2fef  libSparseBLAS.dylib (1303.60.1)  /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libSparseBLAS.dylib
    0x7fff2b2b3000 -     0x7fff2b48afd7  libvDSP.dylib (735.140.1)  /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libvDSP.dylib
    0x7fff2b48b000 -     0x7fff2b54dfef  libvMisc.dylib (735.140.1) <3601FDE3-B142-398D-987D-8151A51F0A96> /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libvMisc.dylib
    0x7fff2b54e000 -     0x7fff2b54efff  com.apple.Accelerate.vecLib (3.11 - vecLib 3.11)  /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/vecLib
    0x7fff2ccb4000 -     0x7fff2d043ffa  com.apple.CFNetwork (1128.0.1 - 1128.0.1) <07F9CA9C-B954-3EA0-A710-3122BFF9F057> /System/Library/Frameworks/CFNetwork.framework/Versions/A/CFNetwork
    0x7fff2e445000 -     0x7fff2e8c4feb  com.apple.CoreFoundation (6.9 - 1677.104)  /System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation
    0x7fff2f82d000 -     0x7fff2f82dfff  com.apple.CoreServices (1069.24 - 1069.24)  /System/Library/Frameworks/CoreServices.framework/Versions/A/CoreServices
    0x7fff2f82e000 -     0x7fff2f8b3fff  com.apple.AE (838.1 - 838.1) <2E5FD5AE-8A7F-353F-9BD1-0241F3586181> /System/Library/Frameworks/CoreServices.framework/Versions/A/Frameworks/AE.framework/Versions/A/AE
    0x7fff2f8b4000 -     0x7fff2fb95ff7  com.apple.CoreServices.CarbonCore (1217 - 1217)  /System/Library/Frameworks/CoreServices.framework/Versions/A/Frameworks/CarbonCore.framework/Versions/A/CarbonCore
    0x7fff2fb96000 -     0x7fff2fbe3ffd  com.apple.DictionaryServices (1.2 - 323.6) <26B70C82-25BC-353A-858F-945B14C803A2> /System/Library/Frameworks/CoreServices.framework/Versions/A/Frameworks/DictionaryServices.framework/Versions/A/DictionaryServices
    0x7fff2fbe4000 -     0x7fff2fbecff7  com.apple.CoreServices.FSEvents (1268.100.1 - 1268.100.1)  /System/Library/Frameworks/CoreServices.framework/Versions/A/Frameworks/FSEvents.framework/Versions/A/FSEvents
    0x7fff2fbed000 -     0x7fff2fe27ff6  com.apple.LaunchServices (1069.24 - 1069.24) <9A5359D9-9148-3B18-B868-56A9DA5FB60C> /System/Library/Frameworks/CoreServices.framework/Versions/A/Frameworks/LaunchServices.framework/Versions/A/LaunchServices
    0x7fff2fe28000 -     0x7fff2fec0ff1  com.apple.Metadata (10.7.0 - 2076.7) <0973F7E5-D58C-3574-A3CE-4F12CAC2D4C7> /System/Library/Frameworks/CoreServices.framework/Versions/A/Frameworks/Metadata.framework/Versions/A/Metadata
    0x7fff2fec1000 -     0x7fff2feeefff  com.apple.CoreServices.OSServices (1069.24 - 1069.24) <0E4F48AD-402C-3E9D-9CA9-6DD9479B28F9> /System/Library/Frameworks/CoreServices.framework/Versions/A/Frameworks/OSServices.framework/Versions/A/OSServices
    0x7fff2feef000 -     0x7fff2ff56fff  com.apple.SearchKit (1.4.1 - 1.4.1) <2C5E1D85-E8B1-3DC5-91B9-E3EDB48E9369> /System/Library/Frameworks/CoreServices.framework/Versions/A/Frameworks/SearchKit.framework/Versions/A/SearchKit
    0x7fff2ff57000 -     0x7fff2ff7bff5  com.apple.coreservices.SharedFileList (131.4 - 131.4) <02DE0D56-E371-3EF5-9BC1-FA435451B412> /System/Library/Frameworks/CoreServices.framework/Versions/A/Frameworks/SharedFileList.framework/Versions/A/SharedFileList
    0x7fff307c1000 -     0x7fff307c7fff  com.apple.DiskArbitration (2.7 - 2.7) <0BBBB6A6-604D-368B-9943-50B8CE75D51D> /System/Library/Frameworks/DiskArbitration.framework/Versions/A/DiskArbitration
    0x7fff30b02000 -     0x7fff30ec7fff  com.apple.Foundation (6.9 - 1677.104) <7C69F845-F651-3193-8262-5938010EC67D> /System/Library/Frameworks/Foundation.framework/Versions/C/Foundation
    0x7fff3123b000 -     0x7fff312dfff3  com.apple.framework.IOKit (2.0.2 - 1726.140.1) <14223387-6F81-3976-8605-4BC2F253A93E> /System/Library/Frameworks/IOKit.framework/Versions/A/IOKit
    0x7fff34de8000 -     0x7fff34df4ffe  com.apple.NetFS (6.0 - 4.0) <4415F027-D36D-3B9C-96BA-AD22B44A4722> /System/Library/Frameworks/NetFS.framework/Versions/A/NetFS
    0x7fff379d7000 -     0x7fff379f3fff  com.apple.CFOpenDirectory (10.15 - 220.40.1) <7E6C88EB-3DD9-32B0-81FC-179552834FA9> /System/Library/Frameworks/OpenDirectory.framework/Versions/A/Frameworks/CFOpenDirectory.framework/Versions/A/CFOpenDirectory
    0x7fff379f4000 -     0x7fff379ffffd  com.apple.OpenDirectory (10.15 - 220.40.1) <4A92D8D8-A9E5-3A9C-942F-28576F6BCDF5> /System/Library/Frameworks/OpenDirectory.framework/Versions/A/OpenDirectory
    0x7fff3ad9c000 -     0x7fff3b0e5ff1  com.apple.security (7.0 - 59306.140.5)  /System/Library/Frameworks/Security.framework/Versions/A/Security
    0x7fff3b0e6000 -     0x7fff3b16effb  com.apple.securityfoundation (6.0 - 55236.60.1) <7C69DF47-4017-3DF2-B55B-712B481654CB> /System/Library/Frameworks/SecurityFoundation.framework/Versions/A/SecurityFoundation
    0x7fff3b19d000 -     0x7fff3b1a1ff8  com.apple.xpc.ServiceManagement (1.0 - 1) <2C62956C-F2D4-3EB0-86C7-EAA06331621A> /System/Library/Frameworks/ServiceManagement.framework/Versions/A/ServiceManagement
    0x7fff3be4d000 -     0x7fff3bec7ff7  com.apple.SystemConfiguration (1.19 - 1.19) <84F9B3BB-F7AF-3B7C-8CD0-D3C22D19619F> /System/Library/Frameworks/SystemConfiguration.framework/Versions/A/SystemConfiguration
    0x7fff3fe37000 -     0x7fff3fefcfe7  com.apple.APFS (1412.141.1 - 1412.141.1)  /System/Library/PrivateFrameworks/APFS.framework/Versions/A/APFS
    0x7fff41c07000 -     0x7fff41c16fd7  com.apple.AppleFSCompression (119.100.1 - 1.0) <466ABD77-2E52-36D1-8E39-77AE2CC61611> /System/Library/PrivateFrameworks/AppleFSCompression.framework/Versions/A/AppleFSCompression
    0x7fff433d7000 -     0x7fff433e0ff7  com.apple.coreservices.BackgroundTaskManagement (1.0 - 104)  /System/Library/PrivateFrameworks/BackgroundTaskManagement.framework/Versions/A/BackgroundTaskManagement
    0x7fff461e8000 -     0x7fff461f8ff3  com.apple.CoreEmoji (1.0 - 107.1) <7C2B3259-083B-31B8-BCDB-1BA360529936> /System/Library/PrivateFrameworks/CoreEmoji.framework/Versions/A/CoreEmoji
    0x7fff46838000 -     0x7fff468a2ff0  com.apple.CoreNLP (1.0 - 213)  /System/Library/PrivateFrameworks/CoreNLP.framework/Versions/A/CoreNLP
    0x7fff4771d000 -     0x7fff4774bffd  com.apple.CSStore (1069.24 - 1069.24)  /System/Library/PrivateFrameworks/CoreServicesStore.framework/Versions/A/CoreServicesStore
    0x7fff539a9000 -     0x7fff53a77ffd  com.apple.LanguageModeling (1.0 - 215.1)  /System/Library/PrivateFrameworks/LanguageModeling.framework/Versions/A/LanguageModeling
    0x7fff53a78000 -     0x7fff53ac0fff  com.apple.Lexicon-framework (1.0 - 72) <41F208B9-8255-3EC7-9673-FE0925D071D3> /System/Library/PrivateFrameworks/Lexicon.framework/Versions/A/Lexicon
    0x7fff53ac7000 -     0x7fff53accff3  com.apple.LinguisticData (1.0 - 353.18) <3B92F249-4602-325F-984B-D2DE61EEE4E1> /System/Library/PrivateFrameworks/LinguisticData.framework/Versions/A/LinguisticData
    0x7fff54e35000 -     0x7fff54e81fff  com.apple.spotlight.metadata.utilities (1.0 - 2076.7) <0237323B-EC78-3FBF-9FC7-5A1FE2B5CE25> /System/Library/PrivateFrameworks/MetadataUtilities.framework/Versions/A/MetadataUtilities
    0x7fff55938000 -     0x7fff55942fff  com.apple.NetAuth (6.2 - 6.2)  /System/Library/PrivateFrameworks/NetAuth.framework/Versions/A/NetAuth
    0x7fff5ebce000 -     0x7fff5ebdeff3  com.apple.TCC (1.0 - 1) <017AB27D-6821-303A-8FD2-6DAC795CC7AA> /System/Library/PrivateFrameworks/TCC.framework/Versions/A/TCC
    0x7fff622c1000 -     0x7fff622c3ff3  com.apple.loginsupport (1.0 - 1) <12F77885-27DC-3837-9CE9-A25EBA75F833> /System/Library/PrivateFrameworks/login.framework/Versions/A/Frameworks/loginsupport.framework/Versions/A/loginsupport
    0x7fff64de1000 -     0x7fff64e15fff  libCRFSuite.dylib (48) <5E5DE3CB-30DD-34DC-AEF8-FE8536A85E96> /usr/lib/libCRFSuite.dylib
    0x7fff64e18000 -     0x7fff64e22fff  libChineseTokenizer.dylib (34) <7F0DA183-1796-315A-B44A-2C234C7C50BE> /usr/lib/libChineseTokenizer.dylib
    0x7fff64eae000 -     0x7fff64eb0ff7  libDiagnosticMessagesClient.dylib (112)  /usr/lib/libDiagnosticMessagesClient.dylib
    0x7fff65384000 -     0x7fff65385fff  libSystem.B.dylib (1281.100.1) <0A6C8BA1-30FD-3D10-83FD-FF29E221AFFE> /usr/lib/libSystem.B.dylib
    0x7fff65412000 -     0x7fff65413fff  libThaiTokenizer.dylib (3) <4F4ADE99-0D09-3223-B7C0-C407AB6DE8DC> /usr/lib/libThaiTokenizer.dylib
    0x7fff6542b000 -     0x7fff65441fff  libapple_nghttp2.dylib (1.39.2) <07FEC48A-87CF-32A3-8194-FA70B361713A> /usr/lib/libapple_nghttp2.dylib
    0x7fff65476000 -     0x7fff654e8ff7  libarchive.2.dylib (72.140.1)  /usr/lib/libarchive.2.dylib
    0x7fff65586000 -     0x7fff65586ff3  libauto.dylib (187)  /usr/lib/libauto.dylib
    0x7fff6564c000 -     0x7fff6565cffb  libbsm.0.dylib (60.100.1) <00BFFB9A-2FFE-3C24-896A-251BC61917FD> /usr/lib/libbsm.0.dylib
    0x7fff6565d000 -     0x7fff65669fff  libbz2.1.0.dylib (44) <14CC4988-B6D4-3879-AFC2-9A0DDC6388DE> /usr/lib/libbz2.1.0.dylib
    0x7fff6566a000 -     0x7fff656bcfff  libc++.1.dylib (902.1) <59A8239F-C28A-3B59-B8FA-11340DC85EDC> /usr/lib/libc++.1.dylib
    0x7fff656bd000 -     0x7fff656d2ffb  libc++abi.dylib (902)  /usr/lib/libc++abi.dylib
    0x7fff656d3000 -     0x7fff656d3fff  libcharset.1.dylib (59) <72447768-9244-39AB-8E79-2FA14EC0AD33> /usr/lib/libcharset.1.dylib
    0x7fff656d4000 -     0x7fff656e5fff  libcmph.dylib (8)  /usr/lib/libcmph.dylib
    0x7fff656e6000 -     0x7fff656fdfd7  libcompression.dylib (87) <64C91066-586D-38C0-A2F3-3E60A940F859> /usr/lib/libcompression.dylib
    0x7fff659d7000 -     0x7fff659edff7  libcoretls.dylib (167) <770A5B96-936E-34E3-B006-B1CEC299B5A5> /usr/lib/libcoretls.dylib
    0x7fff659ee000 -     0x7fff659effff  libcoretls_cfhelpers.dylib (167) <940BF370-FD0C-30A8-AA05-FF48DA44FA4C> /usr/lib/libcoretls_cfhelpers.dylib
    0x7fff66115000 -     0x7fff66115fff  libenergytrace.dylib (21) <162DFCC0-8F48-3DD0-914F-FA8653E27B26> /usr/lib/libenergytrace.dylib
    0x7fff6613c000 -     0x7fff6613efff  libfakelink.dylib (149.1) <36146CB2-E6A5-37BB-9EE8-1B4034D8F3AD> /usr/lib/libfakelink.dylib
    0x7fff6614d000 -     0x7fff66152fff  libgermantok.dylib (24)  /usr/lib/libgermantok.dylib
    0x7fff6615d000 -     0x7fff6624dfff  libiconv.2.dylib (59) <18311A67-E4EF-3CC7-95B3-C0EDEE3A282F> /usr/lib/libiconv.2.dylib
    0x7fff6624e000 -     0x7fff664a5fff  libicucore.A.dylib (64260.0.1) <8AC2CB07-E7E0-340D-A849-186FA1F27251> /usr/lib/libicucore.A.dylib
    0x7fff664bf000 -     0x7fff664c0fff  liblangid.dylib (133) <30CFC08C-EF36-3CF5-8AEA-C1CB070306B7> /usr/lib/liblangid.dylib
    0x7fff664c1000 -     0x7fff664d9ff3  liblzma.5.dylib (16)  /usr/lib/liblzma.5.dylib
    0x7fff664f1000 -     0x7fff66598ff7  libmecab.dylib (883.11) <0D5BFD01-D4A7-3C8D-AA69-C329C1A69792> /usr/lib/libmecab.dylib
    0x7fff66599000 -     0x7fff667fbff1  libmecabra.dylib (883.11)  /usr/lib/libmecabra.dylib
    0x7fff66cc7000 -     0x7fff67143ff5  libnetwork.dylib (1880.120.4)  /usr/lib/libnetwork.dylib
    0x7fff671e4000 -     0x7fff67217fde  libobjc.A.dylib (787.1) <6DF81160-5E7F-3E31-AA1E-C875E3B98AF6> /usr/lib/libobjc.A.dylib
    0x7fff6722a000 -     0x7fff6722efff  libpam.2.dylib (25.100.1) <0502F395-8EE6-3D2A-9239-06FD5622E19E> /usr/lib/libpam.2.dylib
    0x7fff67231000 -     0x7fff67267ff7  libpcap.A.dylib (89.120.1)  /usr/lib/libpcap.A.dylib
    0x7fff6735f000 -     0x7fff67549ff7  libsqlite3.dylib (308.5) <35A2BD9F-4E33-30DE-A994-4AB585AC3AFE> /usr/lib/libsqlite3.dylib
    0x7fff6779a000 -     0x7fff6779dffb  libutil.dylib (57)  /usr/lib/libutil.dylib
    0x7fff6779e000 -     0x7fff677abff7  libxar.1.dylib (425.2)  /usr/lib/libxar.1.dylib
    0x7fff677b1000 -     0x7fff67893fff  libxml2.2.dylib (33.5)  /usr/lib/libxml2.2.dylib
    0x7fff67897000 -     0x7fff678bffff  libxslt.1.dylib (16.9) <34A45627-DA5B-37D2-9609-65B425E0010A> /usr/lib/libxslt.1.dylib
    0x7fff678c0000 -     0x7fff678d2ff3  libz.1.dylib (76) <793D9643-CD83-3AAC-8B96-88D548FAB620> /usr/lib/libz.1.dylib
    0x7fff68181000 -     0x7fff68186ff3  libcache.dylib (83)  /usr/lib/system/libcache.dylib
    0x7fff68187000 -     0x7fff68192fff  libcommonCrypto.dylib (60165.120.1)  /usr/lib/system/libcommonCrypto.dylib
    0x7fff68193000 -     0x7fff6819afff  libcompiler_rt.dylib (101.2) <49B8F644-5705-3F16-BBE0-6FFF9B17C36E> /usr/lib/system/libcompiler_rt.dylib
    0x7fff6819b000 -     0x7fff681a4ff7  libcopyfile.dylib (166.40.1) <3C481225-21E7-370A-A30E-0CCFDD64A92C> /usr/lib/system/libcopyfile.dylib
    0x7fff681a5000 -     0x7fff68237fdb  libcorecrypto.dylib (866.140.1) <60567BF8-80FA-359A-B2F3-A3BAEFB288FD> /usr/lib/system/libcorecrypto.dylib
    0x7fff68344000 -     0x7fff68384ff0  libdispatch.dylib (1173.100.2)  /usr/lib/system/libdispatch.dylib
    0x7fff68385000 -     0x7fff683bbfff  libdyld.dylib (750.6) <789A18C2-8AC7-3C88-813D-CD674376585D> /usr/lib/system/libdyld.dylib
    0x7fff683bc000 -     0x7fff683bcffb  libkeymgr.dylib (30)  /usr/lib/system/libkeymgr.dylib
    0x7fff683bd000 -     0x7fff683c9ff3  libkxld.dylib (6153.141.2.2) <30AACC57-2314-3863-94B2-64AB3E002B35> /usr/lib/system/libkxld.dylib
    0x7fff683ca000 -     0x7fff683caff7  liblaunch.dylib (1738.140.1)  /usr/lib/system/liblaunch.dylib
    0x7fff683cb000 -     0x7fff683d0ff7  libmacho.dylib (959.0.1)  /usr/lib/system/libmacho.dylib
    0x7fff683d1000 -     0x7fff683d3ff3  libquarantine.dylib (110.40.3)  /usr/lib/system/libquarantine.dylib
    0x7fff683d4000 -     0x7fff683d5ff7  libremovefile.dylib (48) <7C7EFC79-BD24-33EF-B073-06AED234593E> /usr/lib/system/libremovefile.dylib
    0x7fff683d6000 -     0x7fff683edff3  libsystem_asl.dylib (377.60.2) <1563EE02-0657-3B78-99BE-A947C24122EF> /usr/lib/system/libsystem_asl.dylib
    0x7fff683ee000 -     0x7fff683eeff7  libsystem_blocks.dylib (74) <0D53847E-AF5F-3ACF-B51F-A15DEA4DEC58> /usr/lib/system/libsystem_blocks.dylib
    0x7fff683ef000 -     0x7fff68476fff  libsystem_c.dylib (1353.100.2)  /usr/lib/system/libsystem_c.dylib
    0x7fff68477000 -     0x7fff6847affb  libsystem_configuration.dylib (1061.141.1) <0EE84C33-64FD-372B-974A-AF7A136F2068> /usr/lib/system/libsystem_configuration.dylib
    0x7fff6847b000 -     0x7fff6847efff  libsystem_coreservices.dylib (114)  /usr/lib/system/libsystem_coreservices.dylib
    0x7fff6847f000 -     0x7fff68487fff  libsystem_darwin.dylib (1353.100.2) <5B12B5DB-3F30-37C1-8ECC-49A66B1F2864> /usr/lib/system/libsystem_darwin.dylib
    0x7fff68488000 -     0x7fff6848ffff  libsystem_dnssd.dylib (1096.100.3)  /usr/lib/system/libsystem_dnssd.dylib
    0x7fff68490000 -     0x7fff68491ffb  libsystem_featureflags.dylib (17) <29FD922A-EC2C-3F25-BCCC-B58D716E60EC> /usr/lib/system/libsystem_featureflags.dylib
    0x7fff68492000 -     0x7fff684dfff7  libsystem_info.dylib (538) <8A321605-5480-330B-AF9E-64E65DE61747> /usr/lib/system/libsystem_info.dylib
    0x7fff684e0000 -     0x7fff6850cff7  libsystem_kernel.dylib (6153.141.2.2) <5CDBBC06-6CA6-3432-9FDA-681047866F3E> /usr/lib/system/libsystem_kernel.dylib
    0x7fff6850d000 -     0x7fff68554fff  libsystem_m.dylib (3178) <00F331F1-0D09-39B3-8736-1FE90E64E903> /usr/lib/system/libsystem_m.dylib
    0x7fff68555000 -     0x7fff6857cfff  libsystem_malloc.dylib (283.100.6) <8549294E-4C53-36EB-99F3-584A7393D8D5> /usr/lib/system/libsystem_malloc.dylib
    0x7fff6857d000 -     0x7fff6858affb  libsystem_networkextension.dylib (1095.140.2)  /usr/lib/system/libsystem_networkextension.dylib
    0x7fff6858b000 -     0x7fff68594ff7  libsystem_notify.dylib (241.100.2)  /usr/lib/system/libsystem_notify.dylib
    0x7fff68595000 -     0x7fff6859dfef  libsystem_platform.dylib (220.100.1) <009A7C1F-313A-318E-B9F2-30F4C06FEA5C> /usr/lib/system/libsystem_platform.dylib
    0x7fff6859e000 -     0x7fff685a8fff  libsystem_pthread.dylib (416.100.3) <62CB1A98-0B8F-31E7-A02B-A1139927F61D> /usr/lib/system/libsystem_pthread.dylib
    0x7fff685a9000 -     0x7fff685adff3  libsystem_sandbox.dylib (1217.141.2) <051C4018-4345-3034-AC98-6DE42FB8273B> /usr/lib/system/libsystem_sandbox.dylib
    0x7fff685ae000 -     0x7fff685b0fff  libsystem_secinit.dylib (62.100.2)  /usr/lib/system/libsystem_secinit.dylib
    0x7fff685b1000 -     0x7fff685b8ffb  libsystem_symptoms.dylib (1238.120.1) <5820A2AF-CE72-3AB3-ABCC-273A3419FB55> /usr/lib/system/libsystem_symptoms.dylib
    0x7fff685b9000 -     0x7fff685cfff2  libsystem_trace.dylib (1147.120) <04B47629-847B-3D74-8ABE-C05EF9DEEFE4> /usr/lib/system/libsystem_trace.dylib
    0x7fff685d1000 -     0x7fff685d6ff7  libunwind.dylib (35.4) <42B7B509-BAFE-365B-893A-72414C92F5BF> /usr/lib/system/libunwind.dylib
    0x7fff685d7000 -     0x7fff6860cffe  libxpc.dylib (1738.140.1) <3E243A41-030F-38E3-9FD2-7B38C66C35B1> /usr/lib/system/libxpc.dylib
Sample analysis of process 40325 written to file /dev/stdout

Minimal Reproduction

The code below, when compiled with go build and then executed, should immediately hang with 100% processor utilization on a single thread when run on a CPU missing either the AVX512_BITALG or AVX512_VPOPCNTDQ CPUID feature flag (which, I believe, at the time of this writing is all Apple Macs; edited to add: except the top-of-the-line 13" MacBook Pros powered by 10th Gen Core (Ice Lake) CPUs).

main.go

package main

func popcnt()  // assembly stub

func main() {
    popcnt()
}

popcnt_amd64.s

// +build !gccgo,!purego

#include "textflag.h"

// func popcnt() 
TEXT ·popcnt(SB), NOSPLIT, $0-0

// This instruction causes the Go runtime to immediately hang at 100% CPU utilization
VPOPCNTW Z1, Z0   // Requires AVX512_BITALG

// Or equivalently, so does this one
VPOPCNTQ Z1, Z0  // Requires AVX512_VPOPCNTDQ

RET

randall77 commented 4 years ago

Strange.

I get the same behavior from C.

main.c:

void foo();
int main(int argc, char *argv[]) {
  foo();
}

main.s:

    .globl _foo
_foo:
    vpopcntw    %zmm1, %zmm0
    ret

This program also hangs. Compile with gcc main.c main.s, run with ./a.out. So I think this is an OSX bug, not a Go bug.

Nothing obvious shows up when run under a debugger. The debugger runs it forever, and every time I interrupt it, it is at the vpopcntw instruction.

randall77 commented 4 years ago

The same C code generates an illegal instruction fault on Linux, so chances are it isn't the chip (although my Mac and Linux boxes don't have exactly the same chip).

vsivsi commented 4 years ago

The Darwin kernel has a semi-spooky 2-tier AVX512 thread "promotion" mechanism that involves trapping AVX512 instruction faults, changing thread status to support AVX512, and then rerunning the offending instruction. In theory this scheme should only happen once per process thread upon encountering the first AVX512 instruction. The purpose is to avoid the large additional thread state required for AVX512 (around 2KB) when it is not needed. I would assume that it would only try this promotion procedure once per thread, such that if the AVX512 instruction causing the fault still isn't supported after enabling AVX512 in the thread state, that fault should revert to the process. But I'm way out over my skis on this kind of stuff... Here's the Darwin reference:

https://github.com/apple/darwin-xnu/blob/0a798f6738bc1db01281fc08ae024145e84df927/osfmk/i386/fpu.c#L176
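To make the "promote once, then give up" expectation concrete, here is a small Go model of it. This is only a sketch of the behavior assumed in the comment above, not a description of what XNU actually does; the thread type, the action values, and onInvalidOpcode are all hypothetical names.

```go
package main

import "fmt"

// thread is a hypothetical stand-in for per-thread kernel state.
type thread struct {
	avx512Promoted bool // set once the larger AVX512 state has been enabled
}

type action string

const (
	promoteAndRetry action = "promote-and-retry"
	deliverSIGILL   action = "deliver-SIGILL"
)

// onInvalidOpcode models the expected policy: the first fault on a thread
// triggers promotion and a retry; any later fault is passed to the process.
func onInvalidOpcode(t *thread) action {
	if !t.avx512Promoted {
		t.avx512Promoted = true
		return promoteAndRetry
	}
	return deliverSIGILL
}

func main() {
	t := &thread{}
	// An instruction the CPU does not implement faults on every attempt,
	// so the second fault should end the retry cycle.
	for {
		a := onInvalidOpcode(t)
		fmt.Println(a)
		if a == deliverSIGILL {
			break
		}
	}
}
```

Under this model an unsupported instruction faults at most twice per thread. The hang reported here looks like what would happen if the retry step were repeated without ever reaching the second branch, though that is a guess rather than a confirmed diagnosis.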

randall77 commented 4 years ago

I've submitted a bug to Apple, reference number FB8902463. Their bug reporting tool isn't really public, so I'll report back here if they say anything (which they usually don't, they just silently ignore them).

vsivsi commented 3 years ago

Related to this issue, it appears that on MacOS, the golang.org/x/sys/cpu package does not properly recognize Macs that support AVX512 instructions, due to the Darwin "AVX512 thread promotion" mechanism I mentioned above. Specifically, this code incorrectly assumes that OS-disabled XSAVE AVX512 thread state can't be changed.

https://github.com/golang/sys/blob/master/cpu/cpu_x86.go#L90

I'm working on a separate issue for this that I'll link here as well.
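For context, OS support for AVX512 state is conventionally read from the XCR0 register: bits 1 and 2 cover SSE and AVX state, and bits 5, 6 and 7 cover the opmask and ZMM state. The sketch below shows that style of check in Go, assuming the linked code works along these lines; osSavesAVX512State is a hypothetical name, and the XCR0 values are passed in as plain integers rather than read from hardware.

```go
package main

import "fmt"

// isSet reports whether the given bit is set in v.
func isSet(bit uint, v uint32) bool {
	return v&(1<<bit) != 0
}

// osSavesAVX512State (hypothetical name) reports whether an XCR0 value
// indicates that the OS saves SSE/AVX state (bits 1, 2) and the AVX512
// opmask, ZMM_Hi256 and Hi16_ZMM state (bits 5, 6, 7).
func osSavesAVX512State(xcr0 uint32) bool {
	avx := isSet(1, xcr0) && isSet(2, xcr0)
	return avx && isSet(5, xcr0) && isSet(6, xcr0) && isSet(7, xcr0)
}

func main() {
	// 0x07: x87, SSE and AVX enabled, AVX512 state bits clear.
	fmt.Println(osSavesAVX512State(0x07))
	// 0xE7: the same plus the three AVX512 state bits.
	fmt.Println(osSavesAVX512State(0xE7))
}
```

If Darwin leaves the AVX512 bits clear until a thread is promoted, a check of this shape made at startup would report no AVX512 support even on hardware that has it, which matches the misdetection described above.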