davidADSP / Generative_Deep_Learning_2nd_Edition

The official code repository for the second edition of the O'Reilly book Generative Deep Learning: Teaching Machines to Paint, Write, Compose and Play.
https://www.oreilly.com/library/view/generative-deep-learning/9781098134174/
Apache License 2.0

docker-compose.gpu.yml error upon running #35

Open young2code opened 7 months ago

young2code commented 7 months ago

I'm on Windows with an NVIDIA 4070 and I see the error below when I try to launch Docker with docker-compose.gpu.yml. Any idea how to resolve this?

PS C:\Projects\GenerativeDeepLearning> docker compose -f docker-compose.gpu.yml up
[+] Running 2/0
 ✔ Network generativedeeplearning_default Created 0.0s
 ✔ Container generativedeeplearning-app-1 Created 0.1s
Attaching to app-1
Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #0: error running hook: exit status 2, stdout: , stderr: fatal error: unexpected signal during runtime execution
[signal SIGSEGV: segmentation violation code=0x1 addr=0x8 pc=0x7fcd133aad54]

runtime stack:
runtime.throw({0x5286a1?, 0x6d?})
    /usr/local/go/src/runtime/panic.go:992 +0x71
runtime.sigpanic()
    /usr/local/go/src/runtime/signal_unix.go:802 +0x389

goroutine 1 [syscall]:
runtime.cgocall(0x4f48d0, 0xc00017d958)
    /usr/local/go/src/runtime/cgocall.go:157 +0x5c fp=0xc00017d930 sp=0xc00017d8f8 pc=0x40523c
github.com/NVIDIA/go-nvml/pkg/dl._Cfunc_dlopen(0x9c8820, 0x1)
    _cgo_gotypes.go:113 +0x4d fp=0xc00017d958 sp=0xc00017d930 pc=0x4ee78d
github.com/NVIDIA/go-nvml/pkg/dl.(DynamicLibrary).Open(0xc00017da30)
    /go/src/nvidia-container-toolkit/vendor/github.com/NVIDIA/go-nvml/pkg/dl/dl.go:55 +0x74 fp=0xc00017d9d0 sp=0xc00017d958 pc=0x4ee994
gitlab.com/nvidia/cloud-native/go-nvlib/pkg/nvlib/info.(infolib).HasNvml(0xc00012c1e0?)
    /go/src/nvidia-container-toolkit/vendor/gitlab.com/nvidia/cloud-native/go-nvlib/pkg/nvlib/info/info.go:47 +0x85 fp=0xc00017da68 sp=0xc00017d9d0 pc=0x4eed85
github.com/NVIDIA/nvidia-container-toolkit/internal/info.ResolveAutoMode({0x54f5c8, 0x6333e0}, {0xc000138157?, 0x52974f?})
    /go/src/nvidia-container-toolkit/internal/info/auto.go:42 +0x1bb fp=0xc00017db18 sp=0xc00017da68 pc=0x4ef53b
main.doPrestart()
    /go/src/nvidia-container-toolkit/cmd/nvidia-container-runtime-hook/main.go:77 +0xdd fp=0xc00017df08 sp=0xc00017db18 pc=0x4f2e7d
main.main()
    /go/src/nvidia-container-toolkit/cmd/nvidia-container-runtime-hook/main.go:176 +0x11e fp=0xc00017df80 sp=0xc00017df08 pc=0x4f43de
runtime.main()
    /usr/local/go/src/runtime/proc.go:250 +0x212 fp=0xc00017dfe0 sp=0xc00017df80 pc=0x4368d2
runtime.goexit()
    /usr/local/go/src/runtime/asm_amd64.s:1571 +0x1 fp=0xc00017dfe8 sp=0xc00017dfe0 pc=0x460981: unknown

svrc commented 5 months ago

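Have you tried running it from a terminal inside a WSL2 instance? The trace shows the NVIDIA container runtime hook crashing with a SIGSEGV while it loads the NVML library, so this looks like the underlying NVIDIA library panicking rather than a problem with the compose file. Docker Desktop might not be able to reach the NVIDIA card unless it's running in WSL2 mode.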
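
A minimal sketch of how the WSL2 suggestion could be checked, meant to be run from a shell inside a WSL2 distribution rather than from PowerShell. It assumes Docker Desktop is using its WSL2 backend with integration enabled for that distribution; the function name `check_gpu_in_wsl` is made up for illustration, and the `ubuntu` image is only a convenient small image, not something the repository specifies.

```shell
#!/bin/sh
# Hypothetical helper: checks whether containers can see the GPU from this shell.
check_gpu_in_wsl() {
    # The Docker CLI must be reachable from this shell.
    if ! command -v docker >/dev/null 2>&1; then
        echo "docker is not available in this shell" >&2
        return 1
    fi

    # Optional host-side check: does this shell see the GPU at all?
    if command -v nvidia-smi >/dev/null 2>&1; then
        nvidia-smi || echo "nvidia-smi failed on the host side" >&2
    else
        echo "nvidia-smi not found on PATH; skipping host check" >&2
    fi

    # Container-side check: request all GPUs and run nvidia-smi in a container.
    # The function's exit status is the exit status of this command.
    docker run --rm --gpus all ubuntu nvidia-smi
}
```

Calling `check_gpu_in_wsl` and getting the usual `nvidia-smi` table from the container would suggest GPU passthrough works, in which case `docker compose -f docker-compose.gpu.yml up` is worth retrying from the same shell. If the container check fails with the same hook error, the problem is more likely in the driver or Docker Desktop setup than in this repository.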