mumax / 3

GPU-accelerated micromagnetic simulator

No output on certain Tesla GPU cards #158

Closed Dmytro-Apalkov closed 6 years ago

Dmytro-Apalkov commented 6 years ago

I am testing the mumax code and running some simple STT switching cases. I have noticed that the simulation on certain GPU cards (e.g. P40, P100) does not give any output or error (table.txt has only one line for t=0), whereas the same simulation runs fine on other GPU cards (e.g. M40). In all cases, CUDA is 7.5 and I am using the precompiled library.

Any suggestions?

godsic commented 6 years ago

Please attach the log file or the output you see in the console.

Dmytro-Apalkov commented 6 years ago

Here is the content of log.txt. This run uses the T != 0 case, but I see similar behavior for T = 0.

//output directory: stt.out/
sizeX := 40e-9
sizeY := sizeX
sizeZ := 1.4e-9
N := 32
setgridsize(N, N, 1)
setcellsize(sizeX/N, sizeY/N, sizeZ)
setGeom(circle(sizeX))
Msat = 800e3
Aex = 10e-12
Ku1 = 12e6
alpha = 0.01
m = uniform(0, 0, 1)
lambda = 1
Pol = 0.5669
epsilonprime = 0
Temp = 300
fixdt = 2e-14
setsolver(2)
ThermSeed(1)
fixedlayer = vector(0.01, 0, 1)
Jtot := -0.0014
area := sizeX * sizeY * pi / 4
jc := Jtot / area
J = vector(0, 0, jc)
autosave(m, 100e-12)
tableautosave(10e-12)
run(1e-9)

Dmytro-Apalkov commented 6 years ago

If it helps, the above script works fine on K40m, K80, M40 GPUs but does not work on P100 and P40. Thanks!

kkingstoun commented 6 years ago

Sorry for going off-topic, but could you run the benchmark on your GPUs and paste the results here?

https://github.com/mumax/3/tree/master/bench

godsic commented 6 years ago

@Dmytro-Apalkov Thanks. Could you please also post the mumax3 output that appears in the console, along with the output of the nvidia-smi command? It should tell us which GPU driver / kernel versions are used upon mumax3 invocation.

Dmytro-Apalkov commented 6 years ago

Sure, kkingstoun, will paste the results to the comparison shortly...

Dmytro-Apalkov commented 6 years ago

@godsic The output of mumax is below. The execution time seems OK, the code seems to be running (or doing something) but there is no output.

OUTPUT of MUMAX:

mumax3 -gpu 6 test.mx
//mumax 3.9.3 linux_amd64 go1.7.1 (gc)
//CUDA 9000 Tesla P40(24445MB) cc6.1, using CC53 PTX
//(c) Arne Vansteenkiste, Dynamat LAB, Ghent University, Belgium
//This is free software without any warranty. See license.txt
//output directory: test.out/
//starting GUI at http://127.0.0.1:35367
sizeX := 40e-9
sizeY := sizeX
sizeZ := 1.4e-9
N := 32
setgridsize(N, N, 1)
setcellsize(sizeX/N, sizeY/N, sizeZ)
setGeom(circle(sizeX))
// Initializing geometry 3 %
// Initializing geometry 100 %
Msat = 800e3
Aex = 10e-12
Ku1 = 12e6
alpha = 0.01
m = uniform(0, 0, 1)
lambda = 1
Pol = 0.5669
epsilonprime = 0
Temp = 300
fixdt = 2e-14
setsolver(2)
ThermSeed(15)
fixedlayer = vector(0.01, 0, 1)
Jtot := -0.0014
area := sizeX * sizeY * pi / 4
jc := Jtot / area
J = vector(0, 0, jc)
autosave(m, 100e-12)
tableautosave(10e-12)
run(1e-9)
//Not using kernel cache (-cache="")

OUTPUT of NVIDIA-SMI:

nvidia-smi
Fri Feb 9 07:35:26 2018
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 384.81                 Driver Version: 384.81                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla P40           Off  | 00000000:08:00.0 Off |                  Off |
| N/A   28C    P0    50W / 250W |   1255MiB / 24445MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

godsic commented 6 years ago

@Dmytro-Apalkov Thanks! Will you be able to try mumax3 binary linked against CUDA 9.1? If so, I will provide you with the link to download it.

Dmytro-Apalkov commented 6 years ago

@godsic 9.1 is not really an option. I have environments set up for 8.0, 8.5, and 9.0.

Dmytro-Apalkov commented 6 years ago

@kkingstoun, I have added benchmarks for 3 Tesla cards.

kkingstoun commented 6 years ago

@Dmytro-Apalkov Thank You! You have nice toys ;)

Dmytro-Apalkov commented 6 years ago

@godsic One of the failing runs on P100 gave this error:

panic: CURAND_STATUS_LAUNCH_FAILURE

goroutine 1 [running, locked to thread]:
panic(0x8606e0, 0xc420144288)
    /home/arne/bin/go/src/runtime/panic.go:500 +0x1a1
github.com/mumax/3/cuda/curand.Generator.GenerateNormal(0x2b6405fafc30, 0x1090e40a000, 0x400, 0x3f80000000000000)
    /home/arne/src/github.com/mumax/3/cuda/curand/generator.go:41 +0xf8
github.com/mumax/3/engine.(*thermField).update(0xe66980)
    /home/arne/src/github.com/mumax/3/engine/temperature.go:98 +0x23f
github.com/mumax/3/engine.(*thermField).AddTo(0xe66980, 0xc42007c8c0)
    /home/arne/src/github.com/mumax/3/engine/temperature.go:50 +0x50
github.com/mumax/3/engine.SetEffectiveField(0xc42007c8c0)
    /home/arne/src/github.com/mumax/3/engine/effectivefield.go:17 +0x94
github.com/mumax/3/engine.SetLLTorque(0xc42007c8c0)
    /home/arne/src/github.com/mumax/3/engine/torque.go:48 +0x2f
github.com/mumax/3/engine.SetTorque(0xc42007c8c0)
    /home/arne/src/github.com/mumax/3/engine/torque.go:41 +0x2b
github.com/mumax/3/engine.torqueFn(0xc42007c8c0)
    /home/arne/src/github.com/mumax/3/engine/run.go:93 +0x2b
github.com/mumax/3/engine.(*Heun).Step(0xf49110)
    /home/arne/src/github.com/mumax/3/engine/heun.go:26 +0x119
github.com/mumax/3/engine.step(0x843001)
    /home/arne/src/github.com/mumax/3/engine/run.go:196 +0x39
github.com/mumax/3/engine.runWhile(0xc420071990, 0xc4211cb801)
    /home/arne/src/github.com/mumax/3/engine/run.go:181 +0x94
github.com/mumax/3/engine.RunWhile(0xc420071990)
    /home/arne/src/github.com/mumax/3/engine/run.go:172 +0x3c
github.com/mumax/3/engine.Run(0x3e112e0be826d695)
    /home/arne/src/github.com/mumax/3/engine/run.go:158 +0x57
reflect.Value.call(0x845e80, 0xa69ed0, 0x13, 0x8e89d5, 0x4, 0xc4211cb900, 0x1, 0x1, 0x13, 0x845e80, ...)
    /home/arne/bin/go/src/reflect/value.go:434 +0x5c8
reflect.Value.Call(0x845e80, 0xa69ed0, 0x13, 0xc4211cb900, 0x1, 0x1, 0xa, 0x0, 0x0)
    /home/arne/bin/go/src/reflect/value.go:302 +0xa4
github.com/mumax/3/script.(*call).Eval(0xc420142f90, 0x1, 0x1)
    /home/arne/src/github.com/mumax/3/script/call.go:61 +0x1c7
github.com/mumax/3/engine.EvalFile(0xc4201428d0)
    /home/arne/src/github.com/mumax/3/engine/script.go:102 +0x13e
main.runFileAndServe(0x7fffee9f8266, 0x7)
    /home/arne/src/github.com/mumax/3/cmd/mumax3/main.go:144 +0x151
main.main()
    /home/arne/src/github.com/mumax/3/cmd/mumax3/main.go:89 +0x1ce

godsic commented 6 years ago

@Dmytro-Apalkov Indeed, the error you see is common when the GPU driver, the CUDA library versions, or the mumax3 CUDA kernels are not appropriate for the particular GPU. Here you can download a mumax3 binary compiled from the master branch and linked against CUDA 9.0.

Dmytro-Apalkov commented 6 years ago

@godsic , Thank you! I will test it out.

Dmytro-Apalkov commented 6 years ago

@godsic Sorry for being away for quite some time. I was busy with something else. Anyway, I have just tested the version that you compiled for CUDA9.0. It works just fine on all the cards. Thank you!