mumax / 3

GPU-accelerated micromagnetic simulator

Cuda_Launch_Error for large system sizes #217

Open JLauzier opened 5 years ago

JLauzier commented 5 years ago

We recently upgraded to a 1070 Ti. I've been testing it with the hysteresis.mx3 script from the examples page, but with a large system. For large grids (larger than SetGridsize(2048, 2048, 1)) it runs into CUDA problems, either a Cuda_Launch_Error or a sync error. The run errors out, typically after a few steps in the hysteresis loop using minimize() or relax(). The system is large, but well within the memory limits of the card (~6 GB or less out of 8 GB, so not full even with ~1 GB of Windows overhead). I've attached a picture of the error message.

Passing the --sync flag seems to fix the issue in my limited testing, but it's listed as a debug feature only, so I'm not sure I should be doing that.

Is this intended?

*Note: I'm not planning on actually running anything so large. I was just trying to see what the card could handle, but I wasn't expecting any errors.

[Attached image: error]

godsic commented 5 years ago

@JLauzier Could you please check if https://raw.githubusercontent.com/mumax/3/master/bench/bench.mx3 fails as well?

JLauzier commented 5 years ago

Hi Mykola,

bench.mx3 passes for sizes up to (4096,4096,1). It fails at (8192,8192,1) after calculating the demag kernel, as expected, because that system size is too large for the card (8 GB), at least with the memory requirements of the Heun solver.

I modified bench.mx3 to run grid sizes between 4096 and 8192 (albeit not powers of 2) for 1000 steps, and it seems completely fine up to ~4296, which is just under 8 GB. It fails at 4396, again because of the memory limit.
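For a rough sense of scale, the grid sizes reported above can be turned into cell counts and a per-cell memory budget. This is only a back-of-envelope sketch, not how mumax3 accounts for memory: it assumes "8 GB" means 8 GiB, that the whole card is usable, and that the footprint scales linearly with the number of cells. The helper names `cell_count` and `budget_per_cell` are made up for illustration.

```python
# Back-of-envelope: cell counts and the per-cell memory budget on an 8 GB card.
# Assumptions (not from mumax3): 8 GB is read as 8 GiB, all of it is usable,
# and memory use grows linearly with the number of cells.

GIB = 1024 ** 3
CARD_BYTES = 8 * GIB


def cell_count(nx, ny, nz=1):
    """Total number of cells in an nx x ny x nz grid."""
    return nx * ny * nz


def budget_per_cell(nx, ny, nz=1, card_bytes=CARD_BYTES):
    """Bytes available per cell if the entire card served this one grid."""
    return card_bytes / cell_count(nx, ny, nz)


if __name__ == "__main__":
    # Grid edge lengths mentioned in this thread (all with nz = 1).
    for n in (2048, 4096, 4296, 4396, 8192):
        cells = cell_count(n, n)
        budget = budget_per_cell(n, n)
        print(f"{n:>5} x {n:<5} {cells:>11,d} cells  {budget:8.1f} bytes/cell")
```

Under these assumptions, a run that fits at 4296 but not at 4396 would imply a footprint somewhere between roughly 445 and 465 bytes per cell for the modified benchmark. The real figure depends on the solver and on fixed allocations that this sketch ignores.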

I'll attach the output of bench.mx3 in case it's useful, but the Heun solver seems stable.

benchmark.txt mykola1 mykola2 mykola3 mykola4 benchmarkmod.txt benchmarkmodoutput.txt