mumax / 3

GPU-accelerated micromagnetic simulator

Different result on Titan Xp (windows) and gtx 780 Ti (Linux) - MuMax3.10β #251

Closed OndrejW closed 4 years ago

OndrejW commented 4 years ago

Cheers,

The same .mx3 file gives me different results depending on whether I run it on a GTX 780 Ti hosted on a Linux server (Ubuntu 18.04) or on a Windows 10 PC with a Titan Xp. On the Titan Xp everything seems normal and the simulation finishes as expected, but with the same .mx3 file on the GTX 780 Ti (Linux) we get the following error after a few picoseconds:

panic: Time step too small, check if parameters are sensible

The .mx3 file is here: test.txt

Thanks in advance for any advice and your time

ddkn commented 4 years ago

Our group has a semi-related problem. We have one machine running Ubuntu 18.04 with CUDA 10.1 and a 1060 that is able to run our code on MuMax 3.10beta2. However, we have two other machines on which we have tried the following configurations:

  1. Debian buster, CUDA 10.0
  2. Windows 7, CUDA 10.0, MuMax 3.10beta2
  3. Windows 7, CUDA 10.1, MuMax 3.10beta2
  4. Windows 7, CUDA 8.0, MuMax 3.9.1 (This setup was upgraded to (2) and (3))

The Debian system has an NVIDIA GeForce 1660, and the Windows machine has an NVIDIA GeForce 1050 Ti. Both systems have 16 GB of RAM.

None of these other systems can run the same code: they crash immediately on entering the relaxation phase, with NaN issues (golang errors). I will attach the errors later, as I am not at the computer at the moment.

At the moment, I too am at a loss on how to proceed.

JeroenMulkers commented 4 years ago

It is possible to get different results on different machines. It is very hard (if not impossible) to ensure perfect reproducibility between GPU architectures, driver versions, etc. If the input script is numerically reasonable, however, these differences should be negligible.
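One general reason for such small differences is that floating-point addition is not associative, so summing the same numbers in a different order (as different hardware or drivers may do in parallel reductions) can change the last bits of the result. The snippet below is only a generic illustration in plain Python, not mumax3 code and not a diagnosis of this particular issue; the helper name `sum_in_order` is made up for the example.

```python
def sum_in_order(values):
    """Add the values one by one, left to right, in double precision."""
    total = 0.0
    for v in values:
        total += v
    return total


values = [0.1, 0.2, 0.3]

# Same numbers, two different summation orders.
forward = sum_in_order(values)
backward = sum_in_order(list(reversed(values)))

# The two sums differ in the last bit, so the equality check is False.
print(forward == backward)
print(abs(forward - backward))
```

In a well-conditioned simulation, deviations of this size stay small; in a badly conditioned one they can grow until the two runs visibly diverge.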

The attached script, however, has a cell size which is too large (larger than the exchange length). This makes it hard to know whether the reported issue is caused by a bug in mumax3 or by a non-sensible setting in the input script (which seems to me to be the case here). Therefore, I will close this issue, but feel free to re-open it if similar issues occur for another input script.
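For anyone who wants to check their own script against this criterion, here is a minimal sketch in plain Python (not mumax3 script). It uses the usual definition of the magnetostatic exchange length, l_ex = sqrt(2·A / (μ0·Ms²)). The material values and cell sizes are illustrative assumptions (permalloy-like numbers), not the ones from the attached test.txt, and the function names are made up for the example.

```python
import math

MU0 = 4e-7 * math.pi  # vacuum permeability in T*m/A


def exchange_length(aex, msat):
    """Exchange length in metres for exchange stiffness aex (J/m)
    and saturation magnetization msat (A/m)."""
    return math.sqrt(2.0 * aex / (MU0 * msat ** 2))


def cell_size_ok(cell_sizes, aex, msat):
    """True if every cell dimension (m) is no larger than the exchange length."""
    l_ex = exchange_length(aex, msat)
    return all(c <= l_ex for c in cell_sizes)


# Hypothetical material parameters, NOT taken from the attached script.
aex = 13e-12   # J/m
msat = 800e3   # A/m

l_ex = exchange_length(aex, msat)
print(f"exchange length: {l_ex * 1e9:.2f} nm")

# Hypothetical cell sizes (x, y, z) in metres.
print(cell_size_ok((4e-9, 4e-9, 4e-9), aex, msat))     # cells below l_ex
print(cell_size_ok((10e-9, 10e-9, 5e-9), aex, msat))   # x and y exceed l_ex
```

With these assumed values the exchange length comes out a little under 6 nm, so a 10 nm cell would fail the check while a 4 nm cell would pass.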