mattoaellis closed 3 years ago
Thanks Matt, though I am puzzled why changing the exchange routines should give different answers. I would have expected the outcome to be identical?
Hi Richard, the GPU and CPU results are slightly different because the run starts from a thermal state, so I guess the different RNGs for CUDA and CPU give some small variation in the starting configuration. The CUSPARSE exchange routines give a similar result if you compare to the cusp results in the previous pull request here https://github.com/richard-evans/vampire/pull/20 (note that CUDA is on the right in this comment thread but on the left in the old one). They show similar shapes, but I must have had a different system size as the number of pixels is different. Maybe we can try a different test starting from the same thermalised ensemble generated by the CPU?
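To illustrate the RNG point, here is a minimal CPU-only sketch of why two runs only agree exactly when they start from the same configuration. It is not taken from the vampire source: `random_spins` and `max_abs_diff` are hypothetical helpers, and uniformly random unit vectors stand in for a properly thermalised ensemble.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

using Spin = std::array<double, 3>;

// Draw n random unit vectors from a seeded generator.
// Assumption: this is only a stand-in for a thermalised starting state.
std::vector<Spin> random_spins(std::size_t n, unsigned seed) {
    std::mt19937 rng(seed);
    std::normal_distribution<double> gauss(0.0, 1.0);
    std::vector<Spin> spins(n);
    for (auto& s : spins) {
        double norm = 0.0;
        do {  // redraw in the (vanishingly rare) case of a zero vector
            for (double& c : s) c = gauss(rng);
            norm = std::sqrt(s[0] * s[0] + s[1] * s[1] + s[2] * s[2]);
        } while (norm == 0.0);
        for (double& c : s) c /= norm;
    }
    return spins;
}

// Largest component-wise difference between two configurations.
double max_abs_diff(const std::vector<Spin>& a, const std::vector<Spin>& b) {
    assert(a.size() == b.size());
    double d = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i)
        for (int k = 0; k < 3; ++k)
            d = std::max(d, std::fabs(a[i][k] - b[i][k]));
    return d;
}
```

With the same seed, `max_abs_diff(random_spins(n, 1), random_spins(n, 1))` is zero; with different seeds it is not. Saving the CPU-generated configuration and loading it on both sides would remove this source of variation from the comparison.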
Hi Matt, OK awesome - I'll do the check and integrate everything - for some reason I thought the comparison was between GPU versions. Thanks!
Hi Richard,
I noticed another issue inquiring about the status of the CUDA upgrade and I was wondering if you had had time to look over this pull request?
Cheers, Matt
Thanks for the prompt Matt - now all merged! Cheers, Richard
Hi Matt, I think there is a bug in the exchange calculation. I have fixed some compiler bugs in the cuda branch, so this now works out of the box with CUDA 10, but if you set the exchange to zero then it complains about argument 5 of the call to cusparseCreateCsr():
** On entry to cusparseCreateCsr() parameter number 5 (csrColInd) had an illegal value
Failed to initialise sparse matrix descriptor!
The simulation also gives incorrect results for 100K atoms at RT, since the magnetization goes to zero. The code segment itself seems innocuous enough, so I am a little bit puzzled. I also have a project student to work on the CUDA version this year, and hopefully add in MPI functionality too. If you have a moment, could you possibly take a quick peek to see if you can see any obvious issue? If not, not to worry, I'll have another look as soon as possible.
Cheers,
Richard
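The thread does not identify the cause of the csrColInd error. One possibility, stated here purely as an assumption, is that zero exchange leaves the matrix with no non-zero entries, so the column-index array is empty when the descriptor is created. The CPU-only sketch below shows a guard for that case; `Interaction`, `CsrMatrix`, `build_csr` and `has_entries` are hypothetical names, not vampire code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical neighbour-list entry: atom i couples to atom j with strength Jij.
struct Interaction { int i; int j; double Jij; };

struct CsrMatrix {
    int rows = 0;
    std::vector<int> row_ptr;    // size rows + 1
    std::vector<int> col_ind;    // size nnz
    std::vector<double> values;  // size nnz
    std::size_t nnz() const { return values.size(); }
};

// Build a CSR matrix from an unsorted interaction list, dropping zero couplings.
// Assumption: zero couplings are not stored; the real code may behave differently.
CsrMatrix build_csr(int rows, const std::vector<Interaction>& list) {
    CsrMatrix m;
    m.rows = rows;
    m.row_ptr.assign(rows + 1, 0);
    for (const auto& e : list) {  // pass 1: count entries per row
        assert(e.i >= 0 && e.i < rows && e.j >= 0 && e.j < rows);
        if (e.Jij != 0.0) m.row_ptr[e.i + 1] += 1;
    }
    for (int r = 0; r < rows; ++r) m.row_ptr[r + 1] += m.row_ptr[r];  // prefix sum
    m.col_ind.resize(m.row_ptr[rows]);
    m.values.resize(m.row_ptr[rows]);
    std::vector<int> cursor(m.row_ptr.begin(), m.row_ptr.end() - 1);
    for (const auto& e : list) {  // pass 2: fill each row in list order
        if (e.Jij == 0.0) continue;
        const int k = cursor[e.i]++;
        m.col_ind[k] = e.j;
        m.values[k] = e.Jij;
    }
    return m;
}

// Guard: only create a sparse descriptor (and run the product) when this is true.
bool has_entries(const CsrMatrix& m) { return m.nnz() > 0; }
```

If this guess is right, checking `has_entries` before creating the descriptor and skipping the exchange product otherwise would avoid the failure. If the arrays turn out to be non-empty, the cause lies elsewhere.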
Hi Richard,
That's odd. I'll take a quick look. I checked it with the tensor exchange so it must be some separate part of the vector or scalar exchange.
Cheers, Matt
I have swapped out the older cusp sparse matrix vector routines for the cusparse generalised interface ones. It compiles and I get results matching the previous cusp implementation following the DMI test. There are still errors with the GPU stats calculation but with the
gpu:calculate-statistics-on-cpu=true
flag the magnetisation appears close to the CPU version.
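For checking a GPU sparse matrix-vector routine against a known-good answer, a plain CPU reference can help. The sketch below is not the cuSPARSE call used in this pull request. It assumes the exchange field can be written as a CSR matrix times a vector of spin components, and `csr_spmv` and `close_enough` are hypothetical helper names.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Reference y = A * x for a matrix A stored in CSR form.
std::vector<double> csr_spmv(const std::vector<int>& row_ptr,
                             const std::vector<int>& col_ind,
                             const std::vector<double>& values,
                             const std::vector<double>& x) {
    assert(!row_ptr.empty());
    assert(col_ind.size() == values.size());
    const std::size_t rows = row_ptr.size() - 1;
    std::vector<double> y(rows, 0.0);
    for (std::size_t i = 0; i < rows; ++i)
        for (int k = row_ptr[i]; k < row_ptr[i + 1]; ++k)
            y[i] += values[k] * x[col_ind[k]];
    return y;
}

// Element-wise comparison with an absolute tolerance, since GPU and CPU sums
// can differ in the last bits because of summation order.
bool close_enough(const std::vector<double>& a, const std::vector<double>& b,
                  double tol) {
    if (a.size() != b.size()) return false;
    for (std::size_t i = 0; i < a.size(); ++i)
        if (std::fabs(a[i] - b[i]) > tol) return false;
    return true;
}
```

Copying the GPU output back to the host and passing both vectors to `close_enough` gives a per-step check that does not depend on the statistics code.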