ifelsefi closed this issue 5 years ago
Hi Douglas,
"Since" is inclusive of later versions, so RELION-3 will certainly use NVIDIA GPUs too (in fact, it should be faster than RELION-2, although we're still tuning a couple more things during the beta).
Cheers,
Erik
On Thu, Aug 2, 2018 at 2:31 PM Douglas Duckworth notifications@github.com wrote:
Hi
Our users would like us to try out Relion 3 https://bitbucket.org/scheres/relion-3.0_beta.git.
I am seeing the following statement:
`Parts of the cryo-EM processing pipeline can be very computationally demanding, and in some cases special hardware can be used to make these faster. There are two such cases at the moment;
Since RELION-2: Use one or more GPUs, or graphics cards. RELION only supports CUDA-capable GPUs of compute capability 3.5 or higher.
Since RELION-3: Use the vectorized version. RELION only supports GCC and ICC 2018.3 or later. There are more benefits than speed; the accelerated versions also have a decreased memory footprint. Details about how to enable either of these options are listed below.`
Does this mean that Relion 3 only supports CPU but not GPU?
-- Erik Lindahl erik.lindahl@dbb.su.se Professor of Biophysics, Dept. Biochemistry & Biophysics, Stockholm University Science for Life Laboratory, Box 1031, 17121 Solna, Sweden
Thank you!
We are buying more GPUs so want to make sure that's not a waste.
Will try Beta3 now...
Hi
After reading this article, it seems that the GPU and CPU versions of RELION 3 will branch entirely, and that AVX-512 instructions give the CPU a compelling advantage, considering box-size limitations and the price of professional-grade GPUs.
However, can you tell me whether RELION uses CUDA for FFT acceleration? I have heard on the RELION listserv that it does not. I am asking because V100 Tensor Cores are great at matrix multiplication, and thus at FFTs. Moreover, does RELION leverage the CUDA Unified Memory Architecture, which would allow oversubscribing GPU memory using system memory?
With regard to CUDA FFT (cuFFT):
We use them when possible. Each iteration produces a full-size output and therefore performs a full-size FFT, so the maximization-associated FFTs do not use CUDA; if they did, any large-box classification/refinement would fail immediately (at iteration 1). We could try a CUDA FFT and revert to non-CUDA when it's too big, but this feature has not made it into RELION yet. Every other FFT (such as those low-passed (cropped) to the current resolution in the expectation ops) uses CUDA FFTs.
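To make the box-size concern concrete, here is a back-of-the-envelope sketch in Python. It is not RELION code and does not reflect RELION's actual memory model: the function names, the `padding` parameter, and the `workspace_factor` guess are all illustrative assumptions. It only estimates the storage for one single-precision 3D real-to-complex transform.

```python
def fft_volume_bytes(box_size, padding=1, bytes_per_value=4):
    """Bytes for the complex output of one 3D real-to-complex FFT.

    A real-to-complex transform of an n*n*n volume stores
    n * n * (n // 2 + 1) complex values, each made of two floats.
    `padding` is a hypothetical multiplier on the box edge length.
    """
    n = box_size * padding
    return n * n * (n // 2 + 1) * 2 * bytes_per_value


def fits_on_gpu(box_size, free_gpu_bytes, padding=1, workspace_factor=2.0):
    """Rough check: does the transform plus scratch space fit in free memory?

    `workspace_factor` is an assumed allowance for the extra room an FFT
    library may need; the real requirement depends on the library and plan.
    """
    needed = fft_volume_bytes(box_size, padding) * workspace_factor
    return needed <= free_gpu_bytes


GIB = 1024 ** 3
print(fft_volume_bytes(256) / GIB)             # small box: well under 1 GiB
print(fft_volume_bytes(1024, padding=2) / GIB)  # large padded box: tens of GiB
print(fits_on_gpu(1024, 16 * GIB, padding=2))   # compared against a 16 GiB card
```

Because the cost grows with the cube of the box edge, a box that is four times larger needs roughly 64 times the memory, which is why only the full-size transforms are the problematic ones.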
With regard to unifying memory:
Unifying the memory of multiple GPUs may benefit the final iterations of large-box refinements, but the gain has not merited the level of effort necessary to make this happen. We assess that using the new CPU acceleration and reverting to non-GPU execution whenever necessary is a more reasonable solution. Unfortunately, this still follows the "run on GPU, then crash, then continue on CPUs manually" model. I shouldn't call it a model though; we simply haven't had enough time to prioritize this.
I hope to be able to make this automatic by 3.1: RELION won't die when it runs out of memory, it will just run the accelerated CPU code instead. This might under-utilize the hardware though, and nobody likes a user that needlessly hogs the GPU resources. Unification of memory seems like a nice solution, but it's a massive amount of work that might end up slowing down the overall application, and it would mess with the current memory flow. It's also a big, blind bet on what NVIDIA does next; we cannot justify large re-designs unless we are convinced that it will continue to be supported and efficient. Hence our reservations against it.
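The automatic behaviour described above amounts to a try-then-fall-back pattern. A minimal Python sketch of that pattern follows. Everything in it is hypothetical: `GpuOutOfMemory`, `run_with_cpu_fallback`, and the two stand-in steps are invented for illustration and say nothing about how RELION implements or will implement this.

```python
class GpuOutOfMemory(Exception):
    """Hypothetical error raised when a GPU allocation fails."""


def run_with_cpu_fallback(gpu_step, cpu_step, data):
    """Try the GPU path first; on out-of-memory, rerun the step on the CPU.

    Returns (result, backend_name) so the caller can log which path ran.
    """
    try:
        return gpu_step(data), "gpu"
    except GpuOutOfMemory:
        return cpu_step(data), "cpu"


# Stand-in steps for demonstration only: the "GPU" holds at most 4 items.
def fake_gpu_step(data, capacity=4):
    if len(data) > capacity:
        raise GpuOutOfMemory(f"need {len(data)} slots, have {capacity}")
    return sum(data)


def fake_cpu_step(data):
    return sum(data)


print(run_with_cpu_fallback(fake_gpu_step, fake_cpu_step, [1, 2, 3]))
print(run_with_cpu_fallback(fake_gpu_step, fake_cpu_step, list(range(10))))
```

Returning the backend name alongside the result lets a scheduler or log notice when a job has dropped to the CPU while still holding its GPU allocation.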
Note on tensor cores:
They are great for neural nets. I haven't heard that they would be good specifically for FFTs, although it makes sense that they could do it. Source?
Hi
Our users would like us to try out Relion 3.
I am seeing the following statement:
Does this mean that Relion 3 only supports CPU but not GPU?
If so, does this mean we can see better performance on an AVX-512 CPU than on a V100? Can we see benchmarks?