soedinglab / CCMpred

Protein Residue-Residue Contacts from Correlated Mutations predicted quickly and accurately.
http://www.ncbi.nlm.nih.gov/pubmed/25064567
GNU Affero General Public License v3.0

Memory requirement formula correction #27

Open kWeissenow opened 4 years ago

kWeissenow commented 4 years ago

While reducing the alignment sizes of my current dataset in order to be able to compute couplings on the GPU, I noticed a large discrepancy between results from the formula in the README and the actual RAM needed when running CCMpred.

I know that CCMpred is no longer actively maintained, but in order to help fellow researchers running into the same issue, here is the corrected formula based on the calculation in the source code (ccmpred.c, lines 437-441):

Padded: `4 * (4 * (L*L*32*21 + L*20) + N*L*2 + N*L*32 + N) + 2*N*L`
Unpadded: `4 * (4 * (L*L*21*21 + L*20) + N*L*2 + N*L*21 + N) + 2*N*L`
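For convenience, the corrected formula can be evaluated with a small script (a sketch, not part of CCMpred itself; `L` is the number of alignment columns and `N` the number of sequences, with the padded variant corresponding to the GPU code path):

```python
def ccmpred_mem_bytes(L, N, padded=True):
    """Estimate CCMpred memory in bytes from the corrected formula above.

    L: number of alignment columns, N: number of sequences.
    On the GPU the 21-state dimension is padded to 32.
    """
    states = 32 if padded else 21
    return 4 * (4 * (L * L * states * 21 + L * 20)
                + N * L * 2 + N * L * states + N) + 2 * N * L

# Example: a 500-column alignment with 100,000 sequences
print(ccmpred_mem_bytes(500, 100_000))         # padded (GPU) estimate
print(ccmpred_mem_bytes(500, 100_000, False))  # unpadded estimate
```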

The internal size_t mem_needed is however only used for the output part, the actual allocation happens separately for a variety of different memory blocks. I'll do some further testing with samples calculated to barely fit into GPU memory to see if the CUDA allocations are equivalent.

kWeissenow commented 4 years ago

Apparently, the actual GPU memory needed is still larger than indicated, leading to a crash with CUDA error 2 (out of memory).

Found 1 CUDA devices, using device #0: Tesla V100-SXM2-16GB
Total GPU RAM:     16,914,055,168
Free GPU RAM:      16,475,422,720
Needed GPU RAM:    16,475,401,388
Reweighted 538462 sequences with threshold 0.8 to Beff=226100 weight mean=0.4199, min=8.95656e-05, max=1

Will optimize 20389525 32-bit variables

iter    eval    f(x)            ‖x‖             ‖g‖             step
CUDA error No. 2 in [...]/CCMpred/lib/libconjugrad/src/conjugrad_cuda.c at line 185

When further reducing alignment sizes so memory consumption stops being a problem, large MSAs still cause crashes with CUDA error 77 (illegal memory access) as shown in the example below:

Found 1 CUDA devices, using device #0: Tesla V100-SXM2-16GB
Total GPU RAM:     16,914,055,168
Free GPU RAM:      16,475,422,720
Needed GPU RAM:    12,562,797,518
Reweighted 307029 sequences with threshold 0.8 to Beff=153460 weight mean=0.499823, min=0.00118765, max=1

Will optimize 33843029 32-bit variables

iter    eval    f(x)            ‖x‖             ‖g‖             step
CUDA error No. 77 in [...]/CCMpred/src/evaluate_cuda_kernels.cu at line 590

Since this apparently has not been a common occurrence in the past, I assume the very large alignment is causing the issue. I'll try to investigate and will report back if I find the problem in the CUDA kernels.
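One plausible mechanism, purely my speculation and not confirmed against the kernel source: with alignments this large, offset products such as `N * L * 21` can exceed the range of a signed 32-bit int, so any kernel computing buffer offsets in `int` rather than `size_t` would wrap around and access out-of-bounds memory. A quick back-of-the-envelope check, using N = 307029 from the log above (the 21-state offset shape is an assumption):

```python
# Where would a signed 32-bit index overflow for an offset like N * L * 21?
INT32_MAX = 2**31 - 1
N = 307_029  # number of sequences, from the log above

# Largest L for which N * L * 21 still fits into a signed 32-bit int
max_L = INT32_MAX // (N * 21)
print(max_L)  # → 333; any longer alignment overflows this product
assert N * (max_L + 1) * 21 > INT32_MAX
```

If this is the cause, alignments only a few hundred columns long would already be enough to trigger the wraparound at this sequence count.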

jhschwartz commented 2 years ago

Hi, I wonder if this is related to #34? Just opened it and I'm curious if you found a solution.