willemneal / mcgpu

Automatically exported from code.google.com/p/mcgpu

MC-GPU Hanging Randomly After Multiple Sequential Runs #16

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
While running multiple scans sequentially using MC-GPU, the program will 
randomly hang when trying to initialize the voxels. 

I have left the program at this stage overnight with no change, but after I 
restart the computer the program runs right away.

This is happening on a Dell Precision T7500 with a Quadro 6000 for computing 
and a Quadro 600 for display, running Ubuntu 12.04 with CUDA v5.0 and MC-GPU 
v1.3. 

I checked the nvidia-smi output. Although the program appears to be running in 
the terminal and nvidia-smi lists the mc-gpu job, the Quadro 6000 never leaves 
the P12 idle state to start computing as it would on a normal run. I don't 
believe it is a memory leak or overheating, because the card shows almost all 
of its memory free and reasonable temperatures.

Has anyone encountered similar problems of the program hanging until a restart?

-Dave

Original issue reported on code.google.com by DAPDunke...@gmail.com on 10 Jul 2013 at 9:43
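
The check described above (performance state, memory, temperature) could be scripted roughly as follows. This is a minimal sketch, assuming an nvidia-smi build that supports the `--query-gpu` option with the fields listed; the function names are hypothetical and not part of MC-GPU.

```python
import subprocess

# Fields requested from nvidia-smi (assumes a driver whose nvidia-smi
# supports --query-gpu with these field names).
QUERY_FIELDS = ["index", "pstate", "memory.used", "memory.total", "temperature.gpu"]


def parse_gpu_status(csv_text):
    """Parse `--format=csv,noheader,nounits` output into one dict per GPU."""
    rows = []
    for line in csv_text.strip().splitlines():
        values = [v.strip() for v in line.split(",")]
        if len(values) != len(QUERY_FIELDS):
            continue  # skip lines that do not match the expected layout
        rows.append(dict(zip(QUERY_FIELDS, values)))
    return rows


def query_gpu_status():
    """Call nvidia-smi once and return the parsed status of every GPU."""
    out = subprocess.check_output(
        ["nvidia-smi",
         "--query-gpu=" + ",".join(QUERY_FIELDS),
         "--format=csv,noheader,nounits"],
        universal_newlines=True,
    )
    return parse_gpu_status(out)


def looks_idle(row):
    """True if the GPU reports the P12 idle state mentioned in this report."""
    return row["pstate"] == "P12"
```

Calling `query_gpu_status()` a short while after a run starts and testing the compute card's row with `looks_idle` would flag the situation described here, where the job is listed but the card stays in P12.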

GoogleCodeExporter commented 9 years ago
I have not found this problem but it may be related to the way you are "running 
multiple scans sequentially". 
How do you do this?

It could be that the code or the GPU driver does not deallocate the GPU memory 
correctly, maybe?
I have run several simulations one after the other using a shell script and 
this worked well.

Original comment by andre...@gmail.com on 10 Jul 2013 at 11:31

GoogleCodeExporter commented 9 years ago

Original comment by andre...@gmail.com on 18 Jul 2013 at 3:13

GoogleCodeExporter commented 9 years ago
I was running a Python script to run multiple scans and found that if I just 
clear the graphics card using nvidia-smi it runs smoothly. I think you may be 
right about the card not deallocating memory properly.

Original comment by DAPDunke...@gmail.com on 18 Jul 2013 at 5:59
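
For anyone scripting sequential runs in Python, the workaround discussed in this thread might look something like the sketch below. It is not the script used above. It assumes Python 3.5+, that a hang can be detected with a simple wall-clock timeout, and that "clearing the card" corresponds to `nvidia-smi --gpu-reset`, which the thread does not confirm. All function names are hypothetical.

```python
import subprocess


def run_with_timeout(cmd, timeout_s):
    """Run cmd; return its exit code, or None if it was killed after timeout_s."""
    try:
        return subprocess.run(cmd, timeout=timeout_s).returncode
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising TimeoutExpired.
        return None


def run_scans(commands, timeout_s, recover=None, retries=1):
    """Run each command in order; on a timeout, call recover() and retry."""
    results = []
    for cmd in commands:
        code = run_with_timeout(cmd, timeout_s)
        attempts = 0
        while code is None and attempts < retries:
            if recover is not None:
                recover()  # e.g. reset the GPU before trying again
            code = run_with_timeout(cmd, timeout_s)
            attempts += 1
        results.append(code)  # None means the scan still hung after retries
    return results


def reset_gpu(gpu_index=0):
    """Assumption: 'clearing the card' means a GPU reset via nvidia-smi.

    This typically needs root privileges and no other process using the GPU.
    """
    subprocess.call(["nvidia-smi", "--gpu-reset", "-i", str(gpu_index)])
```

A call such as `run_scans([["./MC-GPU_v1.3.x", "scan_01.in"], ["./MC-GPU_v1.3.x", "scan_02.in"]], timeout_s=3600, recover=reset_gpu)` would run two scans back to back; the input file names and the one-hour timeout are placeholders. A process stuck inside the driver may not respond to being killed, so the timeout is not guaranteed to recover every hang.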