Closed PavelFalkovskiy closed 7 years ago
Hi @PavelFalkovskiy ,
Could you monitor your GPU while you do this? (You can use GPU-Z for that.) It looks like it could be a memory issue, where you don't have enough VRAM for the data you're loading onto the GPU. It happened to me when using very large pyramids, large images, or multiple metrics (or all three at once...) with a desktop GTX980 (8GB of VRAM is quickly spent when you are used to 32GB of RAM...).
I did use ElastiX with OpenCL (and it worked well), but that was long ago and a lot has changed since then. I'll try to find my standard parameters as soon as I have time, and you can see whether anything there helps.
Just one thing that looks odd to me: (OpenCLMovingGenericImagePyramidUseOpenCL "true") seems redundant with (MovingImagePyramid "OpenCLMovingGenericImagePyramid"). I think I only used the latter for the moving and fixed image pyramids. But that may be the new way of doing it.
Hi Fabien,
Thanks for your response. Happy to hear that you had Elastix with OpenCL working in the past! I am quite curious what type of registration you ran and what order of speed-up you saw, if you remember.
Regarding my problem, last week I ran GPU-Z and the peak memory usage was around 500 MB. The images I am trying to register are cardiac CT images and they are not that big (256x256x56). I suspect the usage is small because the allocation failed. I am out of the office, so I cannot give you the exact figure, but I will send an update as soon as I am back next week.
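As a rough sanity check on the memory question, the raw size of one 256x256x56 volume can be worked out directly. The sketch below assumes 2-byte short voxels and, for comparison, 4-byte floats; both are assumptions, since the internal pixel type used in this setup is not stated. It ignores pyramid levels, interpolation coefficients, and any buffers elastix allocates internally, so it is only a lower bound.

```python
def volume_bytes(shape, bytes_per_voxel):
    """Raw size in bytes of a dense image with the given shape."""
    voxels = 1
    for dim in shape:
        voxels *= dim
    return voxels * bytes_per_voxel


shape = (256, 256, 56)  # image size quoted in this thread

# Assumed voxel sizes: 2 bytes (short) and 4 bytes (float).
short_mib = volume_bytes(shape, 2) / 2**20
float_mib = volume_bytes(shape, 4) / 2**20

print(f"short: {short_mib:.1f} MiB, float: {float_mib:.1f} MiB")
# prints "short: 7.0 MiB, float: 14.0 MiB"
```

Even with several pyramid levels for both images, raw image data at this size stays far below the VRAM of the card, which suggests looking at other allocations if memory really is the cause.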
I use a single metric AdvancedMattesMutualInformation.
Also, if I set (NumberOfResolutions 1) then the registration seems to complete. However, the computation time is longer than on the CPU. I have not profiled the code (I am not sure how to profile OpenCL code, or whether I can use the NVIDIA profiler).
If you can find your standard parameter files for either affine or bspline that would be perfect!
Let me know if you have any ideas on how to narrow this down, or what further testing I can do on my end.
Looking forward to your response.
Best regards,
Pavel
Hi @PavelFalkovskiy
I did not find an actual parameter file, but I was able to extract this from some code I used to generate these parameters.
You will probably find that it is similar to what you used, except that the (OpenCLxxxUseOpenCL "true") lines are missing, as I mentioned previously.
Maybe you can try to activate OpenCL only for the resampler. That way, if you see a crash at the end of the registration, it is probably an allocation problem. If it crashes earlier, it might come from something else. And if it doesn't crash, your pyramid may be too large. If you have enough time, you could also run the registration directly with command-line ElastiX, to see whether the problem is tied to SimpleElastix or Python in some way.
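The step-by-step approach described here can be sketched as a set of parameter variants, each enabling one more OpenCL component. This is only an illustration using plain dictionaries: the parameter names are the ones quoted in this thread, while the stage names and the `staged_parameter_maps` helper are hypothetical.

```python
import copy

# Component names as quoted in this thread.
OPENCL_RESAMPLER = {"Resampler": ["OpenCLResampler"]}
OPENCL_PYRAMIDS = {
    "FixedImagePyramid": ["OpenCLFixedGenericImagePyramid"],
    "MovingImagePyramid": ["OpenCLMovingGenericImagePyramid"],
}


def staged_parameter_maps(cpu_params):
    """Return copies of cpu_params with OpenCL enabled one stage at a time.

    The stage names are hypothetical labels for illustration.
    """
    stages = {
        "cpu_only": {},
        "resampler_only": OPENCL_RESAMPLER,
        "pyramids_only": OPENCL_PYRAMIDS,
        "resampler_and_pyramids": {**OPENCL_RESAMPLER, **OPENCL_PYRAMIDS},
    }
    variants = {}
    for name, overrides in stages.items():
        params = copy.deepcopy(cpu_params)  # leave the input untouched
        params.update(copy.deepcopy(overrides))
        variants[name] = params
    return variants


# A CPU baseline using the default components mentioned in this thread.
cpu_params = {
    "Resampler": ["DefaultResampler"],
    "FixedImagePyramid": ["FixedRecursiveImagePyramid"],
    "MovingImagePyramid": ["MovingRecursiveImagePyramid"],
    "NumberOfResolutions": ["4"],
}
variants = staged_parameter_maps(cpu_params)
```

Running the variants in order and noting the first one that fails should indicate which OpenCL component triggers the error.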
Also, you mentioned that the registration completes with only 1 resolution, but that it takes longer than on the CPU. Note that loading images into GPU memory and then copying them back into system memory takes time. It is only worth it if the computation that happens in between is faster on the GPU than on the CPU. Basically, you see the most benefit with complex BSpline transforms and large pyramids. With only one resolution you do not compute any pyramid, so you should not expect a time gain on that part, and everything relies on the resampling.
Tell me how it goes.
(AutomaticParameterEstimation "true")
(AutomaticScalesEstimation "true")
(AutomaticTransformInitialization "true")
(BSplineInterpolationOrder 1)
(DefaultPixelValue 0)
(FinalBSplineInterpolationOrder 3)
(FixedImagePyramid "OpenCLFixedGenericImagePyramid")
(FixedInternalImagePixelType "short")
(HowToCombineTransforms "Compose")
(ImageSampler "RandomCoordinate")
(Interpolator "BSplineInterpolator")
(MaximumNumberOfIterations 500)
(Metric "AdvancedMattesMutualInformation")
(MovingImagePyramid "OpenCLMovingGenericImagePyramid")
(MovingInternalImagePixelType "short")
(NewSamplesEveryIteration "true")
(NumberOfHistogramBins 8)
(NumberOfResolutions 4)
(NumberOfSamplesForExactGradient 4096)
(NumberOfSpatialSamples 2048)
(Optimizer "AdaptiveStochasticGradientDescent")
(Registration "MultiMetricMultiResolutionRegistration")
(ResampleInterpolator "FinalBSplineInterpolator")
(Resampler "OpenCLResampler")
(ResultImageFormat "nii")
(ResultImagePixelType "short")
(Transform "EulerTransform")
(UseDirectionCosines "True")
(WriteResultImage "False")
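To try settings like these with command-line elastix, as suggested above, they need to be in a text parameter file with one `(Name value)` entry per line. Below is a minimal sketch of a writer for that format; the helper names `format_value` and `to_parameter_text` are hypothetical, and the example dictionary contains only a few of the entries listed above.

```python
def format_value(value):
    """Quote strings; leave numbers bare, as in the listing above."""
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        return str(value)
    return '"{}"'.format(value)


def to_parameter_text(params):
    """Render a dict as parameter-file text, one (Name value ...) per line."""
    lines = []
    for key in sorted(params):
        values = params[key]
        if not isinstance(values, (list, tuple)):
            values = [values]  # allow a single value per key
        rendered = " ".join(format_value(v) for v in values)
        lines.append("({} {})".format(key, rendered))
    return "\n".join(lines) + "\n"


# A small subset of the parameters listed above.
params = {
    "Transform": "EulerTransform",
    "NumberOfResolutions": 4,
    "Resampler": "OpenCLResampler",
}
text = to_parameter_text(params)
print(text, end="")
# prints:
# (NumberOfResolutions 4)
# (Resampler "OpenCLResampler")
# (Transform "EulerTransform")
```

The resulting text can be saved to a file and passed to the elastix executable with its `-p` option, alongside `-f`, `-m` and `-out` for the fixed image, moving image and output directory.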
Hi,
I've been using Python and SimpleElastix and I have been really happy with it. However, when I try to run affine + bspline registration on an NVIDIA 980M GPU, I get a CL_OUT_OF_RESOURCES error.
I took the default parameter files for both affine and bspline and replaced
(Resampler "DefaultResampler")
with
(OpenCLResamplerUseOpenCL "true")
(Resampler "OpenCLResampler")
and
(FixedImagePyramid "FixedRecursiveImagePyramid")
(MovingImagePyramid "MovingRecursiveImagePyramid")
with
(OpenCLMovingGenericImagePyramidUseOpenCL "true")
(MovingImagePyramid "OpenCLMovingGenericImagePyramid")
(OpenCLFixedGenericImagePyramidUseOpenCL "true")
(FixedImagePyramid "OpenCLFixedGenericImagePyramid")
Am I doing something wrong? Has anyone succeeded in running OpenCL with SimpleElastix? Could someone perhaps share their parameter file? :)
Best,
Pavel