Closed bHimes closed 1 year ago
@jojoelfe I'm getting a "no space left on disk" error, which is killing the icpc container/CI. Any ideas?
I think GitHub recently started to be more strict about the disk space runners get. It's about 14GB, and I think the icpc image is already 6GB; because of the static compiling, the binaries are large, too.
I'll look into whether we can get more space somehow, but maybe we just have to trim the image.
You mentioned that projection during template matching is now done on the GPU. Can you maybe point to the change that enables this?
Ah, it isn't in match template yet; the functionality to do GPU projection, however, is now in place.
@jojoelfe a reference implementation can be seen in the functional test:
src/programs/samples/1_cpu_gpu_comparison/projection_comparison.cpp
This required new image methods and metadata to swap momentum (Fourier) space quadrants. This method
Image::SwapFourierSpaceQuadrants
also does an additional shift by one pixel so the x=-1 components are included. This is needed to efficiently use texture memory for gpu interpolation.
Normally we think of a real-space shift as a multiplication in Fourier space, but you can do the same thing the other way around: a complex multiplication in real space shifts the Fourier spectrum. This adds the complication that the input image is now complex, for which cisTEM has no out-of-the-box FFT routines; hence the tmp_real and tmp_imag images, which use the linearity of the FFT to make a complex FFT from two real FFTs.
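To make the trick above concrete, here is a minimal 1D sketch. It is not the cisTEM code: the function names are hypothetical, a naive O(N^2) DFT stands in for a real-input FFT routine, and the sign and direction of the shift are assumptions chosen for illustration. It multiplies a real signal by a complex phase ramp in real space, stores the result in tmp_real and tmp_imag, and assembles the complex transform from two real-input transforms via linearity.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using Complex = std::complex<double>;

// Naive O(N^2) forward DFT of a real signal; a stand-in for a real-input FFT.
std::vector<Complex> DftOfReal(const std::vector<double>& x) {
    const std::size_t n = x.size();
    const double two_pi = 2.0 * std::acos(-1.0);
    std::vector<Complex> out(n);
    for (std::size_t k = 0; k < n; ++k) {
        Complex sum(0.0, 0.0);
        for (std::size_t j = 0; j < n; ++j) {
            const double angle = -two_pi * static_cast<double>(k * j) / static_cast<double>(n);
            sum += x[j] * Complex(std::cos(angle), std::sin(angle));
        }
        out[k] = sum;
    }
    return out;
}

// Linearity: F{re + i*im} = F{re} + i*F{im}, so two real transforms suffice.
std::vector<Complex> DftOfComplexViaTwoReal(const std::vector<double>& tmp_real,
                                            const std::vector<double>& tmp_imag) {
    assert(tmp_real.size() == tmp_imag.size());
    const std::vector<Complex> re = DftOfReal(tmp_real);
    const std::vector<Complex> im = DftOfReal(tmp_imag);
    std::vector<Complex> out(re.size());
    for (std::size_t k = 0; k < out.size(); ++k) {
        out[k] = re[k] + Complex(0.0, 1.0) * im[k];
    }
    return out;
}

// Multiply x[j] by exp(+2*pi*i*shift*j/N) in real space. The resulting spectrum
// satisfies Z[k] = X[(k - shift) mod N], i.e. it is shifted by `shift` bins.
std::vector<Complex> SpectrumShiftedBy(const std::vector<double>& x, int shift) {
    const std::size_t n = x.size();
    const double two_pi = 2.0 * std::acos(-1.0);
    std::vector<double> tmp_real(n), tmp_imag(n);
    for (std::size_t j = 0; j < n; ++j) {
        const double phase = two_pi * shift * static_cast<double>(j) / static_cast<double>(n);
        tmp_real[j] = x[j] * std::cos(phase);
        tmp_imag[j] = x[j] * std::sin(phase);
    }
    return DftOfComplexViaTwoReal(tmp_real, tmp_imag);
}
```

With shift = 1, bin 0 of the result holds what used to be the k = -1 component, which is the 1D analogue of including the x=-1 components. The cost is that the modulated signal is no longer real, so the usual Hermitian symmetry of the spectrum is lost.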
The corresponding method in the GpuImage class is still ExtractSlice and shares most of the same syntax.
This also required a debug assert on Image::BackwardFFT, as there is no good way to "undo" this shift and the complex FFT workaround.
Ultimately, we need complex -> complex FFT routines, but it will be easier to use them through my FastFFT library (working on finishing up with Tim r/n) than to modify the Image class directly.
Description
This PR adds three primary functionalities to cisTEM:
1) Enable use of fp16
2) Enable isolation of GPU-enabled code, even in core library functions
3) Expand testing in the samples functional tests and add unit testing functionality via catch2
The PR is unfortunately large, touching many files, but I decided it was safer to pull in many changes at once (which have been tested for ~ a year in my downstream) than to try to split all of this up, which would almost certainly produce errors and would certainly take much longer.
fp16 particle stacks
cisTEM can now read and write MRC mode 12, which is the half-precision (16-bit) floating-point format.
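For readers unfamiliar with the 16-bit layout, the sketch below widens IEEE 754 half-precision bit patterns to float on the CPU. It is only an illustration of the number format, not cisTEM's MRC reader: the function names are hypothetical, and it assumes the 16-bit pixels have already been read from the file in host byte order.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Convert one IEEE 754 binary16 bit pattern (1 sign, 5 exponent, 10 mantissa bits)
// to a 32-bit float. Hypothetical helper for illustration.
float HalfBitsToFloat(std::uint16_t h) {
    const std::uint32_t sign = static_cast<std::uint32_t>(h & 0x8000u) << 16;
    const std::uint32_t exponent = (h >> 10) & 0x1Fu;
    std::uint32_t mantissa = h & 0x3FFu;
    std::uint32_t bits = 0;

    if (exponent == 0) {
        if (mantissa == 0) {
            bits = sign;  // signed zero
        } else {
            // Subnormal half: renormalize so the implicit leading 1 is restored.
            std::uint32_t shift = 0;
            while ((mantissa & 0x400u) == 0) {
                mantissa <<= 1;
                ++shift;
            }
            mantissa &= 0x3FFu;
            bits = sign | ((113u - shift) << 23) | (mantissa << 13);
        }
    } else if (exponent == 31) {
        bits = sign | 0x7F800000u | (mantissa << 13);  // infinity or NaN
    } else {
        // Normal number: rebias the exponent from 15 (half) to 127 (float).
        bits = sign | ((exponent + 112u) << 23) | (mantissa << 13);
    }

    float out;
    std::memcpy(&out, &bits, sizeof(out));
    return out;
}

// Widen a buffer of half-precision pixels to float.
std::vector<float> DecodeHalfPixels(const std::uint16_t* src, std::size_t n_pixels) {
    assert(src != nullptr || n_pixels == 0);
    std::vector<float> out(n_pixels);
    for (std::size_t i = 0; i < n_pixels; ++i) {
        out[i] = HalfBitsToFloat(src[i]);
    }
    return out;
}
```

The largest finite half value is 65504 and only about three decimal digits are kept, so this format suits normalized particle stacks better than data with a wide dynamic range.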
fp16 Image methods
cistem::Image can now allocate an fp16 buffer and has a method to calculate/apply a CTF in half-precision.
cistem::GPUImage has a pile of fp16 methods added (mostly templated).
Enable Isolation of gpu code
Previously, when building with ENABLEGPU, this define was used cisTEM-wide, which meant that any GPU-related code had to reside only in an individual program. Now, two new preprocessor defines are created.
WANT_CISTEM_GPU_AM - set -DENABLEGPU program by program inside Makefile.am and link in libgpucore.a
SHOW_CISTEM_GPU_OPTIONS - this is used to set GPU tick boxes etc in the gui, but does not reveal any gpu code itself.
This also means we can now compile both a CPU version and a GPU version at the same time, simplifying the design process. E.g., match_template and match_template_gpu are both created.
This also means we can extend core methods: for example, core/StopWatch.cpp has GPU-related extensions to synchronize the GPU in core/gpu/core_extensions/StopWatch.cu. Although not yet pulled in, this is also how I extended EulerSearch.cpp, which runs the brute-force search, to work on CPU or GPU: by templating the function on the input image type and specializing for GpuImage inside src/gpu/core_extensions/euler_search.cu
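A rough sketch of that pattern, under stated assumptions: the type and function names below are hypothetical stand-ins (not the cisTEM Image/GpuImage classes or the EulerSearch API), and everything is squeezed into one translation unit, whereas in the PR the GPU specialization lives in a separate .cu file. The generic template is the CPU path; the specialization only exists for programs compiled with -DENABLEGPU.

```cpp
#include <string>

// Hypothetical stand-in for the CPU image type.
struct CpuImageSketch {};

// Generic (CPU) implementation, templated on the input image type.
template <typename ImageType>
std::string RunSearchSketch(const ImageType& /*image*/) {
    return "cpu path";
}

#ifdef ENABLEGPU
// Only visible to programs built with -DENABLEGPU; in a real build this
// specialization would sit in a GPU-only source file.
struct GpuImageSketch {};

template <>
std::string RunSearchSketch<GpuImageSketch>(const GpuImageSketch& /*image*/) {
    return "gpu path";
}
#endif
```

A CPU-only program never sees the GPU type or its specialization, so the same core source can be linked into both a CPU and a GPU binary.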
Fixes # (issue)
I have rebased my feature branch to be current with the master branch to minimize conflicts and headaches
Which compilers were tested
These changes are isolated to the
How has the functionality been tested?
Please describe the tests that you ran to verify your changes. Please also note any relevant details for your test configuration.
Checklist: