Closed calgray closed 4 years ago
The build still appears to be broken because of the problem described here (ultimately an issue in gcc 5.5). On that job you are already using g++-6, so you could try using gcc-6 as well, both as the CMAKE_C_COMPILER and as the host compiler used by nvcc.
This changeset gets the CUDA leap code to produce results that match the single-threaded implementation to about 6 decimal places.
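As an illustration of what "identical to about 6 decimal places" can mean in a test, here is a minimal sketch of an element-wise comparison between a CPU result and a GPU result. The helper name `agrees_to_decimals`, the use of `std::complex<double>`, and the choice to scale the tolerance by the reference magnitude are assumptions made for this example, not code from this changeset.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of this changeset): returns true when every
// element of `gpu` matches the reference `cpu` to within 10^-decimals.
// The tolerance is absolute for small values and relative for large ones.
bool agrees_to_decimals(const std::vector<std::complex<double>>& cpu,
                        const std::vector<std::complex<double>>& gpu,
                        int decimals = 6)
{
    assert(decimals >= 0);
    if (cpu.size() != gpu.size()) {
        return false; // results of different shape can never agree
    }
    const double tol = std::pow(10.0, -decimals);
    for (std::size_t i = 0; i < cpu.size(); ++i) {
        // Scale by the reference magnitude, but never below 1.0, so that
        // values near zero are compared with an absolute tolerance.
        const double scale = std::max(1.0, std::abs(cpu[i]));
        if (std::abs(cpu[i] - gpu[i]) > tol * scale) {
            return false;
        }
    }
    return true;
}
```

A check of this kind would be run on the output of both implementations for the same input; exact equality is generally too strict because floating-point operations may be ordered differently on the GPU.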
One of the optimizations made here is to load the visibilities and uvws once into a single integration object so that one CUDA call can be made per direction. This may not be possible with too many baselines and channels, as the GPU would run out of memory. However, there is a task to report the compute and memory footprint for the beta release, after which the batching of baselines can be restored.
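To make the memory concern concrete, the following is a rough sketch of how the footprint of a single integration object could be estimated, and how many baselines would fit in a given memory budget if batching were restored. The `IntegrationShape` struct, both function names, and the assumed layout (one `std::complex<double>` per baseline, channel and polarization, plus three doubles of uvw per baseline) are assumptions for illustration only; the actual layout in the code base may differ.

```cpp
#include <algorithm>
#include <complex>
#include <cstddef>

// Hypothetical description of one integration (names are illustrative).
struct IntegrationShape {
    std::size_t baselines;
    std::size_t channels;
    std::size_t polarizations;
};

// Bytes needed per baseline under the assumed layout:
// visibilities for every channel and polarization, plus one (u, v, w) triple.
std::size_t bytes_per_baseline(const IntegrationShape& s)
{
    return s.channels * s.polarizations * sizeof(std::complex<double>)
         + 3 * sizeof(double);
}

// Total bytes for visibilities and uvws when everything is loaded at once.
std::size_t integration_bytes(const IntegrationShape& s)
{
    return s.baselines * bytes_per_baseline(s);
}

// Largest number of baselines that fits within `budget_bytes`,
// capped at the number of baselines actually present.
std::size_t max_baselines_per_batch(const IntegrationShape& s,
                                    std::size_t budget_bytes)
{
    return std::min(budget_bytes / bytes_per_baseline(s), s.baselines);
}
```

In practice the budget could be taken from the free device memory reported by the CUDA runtime call `cudaMemGetInfo`, minus headroom for intermediate buffers. If `max_baselines_per_batch` returns fewer baselines than the integration contains, the single-call approach would not fit and batching would be needed.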
This also fixes a bug in which only the first direction in the casa implementation was being calculated correctly.