Issue Summary
On the current master branch, running ./dedisperse-gpu ../input_files/BenMeerKAT.txt from within the build/ directory starts the program, which then aborts at the de-dispersion stage with:
De-dispersing...
CUDA error at host_main_function.cu:234 code=13(cudaErrorInvalidSymbol) "cudaGetLastError()"
Full console output here:
./dedisperse-gpu ../input_files/BenMeerKAT.txt
Using standard GPU code
range: 5
debug: 1
multi_file: 1
analysis: 1
output_dmt: 0
sigma_cutoff: 6.000000
power: 2.000000
User requested DM search range:
0.000000 370.000000 0.307000 1
370.000000 740.000000 0.652000 2
740.000000 1480.000000 1.266000 4
1480.000000 2950.000000 2.512000 8
2950.000000 5000.000000 4.000000 16
Got user input: 8.000000000000001e-05(s)
12 HEADER_START
11 source_name
37 P: 3000.000000000000 ms, DM: 1500.000
10 machine_id
12 telescope_id
9 data_type
4 fch1
4 foff
6 nchans
5 nbits
6 tstart
5 tsamp
4 nifs
10 HEADER_END
Using standard GPU code
tsamp: 0.000064
tstart: 50000.000000
fch1: 1564.000000
foff: -0.208984
nchans: 2048
nifs: 1
nbits: 8
nsamples: 0
nsamp: 937984
Got file header info: 0.248815(s)
Using standard GPU code
Maxshift efficiency: 100.00%
Host Input size: 3664 MB
Host Output size: 0 MB
Device Input size: 0 MB
Device Output size: 0 MB
Allocated memory: 0.248893(s)
Using standard GPU code
Got input filterbank data: 1.301598(s)
Detected 1 CUDA Capable device(s)
Device 0: "GeForce GTX TITAN X"
CUDA Driver Version / Runtime Version 8.0 / 8.0
CUDA Capability Major/Minor version number: 5.2
Total amount of global memory: 12207 MBytes (12799770624 bytes)
GPU Clock rate: 1076 MHz (1.08 GHz)
Memory Clock rate: 3505 Mhz
Memory Bus Width: 384-bit
L2 Cache Size: 3145728 bytes
Maximum Texture Dimension Size (x,y,z) 1D=(65536), 2D=(65536, 65536), 3D=(4096, 4096, 4096)
Maximum Layered 1D Texture Size, (num) layers 1D=(16384), 2048 layers
Maximum Layered 2D Texture Size, (num) layers 2D=(16384, 16384), 2048 layers
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 49152 bytes
Total number of registers available per block: 65536
Warp size: 32
Maximum number of threads per multiprocessor: 2048
Maximum number of threads per block: 1024
Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
Max dimension size of a grid size (x,y,z): (2147483647, 65535, 65535)
Maximum memory pitch: 2147483647 bytes
Texture alignment: 512 bytes
Concurrent copy and kernel execution: Yes with 2 copy engine(s)
Run time limit on kernels: Yes
Integrated GPU sharing Host Memory: No
Support host page-locked memory mapping: Yes
Alignment requirement for Surfaces: Yes
Device has ECC support: Disabled
CUDA Device Driver Mode (TCC or WDDM): WDDM (Windows Display Driver Model)
Device supports Unified Addressing (UVA): Yes
Device PCI Bus ID / PCI location ID: 1 / 0
Using standard GPU code
Initialised GPU: 1.423872(s)
Maximum number of dm trials in any of the range steps: 1240
Range: 4, MAXSHIFT: 118496, Scrunch value: 16
Maximum dispersive delay: 7.58 (s)
Diagonal DM: 90.793449
In 4
Maxshift memory needed: 462 MB
Output memory needed: 925 MB
Using standard GPU code
maximum DM: 5119.000000
maxshift: 118496
max_ndms: 1240
Actual DM range that will be searched:
0.000000 380.680023 0.307000 1240
380.680023 771.880005 0.652000 600
771.880005 1531.479980 1.266000 600
1531.479980 3038.680176 2.512000 600
3038.680176 5118.680176 4.000000 520
Calculated strategy: 1.423929(s)
Using standard GPU code
Maxshift efficiency: 87.37%
Host Input size: 3664 MB
Host Output size: 5603 MB
Device Input size: 0 MB
Device Output size: 0 MB
Allocated memory: 1.428153(s)
774368
Using standard GPU code
Maxshift efficiency: 87.37%
Host Input size: 3664 MB
Host Output size: 5603 MB
Device Input size: 3024 MB
Device Output size: 6049 MB
Allocated memory: 1.433887(s)
----------------------- MSD info ---------------------------
Memory required by boxcar filters:17063.320 MB
Memory available:2765.812 MB
Max samples: :105967552
DMs_per_cycle: 160
Size MSD: 1024 Size workarea: 781, int: 32
------------------------------------------------------------
De-dispersing...
CUDA error at host_main_function.cu:234 code=13(cudaErrorInvalidSymbol) "cudaGetLastError()"
Steps to Reproduce
Clone the astro-accelerate repository and compile as usual.
From within the build/ directory, run dedisperse-gpu with one of the input files, for example ../input_files/BenMeerKAT.txt.
Expected Outcome
The program should exit gracefully, either producing output or explaining why it cannot continue.
Actual Outcome
The program aborts with the cudaErrorInvalidSymbol error shown above.
Configuration
Default configuration with no changes, running on the astraios machine (GeForce GTX TITAN X, compute capability 5.2) with CUDA 8.0.
Notes
N/A.