Closed karlrupp closed 5 years ago
Also, the CPU path is not Valgrind-clean:
$> valgrind ./mdtwObj -t CLASSIFICATION -i CPU 3 1 -f data/classification/rm_1/X_MAT data/classification/rm_1/Y_MAT data/classification/rm_1/Z_MAT -k 10 0 -o 1000 152 -m 0 DTW -v 0
==28320== Memcheck, a memory error detector
==28320== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==28320== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==28320== Command: ./mdtwObj -t CLASSIFICATION -i CPU 3 1 -f data/classification/rm_1/X_MAT data/classification/rm_1/Y_MAT data/classification/rm_1/Z_MAT -k 10 0 -o 1000 152 -m 0 DTW -v 0
==28320==
The number of iteration is greater than testSize! Verbose mode will be suppressed for this run
Reading data...
Dataset size: [1000,152,3]
Classification w/ DEPENDENT-DTW using CPU
==28320== Invalid read of size 4
==28320== at 0x409BA6: accumarray (module.cu:1253)
==28320== by 0x409EA9: crossvalind_Kfold (module.cu:1343)
==28320== by 0x404348: main (MD_DTW.cu:364)
==28320== Address 0x60ccf90 is 0 bytes after a block of size 4,000 alloc'd
==28320== at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==28320== by 0x409E14: crossvalind_Kfold (module.cu:1332)
==28320== by 0x404348: main (MD_DTW.cu:364)
==28320==
...
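The trace above reports a 4-byte read located 0 bytes after a 4,000-byte block. That is what a read of element 1000 of a 1000-element array of 4-byte values would look like, and 1000 matches the dataset size. This pattern usually comes from an inclusive loop bound or a 1-based index. The following is a minimal C++ sketch of that bug class under this assumption. It is not the code from module.cu: `count_per_group` and its parameters are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an accumarray-style helper (NOT the code from
// module.cu): counts how many entries of `labels` fall into each of
// `n_groups` groups. Labels are assumed to be 0-based.
std::vector<int> count_per_group(const int* labels, std::size_t n, int n_groups) {
    std::vector<int> counts(static_cast<std::size_t>(n_groups), 0);
    // A buggy variant would loop with `i <= n` (or index 1..n), which reads
    // labels[n]: exactly 0 bytes past the end of an n * sizeof(int) block.
    for (std::size_t i = 0; i < n; ++i) {
        const int g = labels[i];
        assert(g >= 0 && g < n_groups);  // reject out-of-range group labels
        ++counts[static_cast<std::size_t>(g)];
    }
    return counts;
}
```

Calling this on a 1000-element label array with 10 evenly distributed groups should give 100 entries per group, and rerunning under valgrind should show no invalid reads if the loop bound is the only problem.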
cuda-memcheck ./mdtwObj -t CLASSIFICATION -i GPU 3 512 1 -f data/classification/rm_1/X_MAT data/classification/rm_1/Y_MAT data/classification/rm_1/Z_MAT -k 10 0 -o 1000 152 -m 0 DTW -d 0 -v 0
When I ran the software I got a different error, caused by an invalid first argument to the cudaMemset function.
========= Program hit cudaErrorInvalidValue (error 11) due to "invalid argument" on CUDA API call to cudaMemset.
========= Saved host backtrace up to driver entry point at error
========= Host Frame:/usr/lib/x86_64-linux-gnu/libcuda.so.1 [0x330453]
========= Host Frame:./mdtwObj [0x3ed3c]
========= Host Frame:./mdtwObj [0x4470]
========= Host Frame:/lib/x86_64-linux-gnu/libc.so.6 (__libc_start_main + 0xf5) [0x21f45]
========= Host Frame:./mdtwObj [0x6f25]
I fixed it by replacing the call
cudaMemset(&d_test, 0, n_feat * window_size * sizeof(float))
with the following one:
cudaMemset(d_test, 0, n_feat * window_size * sizeof(float))
By the way, I suppose the out-of-bounds thread errors you get when running cuda-memcheck
are due to the limit on the number of threads per block (< 1024) on your GPU. If so, I will provide a fix asap.
Anyway, could you please tell me the maximum number of threads per block
of your GPU?
When running the examples with cuda-memcheck, I encounter various errors. For example,
cuda-memcheck ./mdtwObj -t CLASSIFICATION -i GPU 3 512 1 -f data/classification/rm_1/X_MAT data/classification/rm_1/Y_MAT data/classification/rm_1/Z_MAT -k 10 0 -o 1000 152 -m 0 DTW -d 0 -v 0
results in many errors of the form
Those need to be investigated on smaller samples. Compile with
nvcc -g -G ...
for GPU stack traces.
Part of review at: https://github.com/openjournals/joss-reviews/issues/1049