antonmks / Alenka

GPU database engine

loading from file crash at thrust reduce_by_key #7

Closed georgezhlw closed 11 years ago

georgezhlw commented 11 years ago

Hi Anton, Alenka crashed when loading. Do you see what should be done to solve this? This is on Mac OS X.

Test case:

    Alenka]$ cat testtable.tbl
    1|a
    2|b
    3|c
    Alenka]$ cat load_testtable.sql
    T := LOAD 'testtable.tbl' USING ('|') AS (id{1}:int, value{2}:varchar(20));
    STORE T INTO 'testtable' BINARY;
    Alenka]$ ./alenka -l 1 load_testtable.sql
    Process count = 1
    LOAD: T testtable.tbl 2 |
    STORE: T testtable
    LOADING testtable.tbl |
    libc++abi.dylib: terminate called throwing an exception
    Abort trap: 6

Analysis:

    (gdb) s
    pfor_delta_compress (source=0x500240000, source_len=8, file_name=0x7fff5fbfe9b8 "testtable.1.0", host=@0x102d22c00, tp=false, sz=0) at compress.cu:287
    389 thrust::device_ptr fin_seq = thrust::device_malloc(cnt);
    (gdb)
    391 thrust::reduce_by_key(add_seq, add_seq+recCount,s_copy1,thrust::make_discard_iterator(),
    (gdb)
    libc++abi.dylib: terminate called throwing an exception

    Program received signal SIGABRT, Aborted.
    0x00007fff956b8212 in __pthread_kill ()
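The abort message in this trace only says that an exception escaped; it does not say which one. A minimal sketch, assuming the failing call throws something derived from `std::exception` (to my knowledge Thrust's `thrust::system_error` derives from `std::runtime_error`, but treat that as an assumption), is to wrap the call and report `what()`. The names `run_and_report`, `failing_step`, and `passing_step` are hypothetical stand-ins for the real call:

```cpp
#include <exception>
#include <stdexcept>
#include <string>

// Runs `work` and turns any escaping exception into a readable message
// instead of letting the runtime call terminate().
template <typename Work>
std::string run_and_report(Work work) {
    try {
        work();
        return "ok";
    } catch (const std::exception& e) {
        return std::string("exception: ") + e.what();
    } catch (...) {
        return "unknown exception";
    }
}

// Hypothetical stand-in for a library call that fails.
void failing_step() {
    throw std::runtime_error("simulated failure");
}

// Hypothetical stand-in for a library call that succeeds.
void passing_step() {}
```

Wrapping the `thrust::reduce_by_key` call the same way and printing the returned string should show the underlying CUDA error text rather than a bare abort.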

thanks, George

georgezhlw commented 11 years ago

BTW, I got some errors and warnings when compiling; not sure if they are related to the crash:

    Alenka]$ make
    /usr/local/cuda/bin/nvcc -O3 -arch=sm_20 -m64 -I/usr/local/boost_1_53_0 -lcudpp -c merge.cu
    merge.cu(364): error: space required between adjacent ">" delimiters of nested template argument lists (">>" is the right shift operator)

    /usr/local/cuda/bin/../include/thrust/detail/allocator/allocator_traits.h(124): error: name followed by "::" must be a class or namespace name
    detected during:
    instantiation of class "thrust::detail::allocator_traits [with Alloc=]"
    /usr/local/cuda/bin/../include/thrust/detail/contiguous_storage.h(39): here
    instantiation of class "thrust::detail::contiguous_storage<T, Alloc> [with T=float_type, Alloc=]"
    /usr/local/cuda/bin/../include/thrust/detail/vector_base.h(46): here
    instantiation of class "thrust::detail::vector_base<T, Alloc> [with T=float_type, Alloc=]"
    /usr/local/cuda/bin/../include/thrust/host_vector.h(53): here
    instantiation of class "thrust::host_vector<T, Alloc> [with T=float_type, Alloc=]"

To get past the errors I modified the source files, changing each non-shift ">>" to "> >" in merge.cu and cm.cu (the affected lines are near the "push_back(" calls).
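For context, the ">>" error comes from the C++03 parsing rule: two adjacent closing angle brackets are read as the right-shift operator, so nested template argument lists need a space between them. C++11 and later accept both spellings. A small sketch of the spelling that compiles under either standard (the names `Grid`, `make_grid`, and `grid_at` are made up for illustration and do not come from Alenka):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Note the space in "> >": a C++03 parser reads ">>" as right shift.
typedef std::vector<std::vector<int> > Grid;

// Builds a rows x cols grid with every cell set to `fill`.
Grid make_grid(std::size_t rows, std::size_t cols, int fill) {
    return Grid(rows, std::vector<int>(cols, fill));
}

// Returns one cell; the assert guards against out-of-range indices.
int grid_at(const Grid& grid, std::size_t row, std::size_t col) {
    assert(row < grid.size() && col < grid[row].size());
    return grid[row][col];
}
```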

make:

    /usr/local/cuda/bin/nvcc -O3 -arch=sm_20 -m64 -I/usr/local/boost_1_53_0 -lcudpp -c merge.cu
    /usr/local/cuda/bin/../include/thrust/detail/allocator/allocator_traits.inl(136): warning: calling a __host__ function from a __host__ __device__ function is not allowed
    detected during:
    instantiation of "thrust::detail::enable_if<thrust::detail::allocator_traits_detail::has_member_destroy<Alloc, T>::value, void>::type thrust::detail::allocator_traits_detail::destroy(Alloc &, T ) [with Alloc=uninitialized_host_allocator, T=int_type]"
    (265): here
    instantiation of "void thrust::detail::allocator_traits::destroy(thrust::detail::allocator_traits::allocator_type &, T ) [with Alloc=uninitialized_host_allocator, T=int_type]"
    /usr/local/cuda/bin/../include/thrust/detail/allocator/destroy_range.inl(103): here
    instantiation of "void thrust::detail::allocator_traits_detail::destroy_via_allocator::operator()(T &) [with Allocator=uninitialized_host_allocator, T=int_type]"
    /usr/local/cuda/bin/../include/thrust/detail/function.h(47): here

antonmks commented 11 years ago

Hi George,

The warnings are OK; they are just complaining about uninitialized host allocators. I ran your script and it works fine. I assume you have compiled other CUDA programs on your machine before and they ran OK? See if you can compile and run some example Thrust programs. Also, if you use varchars, use lengths from 1 to 10; most of the other lengths are not yet defined in the strings.cu file (you can add them there if you want to).

georgezhlw commented 11 years ago

Anton, CUDA and basic Thrust programs work. Since Alenka fails at reduce_by_key, I tested thrust::reduce_by_key on its own. It fails with the same error, so this should be a Thrust 1.7 problem on my machine.

    thrust]$ cat reducebykey.cu
    #include <thrust/reduce.h>
    #include <thrust/unique.h>
    #include <thrust/iterator/discard_iterator.h>
    #include <thrust/host_vector.h>
    #include <thrust/device_vector.h>

    int main(void)
    {
        thrust::device_vector<int> keys;
        thrust::device_vector<int> values;
        typename thrust::pair<typename thrust::device_vector<int>::iterator,
                              typename thrust::device_vector<int>::iterator> new_last;

        // Keys with runs of consecutive duplicates.
        keys.resize(9);
        keys[0] = 11; keys[1] = 11; keys[2] = 21;
        keys[3] = 20; keys[4] = 21; keys[5] = 21;
        keys[6] = 21; keys[7] = 37; keys[8] = 37;

        values.resize(9);
        values[0] = 0; values[1] = 1; values[2] = 2;
        values[3] = 3; values[4] = 4; values[5] = 5;
        values[6] = 6; values[7] = 7; values[8] = 8;

        thrust::device_vector<int> output_keys(keys.size());
        thrust::device_vector<int> output_values(values.size());

        // Sum the values of each run of equal keys; this is the call that aborts.
        new_last = thrust::reduce_by_key(keys.begin(), keys.end(), values.begin(),
                                         output_keys.begin(), output_values.begin());
        return 0;
    }
    thrust]$ nvcc -arch=sm_20 -m64 reducebykey.cu -o reducebykey
    thrust]$ ./reducebykey
    libc++abi.dylib: terminate called throwing an exception
    Abort trap: 6

regards, George
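For reference, here is what a working build of the test program should compute. This is a CPU sketch of the default `reduce_by_key` behaviour (equality on keys, addition on values, applied to runs of consecutive equal keys), written in plain C++ so it runs without a GPU. It is not Thrust's implementation, and the function name `reduce_by_key_reference` is made up:

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// CPU reference: sums the values of each run of consecutive equal keys.
// Returns (one key per run, one sum per run).
std::pair<std::vector<int>, std::vector<int> >
reduce_by_key_reference(const std::vector<int>& keys,
                        const std::vector<int>& values) {
    assert(keys.size() == values.size());
    std::vector<int> out_keys;
    std::vector<int> out_values;
    for (std::size_t i = 0; i < keys.size(); ++i) {
        if (out_keys.empty() || out_keys.back() != keys[i]) {
            // A new run starts here.
            out_keys.push_back(keys[i]);
            out_values.push_back(values[i]);
        } else {
            // Same key as the current run: accumulate.
            out_values.back() += values[i];
        }
    }
    return std::make_pair(out_keys, out_values);
}
```

For the keys 11, 11, 21, 20, 21, 21, 21, 37, 37 and values 0 through 8 used in the test program, this yields keys 11, 21, 20, 21, 37 and sums 1, 2, 3, 15, 15. Equal keys that are not adjacent (21 appears in two separate runs) are not merged.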

antonmks commented 11 years ago

Which version of CUDA do you have on your machine? Please run nvcc --version.

georgezhlw commented 11 years ago

    $ nvcc --version
    nvcc: NVIDIA (R) Cuda compiler driver
    Copyright (c) 2005-2012 NVIDIA Corporation
    Built on Fri_Sep_28_16:10:16_PDT_2012
    Cuda compilation tools, release 5.0, V0.2.1221

antonmks commented 11 years ago

OK, that should work. Which GPU do you use?

georgezhlw commented 11 years ago

GPU is GeForce GT 650M; global mem: 536543232B; compute v3.0; clock: 774500 kHz

I found that the failure occurs because I use -arch=sm_20 (or sm_30) together with -m64. reduce_by_key succeeds if I use only one of the two options.

When compiling Alenka, if I don't use -arch, it fails at:

    /usr/local/cuda/bin/nvcc -O3 -m64 -I/usr/local/boost_1_53_0 -c strings.cu
    ptxas /tmp/tmpxft_0000079f_00000000-5_strings.ptx, line 0; fatal : (C9999) max reg limit too low

If I don't use -m64, it compiles as 32-bit, and by default cudpp is compiled as 64-bit on my system. I really don't want to compile the whole thing as 32-bit. So I am stuck :(
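When mixing objects built with and without -m64, it can help to confirm which pointer width a given build actually has. A minimal host-side sketch (the helper name `pointer_bits` is made up; on typical platforms it reports 64 for a 64-bit build and 32 for a 32-bit one):

```cpp
#include <climits>
#include <cstddef>

// Number of bits in a pointer for this build.
inline std::size_t pointer_bits() {
    return sizeof(void*) * CHAR_BIT;
}
```

Printing this value from the program that links against cudpp is a quick sanity check that both sides were built for the same word size.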

antonmks commented 11 years ago

You should definitely compile the 64-bit version. I haven't seen this error (reg limit) before, but it might be related to the limited hardware resources of your GPU. Since it is strings.cu that fails to compile, I can suggest the following: comment out of strings.cu all processing of strings longer than, say, 10 characters and see if that compiles.