Build indices with large-scale datasets

JinwoongKim / Massively-Parallel-Query-Processing-on-Heterogeneous-Architecture

Closed JinwoongKim closed 8 years ago

JinwoongKim commented 8 years ago

Currently, MPHR-tree and Hybrid-tree can build an index with a large-scale dataset.

For both MPHR-tree and Hybrid-tree, we have to dump the index to the host.

JinwoongKim commented 8 years ago

What I meant was: after building a small portion of the MPHR-tree or Hybrid-tree, we need to dump that intermediate index to the host.
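
A minimal sketch of that portion-by-portion idea, in plain C++ rather than CUDA. Everything here is an assumption for illustration: `Node`, `build_in_portions`, and the 1-D interval layout are hypothetical names, and `device_buf` is only a stand-in for a fixed-size GPU allocation (real code would do a device-to-host copy where this sketch appends to `host_index`).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical index entry; stands in for an MPHR-tree or Hybrid-tree node.
struct Node {
    float lo, hi;  // 1-D bounding interval, for brevity
};

// Builds leaf nodes one portion at a time. At most `device_capacity_nodes`
// nodes exist in the staging buffer at once; each finished portion is
// appended to the host-side index before the buffer is reused.
std::vector<Node> build_in_portions(const std::vector<float>& sorted_points,
                                    std::size_t points_per_node,
                                    std::size_t device_capacity_nodes) {
    assert(points_per_node > 0 && device_capacity_nodes > 0);
    std::vector<Node> host_index;
    std::vector<Node> device_buf;  // stand-in for a GPU buffer (assumption)
    device_buf.reserve(device_capacity_nodes);

    const std::size_t points_per_portion = points_per_node * device_capacity_nodes;
    for (std::size_t start = 0; start < sorted_points.size(); start += points_per_portion) {
        const std::size_t stop = std::min(start + points_per_portion, sorted_points.size());
        device_buf.clear();
        // Build the intermediate nodes for this portion only.
        for (std::size_t b = start; b < stop; b += points_per_node) {
            const std::size_t e = std::min(b + points_per_node, stop);
            device_buf.push_back(Node{sorted_points[b], sorted_points[e - 1]});
        }
        // "Dump" the intermediate index to the host.
        host_index.insert(host_index.end(), device_buf.begin(), device_buf.end());
    }
    return host_index;
}
```

With 10 sorted points, 2 points per node, and room for 2 nodes in the staging buffer, the loop runs three portions and the host index ends up with 5 leaf nodes.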

JinwoongKim commented 8 years ago

For now, if the Thrust library fails to allocate the branches on the GPU, the branches are sorted on the CPU.

Sorting should be performed on the GPU for better performance, since CPU sorting is 10 times slower than GPU sorting.

We need to think about how to sort big data on the GPU.
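
One possible direction, sketched in plain C++ and not taken from the repository: split the keys into chunks small enough to fit in device memory, sort each chunk independently, then merge the sorted runs on the host. In this sketch `std::sort` is only a stand-in for the device sort of one chunk, and `sort_in_chunks` and `chunk_size` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: sorts `keys` so that only `chunk_size` elements
// need to be resident on the device at any one time.
void sort_in_chunks(std::vector<std::uint64_t>& keys, std::size_t chunk_size) {
    assert(chunk_size > 0);
    const std::size_t n = keys.size();

    // Phase 1: sort each chunk on its own. std::sort stands in for copying
    // the chunk to the GPU, sorting it there, and copying it back (assumption).
    for (std::size_t begin = 0; begin < n; begin += chunk_size) {
        const std::size_t end = std::min(begin + chunk_size, n);
        std::sort(keys.begin() + begin, keys.begin() + end);
    }

    // Phase 2: merge neighbouring sorted runs on the host, doubling the
    // run length on each pass until one sorted run remains.
    for (std::size_t run = chunk_size; run < n; run *= 2) {
        for (std::size_t begin = 0; begin + run < n; begin += 2 * run) {
            const std::size_t mid = begin + run;
            const std::size_t end = std::min(begin + 2 * run, n);
            std::inplace_merge(keys.begin() + begin, keys.begin() + mid,
                               keys.begin() + end);
        }
    }
}
```

Note that the merge phase still runs on the CPU, so this only bounds how much data the device sort needs at once; whether it beats a pure CPU sort would have to be measured.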

JinwoongKim commented 8 years ago

Interestingly, multi-threaded bottom-up construction with a 32-million-element dataset takes 0.3 sec on the CPU, while the GPU version takes 0.9 sec.
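
For readers unfamiliar with the technique, here is a rough sketch of multi-threaded bottom-up construction in plain C++. It is not the repository's code: `Rect`, `build_parent_level`, `build_bottom_up`, and the 2-D layout are hypothetical, and the leaves are assumed to be already sorted. Each parent's bounding box is the union of `fanout` consecutive children, and the parents of one level are split across threads.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical 2-D bounding box; the real node layout may differ.
struct Rect {
    float min_x, min_y, max_x, max_y;
};

// Builds one parent level: parent p covers children [p*fanout, (p+1)*fanout).
std::vector<Rect> build_parent_level(const std::vector<Rect>& children,
                                     std::size_t fanout, unsigned num_threads) {
    assert(fanout > 1 && num_threads > 0);
    const std::size_t num_parents = (children.size() + fanout - 1) / fanout;
    std::vector<Rect> parents(num_parents);

    // Each worker fills a disjoint, contiguous slice of `parents`,
    // so no locking is needed.
    auto worker = [&](std::size_t first, std::size_t last) {
        for (std::size_t p = first; p < last; ++p) {
            const std::size_t begin = p * fanout;
            const std::size_t end = std::min(begin + fanout, children.size());
            Rect box = children[begin];
            for (std::size_t c = begin + 1; c < end; ++c) {
                box.min_x = std::min(box.min_x, children[c].min_x);
                box.min_y = std::min(box.min_y, children[c].min_y);
                box.max_x = std::max(box.max_x, children[c].max_x);
                box.max_y = std::max(box.max_y, children[c].max_y);
            }
            parents[p] = box;
        }
    };

    const std::size_t per_thread = (num_parents + num_threads - 1) / num_threads;
    std::vector<std::thread> pool;
    for (std::size_t first = 0; first < num_parents; first += per_thread) {
        pool.emplace_back(worker, first, std::min(first + per_thread, num_parents));
    }
    for (auto& t : pool) t.join();
    return parents;
}

// Builds every level bottom-up: levels.front() holds the leaves,
// levels.back() holds the root (when there is at least one leaf).
std::vector<std::vector<Rect>> build_bottom_up(std::vector<Rect> leaves,
                                               std::size_t fanout,
                                               unsigned num_threads) {
    std::vector<std::vector<Rect>> levels;
    levels.push_back(std::move(leaves));
    while (levels.back().size() > 1) {
        levels.push_back(build_parent_level(levels.back(), fanout, num_threads));
    }
    return levels;
}
```

For example, 10 leaves with a fanout of 4 give levels of 10, 3, and 1 nodes. Because every level is a simple pass over contiguous memory, this kind of build is largely bound by memory bandwidth, which may be part of why the CPU version holds up well against the GPU once transfer costs are counted.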

JinwoongKim commented 8 years ago

Now, I've implemented a CPU version of index building for large-scale datasets.

We might replace some of the code with a GPU version for performance reasons.

JinwoongKim commented 8 years ago

Now our code can build a large index on the CPU.

It seems to be working fine and shows pretty good performance.

We no longer need to worry about its performance.