antonmks / Alenka

GPU database engine

hi #5

Closed sam1988 closed 11 years ago

sam1988 commented 11 years ago

Hi Anton, I'm a freshman studying GPUs, and I've recently been looking into your project. It impressed me a lot, but it is hard to understand because there is no documentation apart from the manual file. If you have more detailed documents, please send them to me. Thank you.

Also, what IDE did you use when coding this project?

Does this project support large-scale data, like 1 GB or 10 GB?

my email: 870945154@qq.com

antonmks commented 11 years ago

Hi! It is a research project, so it doesn't have much documentation. I suggest reading papers on columnar databases and database compression algorithms. Most of the logic is in the bison.y and cm.cu files. I test most of the queries at scale factor 100 (100 GB of data).

Regards,

Anton


sam1988 commented 11 years ago

100 GB? Wow! It seems that it stores the original data in a binary file, compresses it (using PFOR and dictionary encoding), and then runs the filter or other operations on the GPU by transferring the compressed data to the GPU. Is the compressed data small enough to allocate on the GPU (since GPU memory is limited)? Or am I thinking about this wrong?

sorry to bother you
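
For readers following the thread, here is a minimal Python sketch of the two compression ideas named above. It is only an illustration and not Alenka's actual code: the function names are hypothetical, and it shows plain frame-of-reference (FOR) encoding without the exception "patching" that PFOR adds for outliers. Inputs are assumed to be non-empty.

```python
from typing import Dict, List, Tuple


def dictionary_encode(values: List[str]) -> Tuple[List[str], List[int]]:
    """Replace each distinct value with a small integer code (hypothetical helper)."""
    dictionary = sorted(set(values))
    code_of: Dict[str, int] = {v: i for i, v in enumerate(dictionary)}
    return dictionary, [code_of[v] for v in values]


def for_encode(values: List[int]) -> Tuple[int, int, List[int]]:
    """Frame-of-reference: store the minimum once, then small offsets from it.

    Returns (base, bits_per_offset, offsets). A real PFOR scheme would also
    store outliers separately so that they do not inflate bits_per_offset.
    """
    base = min(values)
    offsets = [v - base for v in values]
    bits = max(1, max(offsets).bit_length())
    return base, bits, offsets


def for_decode(base: int, offsets: List[int]) -> List[int]:
    """Invert for_encode by adding the base back to every offset."""
    return [base + o for o in offsets]


dictionary, codes = dictionary_encode(["RAIL", "AIR", "RAIL", "SHIP"])
base, bits, offsets = for_encode([1000, 1003, 1001, 1007])
```

With dictionary codes, an equality filter on a string column can be rewritten as an integer comparison on the codes, which is one reason this kind of encoding suits GPU processing.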

antonmks commented 11 years ago

Alenka stores data in segments small enough to fit in GPU memory. Data are processed segment by segment, and the totals are calculated at the end.
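
A rough Python sketch of that segment-by-segment pattern is below. It is an assumption-laden illustration rather than Alenka's implementation: the names `segments`, `process_segment`, and `filtered_sum` are hypothetical, and `process_segment` runs on the CPU as a stand-in for the work a GPU kernel would do on one segment.

```python
from typing import Iterator, List, Tuple


def segments(column: List[int], segment_rows: int) -> Iterator[List[int]]:
    """Yield consecutive slices of at most segment_rows rows."""
    for start in range(0, len(column), segment_rows):
        yield column[start:start + segment_rows]


def process_segment(segment: List[int], threshold: int) -> Tuple[int, int]:
    """Stand-in for per-segment GPU work: filter, then partial sum and count."""
    kept = [v for v in segment if v > threshold]
    return sum(kept), len(kept)


def filtered_sum(column: List[int], segment_rows: int, threshold: int) -> Tuple[int, int]:
    """Process one segment at a time and combine the partial totals at the end."""
    total, count = 0, 0
    for seg in segments(column, segment_rows):
        partial_sum, partial_count = process_segment(seg, threshold)
        total += partial_sum
        count += partial_count
    return total, count


column = list(range(1, 11))  # the values 1..10
result = filtered_sum(column, segment_rows=4, threshold=5)
```

Because sums and counts combine associatively, the result does not depend on the segment size, so the segment size can be chosen purely to fit the available GPU memory.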

sam1988 commented 11 years ago

I see. Have you ever compared its query performance with other columnar databases like MonetDB or C-Store, or with other mature databases?

antonmks commented 11 years ago

Yes, I compared Alenka against a few other databases using TPC-H tests: http://www.tpc.org/tpch/results/tpch_perf_results.asp?resulttype=noncluster

For example, Query 1 on a 100 GB dataset runs in 11 seconds on a GTX 580. You can compare it with other results using the link.