antonmks / Alenka

GPU database engine

Out of memory error #90

Closed. KenjiTakahashi closed this issue 8 years ago.

KenjiTakahashi commented 8 years ago

I have generated some test data with dbgen -s 100 (so ~100GB), but when I try to load it into Alenka, I get:

$ ./Alenka load_customer.sql
Couldn't open data dictionary
LOAD: C customer.tbl 8  |
Append 0
STORE: C customer
set a piece to 1000140800 3000295424
Cuda error in file 'cm.cu' in line 2289 : out of memory.

I'm using a GTX 970 with 4GB of memory. I've also checked with nvidia-smi that it is indeed running out of memory.

So the question is: is there any convenient way of loading data this large into Alenka, or should I split it into pieces and then APPEND them one by one?
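
(For reference, one way to pre-split the generated table file into line-aligned pieces is coreutils split. This is only a sketch; the ~1GB piece size and the customer_part_ prefix are arbitrary choices, and loading or appending each resulting piece would still go through Alenka's own load script, which is not shown here.)

# split customer.tbl into numbered pieces of at most ~1000 MB each,
# without breaking '|'-delimited rows across pieces
$ split -C 1000MB -d customer.tbl customer_part_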

KenjiTakahashi commented 8 years ago

OK, I've found out about the -l option and it seems to be what I want. A different question, then: I've noticed that the smaller the -l argument is, the more files get created. Doesn't this affect query performance? Should I shoot for the largest possible value, the smallest possible, or does it not matter in practice?

antonmks commented 8 years ago

Yes, it does affect query performance, but not by much, so you should shoot for the largest possible piece size. 1GB chunks are the default, which usually works for 6GB cards; -l 500 should be fine with your card.
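
(A minimal example of what that invocation might look like, assuming -l takes the piece size in MB, as the 1GB default and the "set a piece to 1000140800" line above suggest, and that the flag goes before the script name:)

$ ./Alenka -l 500 load_customer.sql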

KenjiTakahashi commented 8 years ago

Thanks for the information. After some trial and error, I figured out I can get away with 600 on a 4GB card (700 is already too much). We'll see how that fares when I have some time to get back to it for further tests.