antonmks / Alenka

GPU database engine

Alenka unable to find a table #102

Closed marklit closed 7 years ago

marklit commented 7 years ago

I've compiled the development branch of Alenka (commit 0a097f1) on Ubuntu 16.04 64-bit with CUDA 8 and I'm running it with an Nvidia GTX 1080 and the 367.48 driver. I'm unable to query a table I've created with Alenka. Below are the steps to re-create the issue. Any guidance on where I'm going wrong would be greatly appreciated.

I created a new folder and have three files in it, load.sql, data.tbl and query.sql.

$ cd ~/nation

This is the SQL load file:

$ cat load.sql
A  :=  LOAD 'data.tbl' USING ('|') AS (n_nationkey{1}:int, n_name{2}:varchar(25), n_regionkey{3}:int);
STORE A INTO 'country' BINARY;

This is the data it'll be loading. I added pipes to the end of each line because if I didn't, Alenka would get stuck in a loop.

$ cat data.tbl
0|ALGERIA|0|
1|ARGENTINA|1|
2|BRAZIL|1|
3|CANADA|1|
4|EGYPT|4|
5|ETHIOPIA|0|
6|FRANCE|3|
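
The trailing-pipe workaround described above could be scripted rather than done by hand. The following is a minimal sketch, not part of Alenka; the function names `ensure_trailing_delimiter` and `fix_file` are hypothetical and the only behaviour assumed is what this report states, namely that each row needs a closing `|`.

```python
def ensure_trailing_delimiter(line, delim="|"):
    """Return the row without its newline, ending in exactly the delimiter.

    Rows that already end with the delimiter are left alone, and blank
    lines come back as an empty string so the caller can skip them.
    """
    stripped = line.rstrip("\r\n")
    if stripped and not stripped.endswith(delim):
        stripped += delim
    return stripped


def fix_file(src_path, dst_path, delim="|"):
    """Copy src_path to dst_path, appending the delimiter where missing."""
    with open(src_path, encoding="utf-8") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            fixed = ensure_trailing_delimiter(line, delim)
            if fixed:  # drop blank lines rather than emit a bare delimiter
                dst.write(fixed + "\n")
```

For example, `fix_file("data_raw.tbl", "data.tbl")` would produce a file in the shape shown above, where `data_raw.tbl` is a hypothetical name for the unpadded input.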

I then ran the load command. I did get two warnings but no errors.

$ ~/Alenka/src/alenka load.sql 
GeForce GTX 1080 : 1835.000 Mhz   (Ordinal 0)
20 SMs enabled. Compute Capability sm_61
FreeMem:   6943MB   TotalMem:   8110MB   64-bit pointers.
Mem Clock: 5005.000 Mhz x 256 bits   (320.3 GB/s)
ECC Disabled

- 2016-10-16 13:31:52.568 INFO: Executing File..
- 2016-10-16 13:31:52.568 WARNING: Couldn't open data dictionary
- 2016-10-16 13:31:52.568 WARNING: Error, no valid column names have been found 
- 2016-10-16 13:31:53.006 INFO: Execute Complete!

I could then see a number of new files in the folder. I ran hexdump over them and, although I'm not familiar with your file formats, they did appear to contain data.

$ ls -alh
total 56K
drwxrwxr-x  2 mark mark 4,0K okt   16 13:32 .
drwxr-xr-x 32 mark mark 4,0K okt   16 12:39 ..
-rw-rw-r--  1 mark mark  107 okt   16 13:31 alenka.dictonary
-rw-rw-r--  1 mark mark  175 okt   16 13:31 country.n_name
-rw-rw-r--  1 mark mark   60 okt   16 13:31 country.n_name.0.hash
-rw-rw-r--  1 mark mark   63 okt   16 13:31 country.n_name.0.idx
-rw-rw-r--  1 mark mark   20 okt   16 13:31 country.n_name.header
-rw-rw-r--  1 mark mark   63 okt   16 13:31 country.n_nationkey.0
-rw-rw-r--  1 mark mark   20 okt   16 13:31 country.n_nationkey.header
-rw-rw-r--  1 mark mark   63 okt   16 13:31 country.n_regionkey.0
-rw-rw-r--  1 mark mark   20 okt   16 13:31 country.n_regionkey.header
-rw-rw-r--  1 mark mark   89 okt   16 13:31 data.tbl
-rw-rw-r--  1 mark mark  134 okt   16 13:31 load.sql
-rw-rw-r--  1 mark mark   74 okt   16 13:32 query.sql
$ hexdump -C country.n_name
00000000  41 4c 47 45 52 49 41 00  00 00 00 00 00 00 00 00  |ALGERIA.........|
00000010  00 00 00 00 00 00 00 00  00 41 52 47 45 4e 54 49  |.........ARGENTI|
00000020  4e 41 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |NA..............|
00000030  00 00 42 52 41 5a 49 4c  00 00 00 00 00 00 00 00  |..BRAZIL........|
00000040  00 00 00 00 00 00 00 00  00 00 00 43 41 4e 41 44  |...........CANAD|
00000050  41 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |A...............|
00000060  00 00 00 00 45 47 59 50  54 00 00 00 00 00 00 00  |....EGYPT.......|
00000070  00 00 00 00 00 00 00 00  00 00 00 00 00 45 54 48  |.............ETH|
00000080  49 4f 50 49 41 00 00 00  00 00 00 00 00 00 00 00  |IOPIA...........|
00000090  00 00 00 00 00 00 46 52  41 4e 43 45 00 00 00 00  |......FRANCE....|
000000a0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00     |...............|
000000af
$ hexdump -C country.n_name.header
00000000  07 00 00 00 00 00 00 00  01 00 00 00 07 00 00 00  |................|
*
00000014
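
The hexdump is consistent with `country.n_name` holding seven NUL-padded records of 25 bytes each (175 bytes in total, matching `varchar(25)` in load.sql). The sketch below decodes the file on that assumption. The layout is inferred from the dump, not from Alenka's source, and the meaning of the individual header fields is unknown, so `decode_header` only exposes them as raw little-endian 32-bit integers.

```python
import struct

RECORD_WIDTH = 25  # assumption: one fixed-width slot per varchar(25) value


def decode_fixed_width(raw, width=RECORD_WIDTH):
    """Split raw bytes into fixed-width records and strip NUL padding."""
    if len(raw) % width != 0:
        raise ValueError("file size is not a multiple of the record width")
    return [
        raw[i:i + width].rstrip(b"\x00").decode("ascii")
        for i in range(0, len(raw), width)
    ]


def decode_header(raw):
    """Unpack a 20-byte header as five little-endian uint32 values.

    The field meanings are not documented in this report; this split is
    only a guess that happens to fit the 20-byte length.
    """
    return struct.unpack("<5I", raw)
```

Running `decode_fixed_width(open("country.n_name", "rb").read())` against the file above should yield the seven country names if the assumed layout is right.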

I then ran the following query.

$ cat query.sql
A1 := SELECT  count(n_name) AS col1 FROM country;
DISPLAY A1 USING ('|');
$ ~/Alenka/src/alenka query.sql

I end up getting a (Fatal) Couldn't find(1) country error message:

GeForce GTX 1080 : 1835.000 Mhz   (Ordinal 0)
20 SMs enabled. Compute Capability sm_61
FreeMem:   6943MB   TotalMem:   8110MB   64-bit pointers.
Mem Clock: 5005.000 Mhz x 256 bits   (320.3 GB/s)
ECC Disabled

- 2016-10-16 13:33:11.608 INFO: Executing File..
- 2016-10-16 13:33:11.609 ERROR: Couldn't find1 country
(Fatal) Couldn't find(1) country

Here are the last few lines of that command run via strace. It opens country.n_name.header without trouble but cannot find country.sort or country.presort. It then re-reads query.sql and throws the error.

$ strace ~/Alenka/src/alenka query.sql 
...
open("query.sql", O_RDONLY)             = 23
fstat(23, {st_mode=S_IFREG|0664, st_size=74, ...}) = 0
read(23, "A1 := SELECT  count(n_name) AS c"..., 8192) = 74
read(23, "", 4096)                      = 0
read(23, "", 8192)                      = 0
clock_gettime(CLOCK_PROCESS_CPUTIME_ID, {0, 154970327}) = 0
clock_gettime(CLOCK_PROCESS_CPUTIME_ID, {0, 154976672}) = 0
open("country.n_name.header", O_RDONLY) = 24
fstat(24, {st_mode=S_IFREG|0664, st_size=20, ...}) = 0
read(24, "\7\0\0\0\0\0\0\0\1\0\0\0\7\0\0\0\7\0\0\0", 4096) = 20
close(24)                               = 0
open("country.sort", O_RDONLY)          = -1 ENOENT (No such file or directory)
open("country.presort", O_RDONLY)       = -1 ENOENT (No such file or directory)
open("query.sql", O_RDONLY)             = 24
fstat(24, {st_mode=S_IFREG|0664, st_size=74, ...}) = 0
read(24, "A1 := SELECT  count(n_name) AS c"..., 8192) = 74
read(24, "", 4096)                      = 0
write(2, "- 2016-10-16 13:33:41.187 ERROR:"..., 56- 2016-10-16 13:33:41.187 ERROR: Couldn't find1 country
) = 56
write(1, "(Fatal) Couldn't find(1) country"..., 33(Fatal) Couldn't find(1) country
) = 33
exit_group(1)                           = ?
+++ exited with 1 +++
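
To pull out just the files the process failed to find, the strace output can be filtered for `open` calls that returned `ENOENT`. This is a rough sketch assuming the line format shown in the trace above; `missing_files` is a hypothetical helper, and the pattern also tolerates the `openat(AT_FDCWD, ...)` form that newer systems tend to emit.

```python
import re

# Matches e.g.: open("country.sort", O_RDONLY) = -1 ENOENT (No such file ...)
FAILED_OPEN = re.compile(
    r'^open(?:at)?\((?:[A-Z_]+,\s*)?"([^"]+)".*=\s*-1\s+ENOENT'
)


def missing_files(strace_lines):
    """Return paths whose open()/openat() call failed with ENOENT, in order."""
    found = []
    for line in strace_lines:
        match = FAILED_OPEN.match(line.strip())
        if match:
            found.append(match.group(1))
    return found
```

Applied to the trace above, only the `country.sort` and `country.presort` lines match; the successful opens of `query.sql` and `country.n_name.header` are ignored.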

Again, any help in fixing this issue would be greatly appreciated.

Kind Regards, Mark

marklit commented 7 years ago

I re-compiled using the master branch and my queries are running now.