antonmks / Alenka

GPU database engine

JDBC driver #45

Closed buggtb closed 10 years ago

buggtb commented 10 years ago

Hi,

We have written an open source analytics tool called Saiku, and we like demoing its capabilities over different or more interesting data sources. Do you have a JDBC driver for Alenka?

Thanks

Tom

antonmks commented 10 years ago

Hi, no, unfortunately I do not have a JDBC or ODBC driver. Right now Alenka outputs the results to a text file.

Randolph42 commented 10 years ago

I have a plan to add the Drizzle front end, including JDBC. Are you still interested?

buggtb commented 10 years ago

Very much so. The idea I had (but have lacked the time for so far) was to plumb in Optiq (https://github.com/julianhyde/optiq) somehow, but if you have a better idea then that would be great :)

Tom


Randolph42 commented 10 years ago

I think I have a better solution using some existing code from my DeepCloud database, but be aware that this will not yet apply a query planner or a new dialect. So for the meanwhile, are you comfortable with something like:

res := select * from t;
show res;

Note: this will require a 'show' method for displaying the results in the Alenka dialect, and possibly a way of deleting tables in GPU RAM.
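For what it's worth, once a driver along these lines existed, the pass-through could look roughly like this from Java. The class name, the driver registration and the jdbc:alenka URL are purely hypothetical illustrations, not anything that exists in Alenka today.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class AlenkaPassThroughExample {
    public static void main(String[] args) throws Exception {
        // Hypothetical URL; no such driver is registered anywhere yet.
        try (Connection conn = DriverManager.getConnection("jdbc:alenka://localhost");
             Statement stmt = conn.createStatement()) {
            // The driver would forward the Alenka dialect unchanged:
            stmt.execute("res := select * from t;");
            try (ResultSet rs = stmt.executeQuery("show res;")) {
                while (rs.next()) {
                    System.out.println(rs.getString(1));
                }
            }
        }
    }
}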


Randolph

pacificobuzz commented 10 years ago

Randolph,

Did you ever come up with something for a JDBC driver?

Randolph42 commented 10 years ago

We have a SQL and JDBC Alenka solution in development. It will probably be released within a few months. I have not prioritised the JDBC because I thought its use without SQL would be quite limited. JDBC can be released sooner if:


pacificobuzz commented 10 years ago

Interesting. Are you going to run the alenka executable in your driver, kind of like SQLite JDBC (process builder)? Intrigued to see how a JDBC driver would work with a DB like Alenka.
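As a reference point, the SQLite-style "process builder" approach mentioned here could be sketched roughly as follows; the class name, paths and output handling are assumptions for illustration, not anything Alenka or an existing driver provides.

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class AlenkaProcessRunner {
    // Spawn the alenka binary on a query file and capture its text output.
    public static List<String> run(Path alenkaBinary, Path queryFile) throws Exception {
        Process p = new ProcessBuilder(alenkaBinary.toString(), queryFile.toString())
                .redirectErrorStream(true)   // fold stderr into stdout
                .start();
        List<String> lines = new ArrayList<>();
        try (BufferedReader out = new BufferedReader(
                new InputStreamReader(p.getInputStream()))) {
            String line;
            while ((line = out.readLine()) != null) {
                lines.add(line);             // a driver would parse these rows
            }
        }
        p.waitFor();
        return lines;
    }
}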

Randolph42 commented 10 years ago

It's going to be more MySQL/Oracle-like: very similar drivers to MySQL, with a more Oracle-like SQL syntax.


pacificobuzz commented 10 years ago

That sounds great! Would be fun to play around with that.

Are you going to modify the Alenka source at all to help accommodate this? Since there isn't really an API, I wonder how difficult the driver development would be. I understand how you would handle the queries; I guess I'm a bit hazy on the connection/driver part of the driver. Any thoughts?

Randolph42 commented 10 years ago

Yes, I have already made a couple of mods and there are a couple more on the way. The driver is done (just not fully integrated yet), but I don't see the utility of it without SQL.


pinkdevelops commented 10 years ago

This is a good discussion. Randolph, will you be making your driver open source? I saw the library you committed with the AlenkaExecute function in the Alenka header file. We are working on a JNI implementation using that.

Randolph42 commented 10 years ago

For the pure Alenka dialect, JNI is probably a better choice than JDBC. There will be more news about the project in a while.


pinkdevelops commented 10 years ago

Have you used the library with JNI/JNA successfully? Trying to use a PointerByReference for "execute_file" seems to be throwing some odd errors from Java. However, AlenkaClose() works fine. Have you built a shared library with bison.cu/main.cu?

Randolph42 commented 10 years ago

AlenkaClose() is never used with execute_file()

ExecuteFile() is used as shown in main.cu


pinkdevelops commented 10 years ago

Thanks, Randolph. Does ExecuteFile accept just a char array? Should it be something like:

ExecuteFile("-v", "q1.sql")?

Or should I be passing in an array with data like:

strArray[1] = "-v";     // it looks like [1] of the array is for -v, -l, etc.
strArray[2] = "q1.sql";

Thanks. I am working with JNA, which is being a little difficult.

Randolph42 commented 10 years ago

Yes, but have a look in alenka.h at the 1st method (alenkaInit is called with the 'av' parameter, as in main.cu). Alternatively, you can start Alenka with the '-I' flag to make it work interactively from stdin.
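For anyone wiring this up, a minimal JNA mapping of the entry points mentioned in this thread might look like the sketch below. It assumes Alenka has been built as a shared library (libalenka.so); the exact casing and signatures of the functions are guesses based on this discussion and main.cu, not a documented API.

import com.sun.jna.Library;
import com.sun.jna.Native;

public interface AlenkaLibrary extends Library {
    // JNA 5+ style loading of libalenka.so / alenka.dll
    AlenkaLibrary INSTANCE = Native.load("alenka", AlenkaLibrary.class);

    void alenkaInit(String[] av);            // JNA marshals String[] to char**
    void execute_file(int ac, String[] av);  // assumed (argc, argv) style, as in main.cu
    void alenkaClose();
}

Whether av[0] must carry a program name (as in a normal argv) or starts directly with flags such as -v is not clear from the thread, so check main.cu before relying on either layout.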


pinkdevelops commented 10 years ago

Thanks for that, Randolph. Out of curiosity, have you built a shared library with this? If so, what files are you including in your shared lib?

Thanks

Randolph42 commented 10 years ago

No, not at this stage.

But JDBC is getting closer.

pinkdevelops commented 10 years ago

Randolph,

With the library, execute_file does not seem to be receiving the String values from main. Have you seen that? Or maybe I have something off on my end. Anyway, from my output: I added some print statements in main.cu that start with "MAIN"; you can see I pass in -v and q1.sql, then I get into execute_file, and the array is empty?


MAIN: av[0] = -v
MAIN: av[1] = /home/devin/gpu/Alenka-master/q1.sql
inside execute file
av[0] =
av[1] =

Coudn't open data dictionary

So that confuses me, because I still receive this error: Filter : couldn't find variable lineitem

So it definitely sees that q1 uses the table lineitem. The data is loaded properly, and I can run ./alenka q1.sql and that works fine.

Any thoughts?

Thanks

Randolph42 commented 10 years ago

If you receive an error this way, it's probably your data that is the problem. This theory is supported by the warning about not finding the data dictionary. Have you loaded the data? Verify this with the standard Alenka batches.

pinkdevelops commented 10 years ago

Thanks Randolph

pinkdevelops commented 10 years ago

Randolph,

Have you run the alenkaexecute function with just a String representation of a query? All the other library calls work perfectly; however, alenkaexecute seems to be tricky to get to work when submitting an Alenka query.

Randolph42 commented 10 years ago

Yes, I will submit a patch for this shortly.


Randolph42 commented 10 years ago

Try it now
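Assuming the patch exposes an entry point that takes a single query string, calling it from JNA might look something like the sketch below; the name alenkaExecute, its signature, and the argv passed to alenkaInit are all unconfirmed guesses, not anything stated in this thread.

public final class AlenkaQueryExample {
    public interface Lib extends com.sun.jna.Library {
        Lib INSTANCE = com.sun.jna.Native.load("alenka", Lib.class);
        void alenkaInit(String[] av);       // argv layout as in the earlier sketch
        void alenkaExecute(String query);   // assumed: takes one query string
        void alenkaClose();
    }

    public static void main(String[] args) {
        Lib.INSTANCE.alenkaInit(new String[] { "alenka" });  // placeholder argv
        Lib.INSTANCE.alenkaExecute("res := select * from t; show res;");
        Lib.INSTANCE.alenkaClose();
    }
}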


antonmks commented 10 years ago

OK, it looks like we got a JDBC driver, thanks to the guys from Technica Corporation: https://github.com/Technica-Corporation/Alenka-JDBC

Randolph42 commented 9 years ago

Our Alenka-based SQL system, whirlwindDB, has been released. It has JDBC and uses a standard Oracle (subset) dialect. The beta version is GPL and available at http://deepcloud.co

AlexeyAB commented 9 years ago

Hi all! Good news :)

Randolph42, as I can see, you use MPI via OpenFabrics interconnects.

  1. Do you use one-sided communication (RDMA) from MPI-2, i.e. MPI_Win_create(), which can be tens of times faster than MPI_Send()/MPI_Recv()?
  2. That is, did you test your system with one-sided operations via uDAPL from OpenFabrics?

P.S. This is very interesting, because we are now developing a software/hardware platform with CPU+GPU+FPGA over PCI Express, where many of these units (CPU/GPU/FPGA) are connected by PLX PCIe switches. PCIe is integrated directly into the die and communicates with the cores through the last-level cache (for example, the CPU's L3), which gives the maximum bandwidth and the minimum latency (0.5-1 usec) of any option, over long distances via PCIe optical cable. In other words, it is the fastest and shortest communication path. Our software stack is IP over PCIe and uDAPL over PCIe. But firstly, GPUDirect 3.0 (RDMA: GPU0-->CPU0-->CPU1-->GPU1) works via MPI only for small packets (as the lead of the NVIDIA GPUDirect team told me). And secondly, uDAPL/MPI-2 allow one-sided operations only through a window descriptor and special functions; they do not allow raw pointers, which we could use to remap from/to GPU UVA for GPUDirect 1/2/3. So we are thinking about developing a new proprietary protocol similar to SISCI, using raw pointers that work through PCIe BARs (BAR1 on the GPU for GPUDirect 3.0) to send data along the path GPU0-->PCIeSwitch-->CPU0-->PCIeSwitch-->CPU1-->PCIeSwitch-->GPU1.

Randolph42 commented 9 years ago

No, I didn't know it was so fast. But anyway, it's only used with our MPP system, which does NOT currently use GPUs. When I implement MPP with whirlwindDB, I will consider the one-sided comms.

Your stuff sounds like a great replacement for InfiniBand. I'd like to know more about what you're doing, but we are getting a bit off-line here. Email me if you like. I will put up an email address in a separate message for a few hours and then remove the message. Randolph

Randolph Pullen, Architect & Founder, DeepCloud
E: randolph@deepcloud.com.au | P: +61 42089 5221
www.DeepCloud.co

AlexeyAB commented 9 years ago

To be more precise, it isn't a replacement for InfiniBand; rather, it effectively removes InfiniBand/Ethernet and any NICs. Because much of our work is under an NDA, I can mail only some of the slides.