Mu-Sigma / RImpala

RImpala is an R package that helps you to connect and execute distributed queries using Cloudera Impala

Why does RImpala.java use "hive" connectors? #11

Open lucacerone opened 8 years ago

lucacerone commented 8 years ago

Hi, I am sure this is not a proper bug because everything works fine.

I have a few questions about the code, though. In RImpala.java you set JDBC_DRIVER_NAME = "org.apache.hive.jdbc.HiveDriver"; and you build the connection URL as: CONNECTION_URL = "jdbc:hive2://" + IP + ':' + port + "/;" + principal;

I wonder why you use the Hive driver and the hive2 protocol.

Reading the documentation, I had the impression that one should use the Impala driver, like: com.cloudera.impala.jdbc3.Driver

and build the string using jdbc:impala://etc_etc
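
To make the comparison concrete, here is a minimal sketch of the two ways of building the URL. It is not the actual RImpala.java code: the class name JdbcUrlSketch and the methods hive2Url and impalaUrl are hypothetical, the host, port, and "auth=noSasl" values are placeholders, and the exact shape of the Impala URL is my assumption from the documentation. Only the driver names and the hive2 concatenation are taken from what is quoted above.

```java
public class JdbcUrlSketch {
    // Driver class quoted above as the one RImpala.java currently uses.
    static final String HIVE_DRIVER = "org.apache.hive.jdbc.HiveDriver";
    // Driver class I expected after reading the documentation.
    static final String IMPALA_DRIVER = "com.cloudera.impala.jdbc3.Driver";

    // Same concatenation as the CONNECTION_URL quoted above.
    static String hive2Url(String ip, String port, String principal) {
        return "jdbc:hive2://" + ip + ':' + port + "/;" + principal;
    }

    // Assumption: the Impala driver accepts a URL of the form
    // jdbc:impala://host:port (anything after that is left out here).
    static String impalaUrl(String ip, String port) {
        return "jdbc:impala://" + ip + ':' + port;
    }

    public static void main(String[] args) {
        // Placeholder values, for illustration only.
        System.out.println(HIVE_DRIVER + " -> " + hive2Url("127.0.0.1", "21050", "auth=noSasl"));
        System.out.println(IMPALA_DRIVER + " -> " + impalaUrl("127.0.0.1", "21050"));
    }
}
```

The sketch only builds strings, so it runs without either driver jar on the classpath.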

Am I missing something here?

Also I have a small suggestion: why don't you put the relevant .jar files in a folder "jars" in the inst directory (http://r-pkgs.had.co.nz/inst.html) and have rimpala.init() default to file.path(system.file(package = "RImpala"),"jars")?

This way the user does not have to worry about installing the jars, and can still change them should she want to use a different location!

Anyway, thanks a lot for the great package!

Cheers, Luca