IBMDataScience / DSx-Desktop

IBM Data Science Experience Desktop was built for those who want to download and play locally. Analyze, learn, and build with the tools you love, right on your desktop.

can i connect to hive from IBM DSX desktop? #26

Closed som6321 closed 6 years ago

som6321 commented 7 years ago

I installed a trial version of IBM DSX Desktop. My requirement is to connect to Hive from DSX Desktop. Is that possible?

prashant182 commented 7 years ago

Sorry for the late response.

Yes, it is certainly possible. Let me walk you through the process. DSX Desktop supports Hive connections through Spark and JDBC.

Make sure Hive is installed on your remote machine or Docker container, with HiveServer2 running. You need the Hive JDBC connection string in order to connect to your Hive server.

For example, it could look like this: jdbc:hive2://localhost:10000/default. In this connection string, replace localhost:10000 with your HiveServer2 host and port, and default with your database name. The connection needs the org.apache.hive.jdbc.HiveDriver dependency, which DSX Desktop already supports.
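To make the substitution explicit, here is a minimal sketch that assembles the connection string from its parts. The helper name hive_jdbc_url and its parameters are only illustrative and are not part of DSX Desktop or Hive.

```python
def hive_jdbc_url(host, port=10000, database="default"):
    """Build a HiveServer2 JDBC URL of the form jdbc:hive2://host:port/database.

    hive_jdbc_url is a hypothetical helper for illustration only.
    """
    return "jdbc:hive2://{}:{}/{}".format(host, port, database)


# Same values as the example connection string above
remote_hive = hive_jdbc_url("localhost", 10000, "default")
print(remote_hive)
```

Replace the arguments with your own HiveServer2 host, port, and database name.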

For example, PySpark code for the Hive connection could look like this:

from pyspark.sql import SparkSession
from pyspark import SparkContext

# Reuse the running SparkContext (or create one) and build a session on top of it
sc = SparkContext.getOrCreate()
spark = SparkSession.builder.config(conf=sc.getConf()).getOrCreate()

# JDBC connection details for HiveServer2; replace with your own values
remote_hive = "jdbc:hive2://localhost:10000/default"
driver = "org.apache.hive.jdbc.HiveDriver"
user = "hiveUname"
password = "hivePass"

# Load the Hive table into a Spark DataFrame over JDBC
df = (spark.read.format("jdbc")
      .options(url=remote_hive,
               driver=driver,
               user=user,
               password=password,
               dbtable="table")
      .load())

You can substitute table, user, and password according to your configuration. Please let us know if you need any more help with this.
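If you would rather not hardcode the credentials in the notebook, one option is to read them from environment variables and pass the options as a dictionary. This is only a sketch: the helper hive_jdbc_options and the variable names HIVE_USER and HIVE_PASSWORD are assumptions for illustration, not something DSX Desktop defines.

```python
import os


def hive_jdbc_options(url, table, env=None):
    """Collect the JDBC reader options, taking credentials from the environment.

    hive_jdbc_options, HIVE_USER and HIVE_PASSWORD are hypothetical names.
    """
    env = os.environ if env is None else env
    return {
        "url": url,
        "driver": "org.apache.hive.jdbc.HiveDriver",
        "user": env["HIVE_USER"],          # raises KeyError if not set
        "password": env["HIVE_PASSWORD"],  # raises KeyError if not set
        "dbtable": table,
    }


# With a SparkSession named spark, the options can be passed as keyword arguments:
# opts = hive_jdbc_options("jdbc:hive2://localhost:10000/default", "table")
# df = spark.read.format("jdbc").options(**opts).load()
```

A missing variable fails with a KeyError before any connection attempt, which makes a configuration mistake easier to spot.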