Copyright headers are added at the top of each new file.
Overview of the changes made by the patch:
3.1 The interface used to connect to Hive has been changed from CLI to JDBC.
The functions that interact with the Hive database are written in a Java class
and are called from the FDW code through JNI.
Advantages of switching from CLI to JDBC:
a. Future development in this area will be easier.
b. Connection pooling, which is currently disabled, can be re-enabled.
c. Different modes of authentication can be supported, including LDAP.
d. The new interface is more efficient and hence faster.
e. Large objects are easier to support.
f. Porting to other operating systems is easier.
3.2 Previously, memory was allocated in order to retrieve each column value:
value = (char*) palloc(len);
This is no longer needed, because the memory has already been allocated in the result set.
The new code returns a pointer to that same memory.
This eliminates the need to first get the field length using hdfs_get_field_data_len
and then fetch the field value itself,
which makes the system more efficient.
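The difference described in 3.2 can be sketched in plain C. This is only an illustration under stated assumptions: MockResultSet, mock_get_field_data_len, get_field_copy and get_field_ref are hypothetical names, malloc stands in for palloc, and the real result set lives behind the JDBC/JNI layer rather than in a C array.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a fetched row; not the FDW's real structure. */
typedef struct
{
    const char **fields;    /* field values owned by the result set */
    int          nfields;
} MockResultSet;

/* Old pattern, step 1: ask for the field length first. */
size_t mock_get_field_data_len(const MockResultSet *rs, int col)
{
    return strlen(rs->fields[col]);
}

/* Old pattern, step 2: allocate a buffer and copy the value into it. */
char *get_field_copy(const MockResultSet *rs, int col)
{
    size_t len = mock_get_field_data_len(rs, col);
    char  *value = malloc(len + 1);     /* stands in for palloc */

    if (value == NULL)
        return NULL;
    memcpy(value, rs->fields[col], len + 1);
    return value;                       /* caller must free this copy */
}

/* New pattern: return the pointer the result set already owns. */
const char *get_field_ref(const MockResultSet *rs, int col)
{
    return rs->fields[col];             /* valid only while the result set lives */
}
```

The trade-off in this sketch is lifetime: the returned pointer avoids one length lookup and one copy per field, but it must not be used after the result set is released.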
3.3 Error handling is improved. There is no longer any need to allocate a fixed-size buffer to store
error messages.
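The patch description does not say how the error messages are now stored. One common way to avoid a fixed-size buffer in C is to measure the formatted message first and then allocate exactly that much. The sketch below shows that technique only; format_error is a hypothetical helper, and malloc is used in place of whatever allocator the FDW actually uses.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: format an error message into a buffer sized to fit.
 * Returns NULL on failure; the caller frees the result.
 */
char *format_error(const char *fmt, ...)
{
    va_list ap;
    va_list ap2;
    int     needed;
    char   *msg;

    va_start(ap, fmt);
    va_copy(ap2, ap);                       /* the list is consumed twice */
    needed = vsnprintf(NULL, 0, fmt, ap);   /* first pass: measure only */
    va_end(ap);

    if (needed < 0)
    {
        va_end(ap2);
        return NULL;
    }

    msg = malloc((size_t) needed + 1);
    if (msg != NULL)
        vsnprintf(msg, (size_t) needed + 1, fmt, ap2);  /* second pass: write */
    va_end(ap2);
    return msg;
}
```

With this approach a long message, for example one carrying a full Java exception text, is never truncated at an arbitrary length.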
3.4 The HDFS_TINYINT data type was previously handled as a special case; this is no longer needed.
3.5 The dependency on Thrift has been removed, along with all related files in the repository.
This patch adds support for LDAP authentication.
Build and usage instructions are as follows:
export JDK_INCLUDE=/home/edb/Projects/hadoop_fdw/jdk1.8.0_111/include/
export JVM_LIB=/home/edb/Projects/hadoop_fdw/jdk1.8.0_111/jre/lib/amd64/server
export INSTALL_DIR=/usr/local/pg96/lib/postgresql/
export PATH=$PATH:/usr/local/pg96/bin/
cd hdfs_fdw
make USE_PGXS=1
make USE_PGXS=1 install
cd libhive
make
make install
cd jdbc
javac MsgBuf.java
javac HiveJdbcClient.java
rm HiveJdbcClient-1.0.jar
jar cf HiveJdbcClient-1.0.jar *.class
cp HiveJdbcClient-1.0.jar /usr/local/pg96/lib/postgresql/
cd /usr/local/pg96/bin
Add the following setting to postgresql.conf (as a single line):
hdfs_fdw.classpath='/usr/local/edb95/lib/postgresql/HiveJdbcClient-1.0.jar:/home/edb/Projects/hadoop_fdw/hadoop/share/hadoop/common/hadoop-common-2.6.4.jar:/home/edb/Projects/hadoop_fdw/apache-hive-1.0.1-bin/lib/hive-jdbc-1.0.1-standalone.jar'
Then set the library path:
export LD_LIBRARY_PATH=/home/edb/Projects/hadoop_fdw/jdk1.8.0_111/jre/lib/amd64/server/:/usr/local/edb95/lib/postgresql/
Then start the server:
./edb-postgres -D ../data -p 7777
Then run the client and create the extension:
./edb-psql postgres -p 7777
create extension hdfs_fdw;
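The hdfs_fdw.classpath setting must be a single colon-separated line because it is handed to the JVM as one string. The patch description does not show how that happens; the sketch below assumes the value is passed as a standard -Djava.class.path= option when the JVM is created through JNI. build_classpath_option and count_classpath_entries are hypothetical helpers, not functions from the FDW.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: build the "-Djava.class.path=<classpath>" option
 * string (assumed usage, not confirmed by the patch). Caller frees it.
 */
char *build_classpath_option(const char *classpath)
{
    static const char prefix[] = "-Djava.class.path=";
    size_t            len = sizeof(prefix) - 1 + strlen(classpath) + 1;
    char             *opt = malloc(len);

    if (opt != NULL)
        snprintf(opt, len, "%s%s", prefix, classpath);
    return opt;
}

/*
 * Count colon-separated entries. A newline or space left inside the value
 * would become part of a jar path, which is why the setting must stay on
 * one line.
 */
int count_classpath_entries(const char *classpath)
{
    int n = (*classpath != '\0');

    for (const char *p = classpath; *p != '\0'; p++)
        if (*p == ':')
            n++;
    return n;
}
```

For the setting shown above, count_classpath_entries would report three entries: the HiveJdbcClient jar, the hadoop-common jar, and the hive-jdbc standalone jar.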