alagrede / HdfsClient

A Java HDFS client and full Kerberos example for calling Hadoop commands directly from Java code or from your local machine.

HDFS client

This library allows connecting to the Hadoop datalab cluster without installing anything on the system (except Java).

HDFS in Java application

<dependency>
    <groupId>com.tony.hdfs</groupId>
    <artifactId>HdfsClient</artifactId>
    <version>1.0</version>
</dependency>

Configuration

Define a hadoop.properties file in your project:

hadoop.cluster=clustername
hadoop.failoverProxy=org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider
hadoop.namenodes=nn1,nn2
hadoop.rpcAddress=[DNS_NAMENODE1]:[PORT_RPC],[DNS_NAMENODE2]:[PORT_RPC]
hadoop.httpAddress=[DNS_NAMENODE1]:[PORT_HTTP],[DNS_NAMENODE2]:[PORT_HTTP]
hadoop.krb5Url=hadoop/krb5.conf
hadoop.jaasConfUrl=hadoop/jaas.conf

Note: example krb5.conf and jaas.conf files are embedded in the jar and must be overridden with your own.

Usage

import java.io.File;
import java.io.FileInputStream;
import java.io.InputStream;
import java.net.URL;
import java.util.Properties;

import org.apache.hadoop.fs.FileSystem;

Properties prop = new Properties();

ClassLoader classLoader = getClass().getClassLoader();

InputStream input = new FileInputStream("./hadoop.properties");
prop.load(input);
input.close();

HadoopClient client = new HadoopClient();

client.setHadoopCluster(prop.getProperty("hadoop.cluster"));
client.setNamenodes(prop.getProperty("hadoop.namenodes"));
client.setHttpAaddress(prop.getProperty("hadoop.httpAddress"));
client.setRpcAddress(prop.getProperty("hadoop.rpcAddress"));
client.setHadoopProxy(prop.getProperty("hadoop.failoverProxy"));

// To use the krb5 and jaas files embedded in the jar
URL jaas = classLoader.getResource(prop.getProperty("hadoop.jaasConfUrl"));
URL krb5 = classLoader.getResource(prop.getProperty("hadoop.krb5Url"));

// To use external krb5 and jaas files instead
// URL jaas = new File(prop.getProperty("hadoop.jaasConfUrl")).toURI().toURL();
// URL krb5 = new File(prop.getProperty("hadoop.krb5Url")).toURI().toURL();

client.setJaasConfUrl(jaas);
client.setKrbConfUrl(krb5);

String keytabPath = new File("xxx.keytab").getPath();

FileSystem fs = client.hadoopConnectionWithKeytab(keytabPath, "xxx@xxx.CORP");

// or with user/password
// FileSystem fs = client.hadoopConnectionWithUserPassword("xxx@xxx.CORP", "xxx");
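The internal/external choice above can be wrapped in a small helper that tries the classpath first and falls back to the local filesystem. ConfLocator and resolveConfUrl below are a hypothetical convenience, not part of HdfsClient:

```java
import java.io.File;
import java.net.MalformedURLException;
import java.net.URL;

public class ConfLocator {

    /**
     * Resolve a krb5/jaas configuration location: try the classpath first
     * (covers the resources embedded in the jar), then fall back to a path
     * on the local filesystem. Returns null when neither exists.
     */
    public static URL resolveConfUrl(String location) throws MalformedURLException {
        URL fromClasspath = ConfLocator.class.getClassLoader().getResource(location);
        if (fromClasspath != null) {
            return fromClasspath;
        }
        File external = new File(location);
        // toURI().toURL() handles spaces and special characters,
        // unlike the deprecated File.toURL()
        return external.exists() ? external.toURI().toURL() : null;
    }
}
```

With such a helper, the two commented variants collapse into one call, e.g. client.setKrbConfUrl(ConfLocator.resolveConfUrl(prop.getProperty("hadoop.krb5Url"))).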

Command line Interface

The project provides a fat jar exposing the original Hadoop hdfs dfs command-line interface, usable on your local machine.

Configuration

Copy the following files next to hadoop-client-cli.jar:

Example hadoop.properties for CLI with keytab

hadoop.cluster=clustername
hadoop.failoverProxy=org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider
hadoop.namenodes=nn1,nn2
hadoop.rpcAddress=[DNS_NAMENODE1]:[PORT_RPC],[DNS_NAMENODE2]:[PORT_RPC]
hadoop.httpAddress=[DNS_NAMENODE1]:[PORT_HTTP],[DNS_NAMENODE2]:[PORT_HTTP]
hadoop.krb5Url=krb5.conf
hadoop.jaasConfUrl=jaas.conf

#Keytab auth
hadoop.keytab=xxx.keytab
hadoop.principal=xxx@xxx.CORP
#hadoop.password=XXX

Example hadoop.properties for CLI with user/pass authentication

#User/pass auth
#hadoop.keytab=xxx.keytab
hadoop.principal=xxx@xxx.CORP
hadoop.password=XXX

jaas.conf

HdfsHaSample {
  com.sun.security.auth.module.Krb5LoginModule required client=TRUE debug=true;
};
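The krb5.conf referenced in hadoop.properties is a standard MIT Kerberos configuration. A minimal example might look like the following sketch, where the realm and KDC hostnames are placeholders to replace with your own:

```
[libdefaults]
  default_realm = XXX.CORP
  dns_lookup_kdc = false
  dns_lookup_realm = false

[realms]
  XXX.CORP = {
    kdc = kdc.xxx.corp
    admin_server = kdc.xxx.corp
  }
```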

Usage

java -jar hadoop-client-cli.jar -ls /
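Since the jar exposes the standard hdfs dfs interface, other familiar commands should pass through the same way (a sketch; paths are placeholders):

```
java -jar hadoop-client-cli.jar -cat /tmp/file.txt
java -jar hadoop-client-cli.jar -mkdir /tmp/newdir
java -jar hadoop-client-cli.jar -put local.txt /tmp/
```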

Deploy CLI

Deploy the jar and its configuration files in %userprofile%/hdfs and add that directory to the Windows PATH.

Add hdfs.bat

@ECHO OFF

setlocal
cd /d %~dp0
java -jar %userprofile%/hdfs/hadoop-client-cli.jar %*

Usage in cmd:

hdfs -ls /
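On Linux or macOS, an equivalent wrapper script can replace hdfs.bat (a sketch; it assumes the jar was deployed to ~/hdfs):

```
#!/bin/sh
# Forward all arguments to the fat jar, mirroring hdfs.bat
exec java -jar "$HOME/hdfs/hadoop-client-cli.jar" "$@"
```

Save it as ~/hdfs/hdfs, make it executable with chmod +x, and add ~/hdfs to PATH.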