ContinuumIO / libhdfs3-downstream

a native c/c++ hdfs client (downstream fork from apache-hawq)
Apache License 2.0

dfs.encrypt.data.transfer not performing SSL handshake #20

Open douggie opened 5 years ago

douggie commented 5 years ago

Hi,

When running a secure (Kerberos) HDFS cluster with SSL encryption, handshaking with the namenode is performed correctly, and commands that only interact with the namenode, e.g. hdfs dfs -ls, work perfectly.

However, when any communication is directed to the datanodes to read or write blocks, it seems the libhdfs3 library is no longer performing the handshake and is therefore not able to access the blocks on the datanodes.

The other thing we noticed is that both the client hdfs-site.xml and the cluster have a replication factor of 2, yet when putting files libhdfs3 defaults to a replication factor of 3 even though it reads the value of 2 from hdfs-site.xml, so we have to set the replication factor to 2 explicitly. Could this be related to hdfs-site.xml not being read correctly, and hence dfs.encrypt.data.transfer not being picked up when reading/writing files?

Reading and writing files using the hdfs dfs command line works without issue.

Does libhdfs3 support dfs.encrypt.data.transfer?
Is this a duplicate of https://github.com/dask/hdfs3/issues/146? The issue with using pyarrow is that the JNI wrapper leaks significant memory when running a large number of parallel jobs, whereas in our testing libhdfs3 does not, hence we are keen to use it.

What could cause libhdfs3 to ignore the values in core-site.xml and hdfs-site.xml when reading/writing files?

Code sample


from hdfs3 import HDFileSystem

hdfs = HDFileSystem(host=host, port=port, user=user,
                    ticket_cache=ticket_cache)

# this works
print(hdfs.ls('/'))

# reading/writing files fails to add/read blocks due to an encryption error
test_file = '/tmp/testfile.csv'

# hdfs.open with the default replication=0 makes libhdfs3
# try to find 3 replica nodes instead of 2 (the value in dfs.replication)
with hdfs.open(test_file, mode='wb', replication=2) as f:
    f.write(b'Name, Amount\nAlice, 10, John, 20')  # this fails to write blocks
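For reference, here is a minimal sketch of passing the relevant settings explicitly at connection time, assuming hdfs3's pars keyword simply forwards configuration key/value pairs to libhdfs3 (that forwarding behaviour is an assumption on our side, not something confirmed against this fork):

from hdfs3 import HDFileSystem

# Sketch only: pass the data-transfer encryption and replication settings
# explicitly instead of relying on hdfs-site.xml being found. The keys mirror
# the cluster config below; 'pars' is assumed to hand them through to libhdfs3.
pars = {
    'hadoop.security.authentication': 'kerberos',
    'hadoop.rpc.protection': 'privacy',
    'dfs.encrypt.data.transfer': 'true',
    'dfs.encrypt.data.transfer.cipher.suites': 'AES/CTR/NoPadding',
    'dfs.encrypt.data.transfer.cipher.key.bitlength': '256',
    'dfs.replication': '2',
}

hdfs = HDFileSystem(host=host, port=port, user=user,
                    ticket_cache=ticket_cache, pars=pars)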

Log Excerpt

INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Failed to read expected encryption handshake from client at /xxx.xxx.xxx.xxx:47984. Perhaps the client is running an older version of Hadoop which does not support encryption

hdfs-site.xml settings

  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
  <property>
    <name>hadoop.security.authentication</name>
    <value>kerberos</value>
  </property>
  <property>
    <name>hadoop.security.authorization</name>
    <value>true</value>
  </property>
  <property>
    <name>hadoop.rpc.protection</name>
    <value>privacy</value>
  </property>
  <property>
    <name>dfs.encrypt.data.transfer</name>
    <value>true</value>
  </property>

  <property>
    <name>dfs.block.access.token.enable</name>
    <value>true</value>
  </property>

  <property>
    <name>dfs.encrypt.data.transfer.algorithm</name>
    <value></value> <!-- leave empty for AES -->
  </property>

  <property>
    <name>dfs.encrypt.data.transfer.cipher.suites</name>
    <value>AES/CTR/NoPadding</value>
  </property>

  <property>
    <name>dfs.encrypt.data.transfer.cipher.key.bitlength</name>
    <value>256</value> <!-- can also be set to 128 or 192 -->
  </property>
  <property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
  </property>

  <property>
    <name>dfs.https.enable</name>
    <value>true</value>
  </property>

  <property>
    <name>dfs.http.policy</name>
    <value>HTTPS_ONLY</value>
  </property>