tensorflow / tensor2tensor

Library of deep learning models and datasets designed to make deep learning more accessible and accelerate ML research.
Apache License 2.0

How to use HDFS in distributed t2t #1641

Open colmantse opened 5 years ago

colmantse commented 5 years ago

I am working on distributed t2t and have it running in an NFS environment, but when I switch to an HDFS path it produces the log attached below. I would like to confirm whether setting the environment variables and changing the paths to HDFS is enough for t2t to use HDFS, or whether I am missing something.

referenced: https://github.com/tensorflow/tensorflow/issues/30981#

util.NativeCodeLoader: Unable to load native-hadoop library for your platform
info retry.RetryInvocationHandler org.apache.hadoop.net.ConnectTimeoutException: Call From ... failed on socket timeout exception: org.apache.hadoop.net.ConnectTimeoutException: 20000 millis timeout while waiting for channel to be ready for connect. ch : java.nio.channels.SocketChannel[connection-pending remote=...] while invoking ClientNameNodeProtocolTranslatorPB.getFileinfo over ... after 1 failover attempts. Trying to failover after sleeping for 781ms
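
On the environment-variable question: TensorFlow's Hadoop guide describes setting JAVA_HOME, HADOOP_HDFS_HOME, LD_LIBRARY_PATH and CLASSPATH before reading hdfs:// paths. The sketch below only checks that those variables are non-empty in the current process. The variable list is an assumption taken from that guide and may differ across TensorFlow or Hadoop versions, and `missing_hdfs_env` is a hypothetical helper, not part of t2t.

```python
import os

# Assumption: the variables named in TensorFlow's Hadoop guide.
# Adjust this tuple if your TensorFlow/Hadoop versions need something else.
REQUIRED_VARS = ("JAVA_HOME", "HADOOP_HDFS_HOME", "LD_LIBRARY_PATH", "CLASSPATH")


def missing_hdfs_env(env=None):
    """Return the names in REQUIRED_VARS that are unset or empty.

    `env` defaults to os.environ; pass a dict to check another mapping.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

Calling `missing_hdfs_env()` on every worker before launching training should return an empty list. A non-empty result names the variables still to be set on that machine. This does not validate the values themselves, for example whether CLASSPATH actually contains the Hadoop jars.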

colmantse commented 5 years ago

Hmm, is there a successful example of using HDFS with t2t that I could follow? I resolved the "unable to load native-hadoop library" warning, but the socket timeout problem persists.
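
A ConnectTimeoutException means the TCP connection to the remote address never completed, so one way to narrow it down is to test plain reachability from each worker, independent of t2t and Hadoop. The following is a minimal sketch under that assumption. `can_connect` is a hypothetical helper, and the host and port must be replaced with the NameNode address from your own Hadoop configuration (the address is elided in the log above, so none is filled in here).

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to (host, port) opens within `timeout` seconds."""
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections and name-resolution failures.
        return False
```

For example, `can_connect("namenode.example.com", 8020)` uses a placeholder host and a commonly used NameNode RPC port, both of which are assumptions. If this returns False on a worker, the problem is likely network-level (firewall, routing or a wrong address) rather than anything in the t2t setup.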