uber / petastorm

The Petastorm library enables single-machine or distributed training and evaluation of deep learning models directly from datasets in Apache Parquet format. It supports ML frameworks such as TensorFlow, PyTorch, and PySpark, and can also be used from pure Python code.

OSError: Unable to get namenodes for default service #404


quiescentsam commented 5 years ago

Hi,

Traceback (most recent call last):
  File "", line 1, in
  File "/petastorm_venv3.6/lib/python3.6/site-packages/petastorm/reader.py", line 120, in make_reader
    resolver = FilesystemResolver(dataset_url, hdfs_driver=hdfs_driver)
  File "/petastorm_venv3.6/lib/python3.6/site-packages/petastorm/fs_utils.py", line 96, in __init__
    nameservice, namenodes = namenode_resolver.resolve_default_hdfs_service()
  File "/petastorm_venv3.6/lib/python3.6/site-packages/petastorm/hdfs/namenode.py", line 124, in resolve_default_hdfs_service
    .format(default_fs)))
OSError: Unable to get namenodes for default service "hdfs://master:8020" from Hadoop path /opt/cloudera/parcels/CDH/lib/hadoop in environment variable HADOOP_HOME! Please check your hadoop configuration!
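For context, the traceback comes from an ordinary make_reader call along these lines (a minimal reproduction sketch; the dataset URL below is only a placeholder, not my actual path):

# Minimal sketch of the kind of call that produces the traceback above.
# The dataset URL is a placeholder for illustration only.
from petastorm import make_reader

with make_reader('hdfs:///some/dataset') as reader:
    for row in reader:
        print(row)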

selitvin commented 5 years ago

Can you please confirm that your HADOOP_HOME environment variable points to a valid Hadoop installation directory, specifically that $HADOOP_HOME/etc/hadoop/hdfs-site.xml and $HADOOP_HOME/etc/hadoop/core-site.xml are valid? Does hdfs dfs -ls / work for you from the command line (with $HADOOP_HOME set to the same value as when you run the Python program that uses petastorm)?
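A quick way to verify this from the same Python environment (just a sanity-check sketch, not part of petastorm):

# Sanity check: confirm HADOOP_HOME is set and the expected config files exist.
import os

hadoop_home = os.environ.get('HADOOP_HOME')
print('HADOOP_HOME =', hadoop_home)
for name in ('hdfs-site.xml', 'core-site.xml'):
    path = os.path.join(hadoop_home or '', 'etc', 'hadoop', name)
    print(path, '->', 'found' if os.path.isfile(path) else 'MISSING')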

filipski commented 4 years ago

I'm facing the same issue now. My HADOOP_HOME is set correctly:

$ echo $HADOOP_HOME
/usr/local/hadoop

I have /usr/local/hadoop/etc/hadoop/hdfs-site.xml configured as well. I set HADOOP_CONF_DIR and SPARK_DIST_CLASSPATH in /usr/local/spark/conf/spark-env.sh as follows:

export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop/
export SPARK_DIST_CLASSPATH=$(hadoop classpath)

And hdfs dfs -ls / does indeed work fine:

$ hdfs dfs -ls /
Found 2 items
drwxrwx---   - hduser hadoop              0 2020-01-23 09:04 /tmp
drwxr-xr-x   - hduser supergroup          0 2020-01-16 07:52 /user

I tried to store the 'hello world' dataset on HDFS, simply by changing the output_url in https://github.com/uber/petastorm/blob/e8b9f74c8db63f74c2f3b1658829089ee2d2ccdf/examples/hello_world/petastorm_dataset/generate_petastorm_dataset.py#L43 to:

def generate_petastorm_dataset(output_url='hdfs:///tmp/hello_world_dataset'):

As you can see from the hdfs dfs -ls / output above, /tmp exists on HDFS and grants the correct access rights to the hadoop group, which is the group I'm using to run generate_petastorm_dataset.py.
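For completeness, what I'm running boils down to roughly the following (a minimal self-contained sketch condensed from the hello_world example; HelloSchema here is a stand-in for the example's HelloWorldSchema and row generator):

# Minimal sketch of writing a petastorm dataset to HDFS, condensed from the
# hello_world example; HelloSchema is a stand-in for HelloWorldSchema.
import numpy as np
from pyspark.sql import SparkSession
from pyspark.sql.types import IntegerType
from petastorm.codecs import ScalarCodec
from petastorm.etl.dataset_metadata import materialize_dataset
from petastorm.unischema import Unischema, UnischemaField, dict_to_spark_row

HelloSchema = Unischema('HelloSchema', [
    UnischemaField('id', np.int32, (), ScalarCodec(IntegerType()), False),
])

def generate(output_url='hdfs:///tmp/hello_world_dataset'):
    spark = SparkSession.builder.master('local[2]').getOrCreate()
    sc = spark.sparkContext
    # materialize_dataset writes petastorm metadata next to the Parquet files
    with materialize_dataset(spark, output_url, HelloSchema, 256):
        rows = sc.parallelize(range(10)) \
            .map(lambda i: dict_to_spark_row(HelloSchema, {'id': i}))
        spark.createDataFrame(rows, HelloSchema.as_spark_schema()) \
            .write.mode('overwrite').parquet(output_url)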

What else am I missing?

filipski commented 4 years ago

OK, I took a closer look at: https://github.com/uber/petastorm/blob/e8b9f74c8db63f74c2f3b1658829089ee2d2ccdf/petastorm/hdfs/namenode.py#L110 and consequently: https://github.com/uber/petastorm/blob/e8b9f74c8db63f74c2f3b1658829089ee2d2ccdf/petastorm/hdfs/namenode.py#L84

and it looks like petastorm is coded to work with high-availability (HA) clusters only, as it requires a non-empty list of namenodes from the 'dfs.ha.namenodes.<nameservice>' Hadoop configuration property.

My cluster is a simple sandbox installation with a single namenode. Do I have to configure an HA cluster, or is there a way to use petastorm on a simple cluster with just a single namenode?
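To make the failure mode concrete, this is roughly the lookup that comes back empty on a non-HA cluster (an illustrative sketch, not petastorm's actual code; 'master' is just the nameservice taken from the error message above):

# Illustrative sketch of the lookup described above (not petastorm's own code):
# on a single-namenode cluster the dfs.ha.namenodes.<nameservice> property is
# absent from hdfs-site.xml, so the resolved namenode list comes back empty.
import os
import xml.etree.ElementTree as ET

def ha_namenodes(nameservice):
    hdfs_site = os.path.join(os.environ['HADOOP_HOME'], 'etc', 'hadoop', 'hdfs-site.xml')
    props = {p.findtext('name'): p.findtext('value')
             for p in ET.parse(hdfs_site).getroot().iter('property')}
    value = props.get('dfs.ha.namenodes.' + nameservice, '')
    return [n for n in value.split(',') if n.strip()]

print(ha_namenodes('master'))  # [] on a plain single-namenode sandbox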

selitvin commented 4 years ago

Well, that was not on purpose. In our clouds we only have HA clusters. I guess our options are:

filipski commented 4 years ago

I actually went ahead and reconfigured the cluster into a proper HA one, since we would need it that way eventually anyway, and I can confirm that with that configuration everything works well. One thing worth mentioning, and worth updating in your documentation: storing to HDFS requires libhdfs3 by default, and if it's missing you get pretty cryptic exceptions, because you catch the one that clearly says the library is missing and raise your own instead. So, consider mentioning that dependency in the installation section of the documentation, or make it an automatically installed dependency, if that makes sense.
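In case it helps others who cannot install libhdfs3: make_reader takes an hdfs_driver argument (it is visible in the traceback at the top of this issue), so switching to the JNI-based libhdfs driver that ships with Hadoop should be an alternative. A sketch, with a placeholder dataset URL:

# Sketch of reading via the JNI-based libhdfs driver instead of libhdfs3.
# The dataset URL is a placeholder; this assumes a working Hadoop/Java setup.
from petastorm import make_reader

with make_reader('hdfs:///tmp/hello_world_dataset', hdfs_driver='libhdfs') as reader:
    print(next(iter(reader)))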

selitvin commented 4 years ago

Thank you for the feedback. I will leave the ticket open to track the documentation update and to improve the error messages in this scenario.

msaisumanth commented 4 years ago

@selitvin I had the same problem. I am using a Docker image which is running an HDFS cluster. I set the dfs.ha.namenodes values, but it doesn't change anything. Any ideas?

selitvin commented 4 years ago

The code will try to load the configuration from $HADOOP_HOME/etc/hadoop/, $HADOOP_PREFIX/etc/hadoop/, and $HADOOP_INSTALL/etc/hadoop/ (in this order). Is it possible that the right hdfs-site.xml and core-site.xml files are not being found?
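A quick way to see which configuration directory would be picked up, following that order (a debugging sketch, not petastorm code):

# Check the candidate Hadoop configuration directories in the order listed above.
import os

for var in ('HADOOP_HOME', 'HADOOP_PREFIX', 'HADOOP_INSTALL'):
    root = os.environ.get(var)
    conf_dir = os.path.join(root, 'etc', 'hadoop') if root else None
    has_conf = bool(conf_dir) and os.path.isfile(os.path.join(conf_dir, 'hdfs-site.xml'))
    print(var, '->', conf_dir, '(hdfs-site.xml found)' if has_conf else '(hdfs-site.xml not found)')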