Hi @dbolshak
When launching your job, the Spark Dispatcher will fetch your jar via the Mesos Fetcher: http://mesos.apache.org/documentation/latest/fetcher/
By default, the Mesos Fetcher will fetch hdfs:// URLs by shelling out to the hadoop binary on the machine. Is hadoop installed? If so, is it working properly? Is it configured with core-site.xml and hdfs-site.xml? Can you use it to fetch your job manually?
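For example, the fetch that the Mesos Fetcher performs for an hdfs:// URI can be reproduced by hand on an agent, roughly like this (the namenode address and jar path below are hypothetical placeholders, substitute your own):

```sh
# Is the hadoop binary the fetcher shells out to available on this agent?
which hadoop
hadoop version

# Try fetching the jar the same way the fetcher would (placeholder URI)
hadoop fs -copyToLocal hdfs://namenode:8020/path/to/my.jar /tmp/my.jar
```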
cc @susanxhuynh @ArtRand
Hello @mgummelt,
Thanks for the quick response and sorry for my delayed one.
Hadoop binaries are installed on all agents, but Hadoop itself is not configured, so the default configuration directory (/etc/hadoop/) is untouched. That directory does contain core-site.xml and hdfs-site.xml.
I also see that there is a HADOOP_CONF_DIR env var that points to /etc/hadoop.
Of course, with such a configuration it is not possible to access the real HDFS, but that cannot be the cause of the core-site.xml not found error.
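For reference, this is roughly what the agents look like (a sketch of the checks, not verbatim output):

```sh
# HADOOP_CONF_DIR points at the stock, unmodified configuration directory
echo "$HADOOP_CONF_DIR"
# -> /etc/hadoop

# The default config files are present, but not adjusted for our cluster
ls /etc/hadoop
# core-site.xml  hdfs-site.xml  ...
```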
And I would insist that it should be possible to run a Spark job without a properly configured HDFS client (core-site.xml and hdfs-site.xml) on the agents, because the current behaviour does not allow running several independent HDFS services. Supporting only a single HDFS service is a huge limitation: it means Mesos does not support multi-tenancy, and a single Mesos cluster cannot manage different environments (for example, production and development).
Regarding running the job manually: Spark and HDFS themselves work fine; fetching the config files by hand and then running the job manually also works fine.
@susanxhuynh and @ArtRand, could you please join this discussion? I still think that there is a problem somewhere.
Kind regards, Denis
@dbolshak Since this is an issue with the Mesos Fetcher, you'll have better luck asking for help on the DC/OS mailing lists. They can help you get your HDFS configured properly.
I submit my Spark job as:
Of course, fullQuilifiedClassName and hdfs://path/my.jar refer to real values.
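The full command is not reproduced here; a minimal sketch of such a submission via the Mesos cluster dispatcher might look like this (the dispatcher host and port are placeholders; the class name and jar URI are the ones referred to above):

```sh
# Submit in cluster mode through the Mesos dispatcher; all values are placeholders
spark-submit \
  --master mesos://spark-dispatcher-host:7077 \
  --deploy-mode cluster \
  --class fullQuilifiedClassName \
  hdfs://path/my.jar
```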
But the job fails while fetching resources with an error.
At the same time, one of the Mesos agents has the following logs.
Looking at the first line, I assume that the HDFS config files should be fetched before contacting the HDFS cluster.
Spark version 2.1.1.