Open F21 opened 9 years ago
Hello! You need a minimum of 4 slaves, @F21. Try adding slaves and it should work fine. The reason is that we run HDFS in HA mode with 2 NNs, and we need multiple hosts for a production setup to make sure we are okay if specific machines go down. See further documentation in the README and on HDFS HA mode.
@elingg Thanks for your help!
I now have 4 nodes: 3 slaves and 1 master that also runs a slave. After building HEAD, I ran ./bin/hdfs-mesos on my master node to launch the framework.
I can see 3 journal nodes being launched, but the name nodes and the other nodes are not being launched.
I am now receiving errors because journalnode3.hdfs.mesos could not be resolved:
03:24:48.499 [Thread-220] INFO org.apache.mesos.hdfs.Scheduler - Received 1 offers
03:24:48.499 [Thread-220] INFO org.apache.mesos.hdfs.Scheduler - Resolving DNS for journalnode3.hdfs.mesos
03:24:48.499 [Thread-220] WARN org.apache.mesos.hdfs.Scheduler - Couldn't resolve host journalnode3.hdfs.mesos
I am using mesos-dns and it works fine. If I go to one of my nodes and do nslookup master.mesos or nslookup dns.marathon.mesos, it works fine, but nslookup journalnode3.hdfs.mesos returns an error.
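(For what it's worth, that "Resolving DNS" step presumably boils down to a plain Java name lookup inside the scheduler, so the scheduler process itself has to be able to resolve journalnode3.hdfs.mesos from wherever it runs. A minimal sketch of that kind of check, with illustrative class and method names rather than the framework's actual code:)

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

public class DnsCheck {
    // Returns true if the name resolves, mirroring the scheduler's
    // "Resolving DNS for ..." / "Couldn't resolve host ..." log lines.
    static boolean resolves(String host) {
        try {
            InetAddress.getByName(host); // throws if the record is missing
            return true;
        } catch (UnknownHostException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String host = "journalnode3.hdfs.mesos";
        System.out.println(host + (resolves(host) ? " resolved" : " did not resolve"));
    }
}
```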
This is the current state of my cluster:
And this is my mesos-site.xml:
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>mesos.hdfs.data.dir</name>
    <description>The primary data directory in HDFS</description>
    <value>/tmp/hdfs/data</value>
  </property>
  <property>
    <name>mesos.hdfs.secondary.data.dir</name>
    <description>The secondary data directory in HDFS</description>
    <value>/var/run/hadoop-hdfs</value>
  </property>
  <property>
    <name>mesos.hdfs.native-hadoop-binaries</name>
    <description>Mark true if you have hadoop pre-installed on your host machines (otherwise it will be distributed by the scheduler)</description>
    <value>false</value>
  </property>
  <property>
    <name>mesos.hdfs.framework.mnt.path</name>
    <description>Mount location (if mesos.hdfs.native-hadoop-binaries is marked false)</description>
    <value>/opt/mesosphere</value>
  </property>
  <property>
    <name>mesos.hdfs.state.zk</name>
    <description>Comma-separated hostname-port pairs of zookeeper node locations for HDFS framework state information</description>
    <value>master.mesos:2181</value>
  </property>
  <property>
    <name>mesos.master.uri</name>
    <description>Zookeeper entry for mesos master location</description>
    <value>zk://master.mesos:2181/mesos</value>
  </property>
  <property>
    <name>mesos.hdfs.zkfc.ha.zookeeper.quorum</name>
    <description>Comma-separated list of zookeeper hostname-port pairs for HDFS HA features</description>
    <value>master.mesos:2181</value>
  </property>
  <property>
    <name>mesos.hdfs.framework.name</name>
    <description>Your Mesos framework name and cluster name when accessing files (hdfs://YOUR_NAME)</description>
    <value>hdfs</value>
  </property>
  <property>
    <name>mesos.hdfs.mesosdns</name>
    <description>Whether to use Mesos DNS for service discovery within HDFS</description>
    <value>true</value>
  </property>
  <property>
    <name>mesos.hdfs.mesosdns.domain</name>
    <description>Root domain name of Mesos DNS (usually 'mesos')</description>
    <value>mesos</value>
  </property>
  <property>
    <name>mesos.native.library</name>
    <description>Location of libmesos.so</description>
    <value>/usr/local/lib/libmesos.so</value>
  </property>
  <property>
    <name>mesos.hdfs.journalnode.count</name>
    <description>Number of journal nodes (must be odd)</description>
    <value>3</value>
  </property>
  <!-- Additional settings for fine-tuning -->
  <property>
    <name>mesos.hdfs.jvm.overhead</name>
    <description>Multiplier on resources reserved in order to account for JVM allocation</description>
    <value>1.35</value>
  </property>
  <property>
    <name>mesos.hdfs.hadoop.heap.size</name>
    <value>256</value>
  </property>
  <property>
    <name>mesos.hdfs.namenode.heap.size</name>
    <value>512</value>
  </property>
  <property>
    <name>mesos.hdfs.datanode.heap.size</name>
    <value>256</value>
  </property>
  <property>
    <name>mesos.hdfs.executor.heap.size</name>
    <value>256</value>
  </property>
  <property>
    <name>mesos.hdfs.executor.cpus</name>
    <value>0.5</value>
  </property>
  <property>
    <name>mesos.hdfs.namenode.cpus</name>
    <value>0.4</value>
  </property>
  <property>
    <name>mesos.hdfs.journalnode.cpus</name>
    <value>0.4</value>
  </property>
  <property>
    <name>mesos.hdfs.datanode.cpus</name>
    <value>0.4</value>
  </property>
  <property>
    <name>mesos.hdfs.user</name>
    <value>root</value>
  </property>
  <property>
    <name>mesos.hdfs.role</name>
    <value>*</value>
  </property>
</configuration>
A similar situation happened to me as well.
As for the invalid lookup, it's caused by invalid mappings from mesos-dns:
VERY VERBOSE: 2015/06/26 12:46:49 generator.go:460: [A] journalnode1.hdfs.mesos.: server01.myhome.local
I'm not sure which one (mesos-dns or mesos-hdfs) makes the mistake, but either way this issue eventually resolves itself after a few minutes. (To be sure, I set the hostnames of the slaves to their IP addresses to work around the issues below, but that's not working.)
@F21, after you get past this, you'll encounter these errors:
We need to coloate the namenode with a journalnode and there isno journalnode running on this host.
The reason looks like a lack of checking the real hostname against the hostname from mesos-dns here (https://github.com/mesosphere/hdfs/blob/master/hdfs-scheduler/src/main/java/org/apache/mesos/hdfs/state/PersistentState.java#L289), or the offers should use the hostnames from mesos-dns.
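For illustration, the kind of check I mean would look roughly like this (only my own sketch against java.net, not the actual PersistentState code):

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

public class HostMatch {
    // True if the offer's hostname and the mesos-dns name (e.g.
    // "journalnode1.hdfs.mesos") resolve to the same address, so a node
    // known under its "real" hostname is still recognised via Mesos DNS.
    static boolean sameHost(String offerHostname, String mesosDnsName) {
        try {
            String offerIp = InetAddress.getByName(offerHostname).getHostAddress();
            String dnsIp = InetAddress.getByName(mesosDnsName).getHostAddress();
            return offerIp.equals(dnsIp);
        } catch (UnknownHostException e) {
            return false;
        }
    }
}
```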
I'll remove mesos-dns to avoid these problems, but I can't be sure :(
After some tweaking, I added all my mesos nodes and their IP addresses to the /etc/hosts of all my mesos nodes.
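For reference, the entries look roughly like this (the addresses below are placeholders, not my real ones):

```
# /etc/hosts on every master and slave (placeholder addresses)
10.0.0.10  mesos-master-01
10.0.0.11  mesos-slave-01
10.0.0.12  mesos-slave-02
10.0.0.13  mesos-slave-03
```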
The DNS now resolves correctly and 3 journal nodes are started, but I am still getting a not enough resources error even with 4 nodes when trying to launch the other nodes:
06:42:44.250 [Thread-102] INFO org.apache.mesos.hdfs.Scheduler - Offer does not have enough resources
06:42:44.250 [Thread-102] INFO org.apache.mesos.hdfs.Scheduler - Offer does not have enough resources
06:42:47.261 [Thread-103] INFO org.apache.mesos.hdfs.Scheduler - Received 2 offers
06:42:47.261 [Thread-103] INFO org.apache.mesos.hdfs.Scheduler - Resolving DNS for journalnode3.hdfs.mesos
06:42:47.262 [Thread-103] INFO org.apache.mesos.hdfs.Scheduler - Successfully found journalnode3.hdfs.mesos
06:42:47.262 [Thread-103] INFO org.apache.mesos.hdfs.Scheduler - Resolving DNS for journalnode1.hdfs.mesos
06:42:47.262 [Thread-103] INFO org.apache.mesos.hdfs.Scheduler - Successfully found journalnode1.hdfs.mesos
06:42:47.262 [Thread-103] INFO org.apache.mesos.hdfs.Scheduler - Resolving DNS for journalnode2.hdfs.mesos
06:42:47.262 [Thread-103] INFO org.apache.mesos.hdfs.Scheduler - Successfully found journalnode2.hdfs.mesos
06:42:47.278 [Thread-103] INFO org.apache.mesos.hdfs.Scheduler - We need to coloate the namenode with a journalnode and there isno journalnode running on this host. mesos-slave-03
06:42:47.278 [Thread-103] INFO org.apache.mesos.hdfs.Scheduler - Offer does not have enough resources
06:42:49.267 [Thread-104] INFO org.apache.mesos.hdfs.Scheduler - Received 2 offers
06:42:49.267 [Thread-104] INFO org.apache.mesos.hdfs.Scheduler - Resolving DNS for journalnode3.hdfs.mesos
06:42:49.267 [Thread-104] INFO org.apache.mesos.hdfs.Scheduler - Successfully found journalnode3.hdfs.mesos
06:42:49.267 [Thread-104] INFO org.apache.mesos.hdfs.Scheduler - Resolving DNS for journalnode1.hdfs.mesos
06:42:49.267 [Thread-104] INFO org.apache.mesos.hdfs.Scheduler - Successfully found journalnode1.hdfs.mesos
06:42:49.267 [Thread-104] INFO org.apache.mesos.hdfs.Scheduler - Resolving DNS for journalnode2.hdfs.mesos
06:42:49.268 [Thread-104] INFO org.apache.mesos.hdfs.Scheduler - Successfully found journalnode2.hdfs.mesos
06:42:49.268 [Thread-104] INFO org.apache.mesos.hdfs.Scheduler - Offer does not have enough resources
06:42:49.268 [Thread-104] INFO org.apache.mesos.hdfs.Scheduler - Offer does not have enough resources
@F21 This is exactly the same issue I mentioned before! But I still don't know how to fix it. :sob:
@iryoung I reduced mesos.hdfs.jvm.overhead in mesos-site.xml to a low number like 0.4 and the name nodes started launching.
I now have 2 name nodes, 2 ZKFCs and 3 journal nodes, but I still get Offer does not have enough resources and no data nodes.
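Concretely, the change is just this one property in mesos-site.xml (presumably, going by its description, the memory reserved per process scales with this multiplier):

```xml
<property>
  <name>mesos.hdfs.jvm.overhead</name>
  <description>Multiplier on resources reserved in order to account for JVM allocation</description>
  <value>0.4</value>
</property>
```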
@F21 Ah, thanks! After lowering the value of mesos.hdfs.jvm.overhead, the NN launched!
So the real problem is that, even though there are enough resources to launch on the JN/ZKFC slaves (in my case, 8 CPUs each), mesos-hdfs tries to launch on other slaves which don't have a JN, and keeps failing... :(
Now I've got another problem...
15/06/26 19:14:32 ERROR namenode.FSImage: Unable to save image for /home/www/data/hdfs/name
java.lang.UnsatisfiedLinkError: org.apache.hadoop.util.NativeCodeLoader.buildSupportsSnappy()Z
at org.apache.hadoop.util.NativeCodeLoader.buildSupportsSnappy(Native Method)
It failed to restart, with these logs:
15/06/26 19:14:44 WARN namenode.FSNamesystem: Encountered exception loading fsimage
java.io.FileNotFoundException: No valid image files found
at org.apache.hadoop.hdfs.server.namenode.FSImageTransactionalStorageInspector.getLatestImages(FSImageTransactionalStorageInspector.java:165)
at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:609)
Exception in thread "main" java.net.BindException: Problem binding to [CN101017502.line.ism:8019] java.net.BindException: Address already in use; For more details see: http://wiki.apache.org/hadoop/BindException
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:719)
at org.apache.hadoop.ipc.Server.bind(Server.java:419)
at org.apache.hadoop.ipc.Server$Listener.<init>(Server.java:561)
Yes, Mesos DNS takes about a minute to resolve. Either you adjust the settings or just wait patiently. :smile:
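If I remember right, the relevant knob is refreshSeconds in mesos-dns's config.json (it defaults to 60, which matches the roughly one-minute delay). The snippet below is my assumption about the relevant fields, with everything else omitted:

```json
{
  "refreshSeconds": 30,
  "ttl": 30
}
```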
@F21, you need to have adequate resources, which is the error message you are describing. You need a minimum of 4 slave machines. You can lower your resources further in mesos-site.xml, but keep in mind this means you won't be able to store very much data.
@iryoung, it sounds like your data directories got corrupted after multiple launches and relaunches. Could you try following the steps for uninstalling and reinstalling?
@F21, I have a few questions for you:
Thank you in advance for your time and help.
Hi @kcao3, I know you didn't ping me, but I will do my best to help with understanding here as well:
1) On one slave you have the NN1 machine (JN/NN/ZKFC), on another slave you have the NN2 machine (JN/NN/ZKFC), on another slave you have a JN, and on another slave you have a DN. If you launch more than 4 slaves, you will get more DNs. There is an open issue to make the number of DNs configurable.
2) Your nodes will only launch on Mesos slaves, not on the master.
3) You can currently run HDFS on Mesos without Mesos DNS; it is optional.
Hi @elingg, thank you so much for helping me understand how Mesos HA HDFS actually works at the architecture level. As a follow-up to question 1, on the NN1 machine where the three tasks JN/NN/ZKFC are running, it appears to me that these tasks currently run in the same environment, shared with the host NN1 machine. Is there currently a way to configure HA HDFS so that these tasks, when started, run in their own environments (such as inside Linux containers), isolated from the host OS machine? I think this is one of the key features of Mesos overall.
Hello @kcao3, no problem at all! Those 3 tasks are running inside their own environments (native Linux containers). The only modifications to the host environment are currently through symlinks (if the binaries are not predistributed) and the data directories. We have plans to integrate with Persistent Volumes in Mesos, and there are also plans to add storage drivers to Mesos (to avoid these modifications to the host environment).
@elingg, I am glad to get confirmation from you that those 3 tasks indeed run inside native Linux containers. However, I am not 100% confident that they are running in their own environments, completely isolated from the host machine. I have 2 reasons for doubt:
In general, we are using native Linux cgroups for isolation and containerization. However, there are a few modifications to the host environment, as mentioned. As for running the scheduler and executor in Docker, this is actually in the works. We want to support running in native Linux containers or in Docker as well.
Hi all,
I too am struggling with an issue related to the co-location of NN/ZKFC and JN tasks. Just as @F21 experienced, I sometimes find my NN/ZKFC will not start, and I receive Offer does not have enough resources in the logs.
In my case, this is happening because JNs are being placed on slaves without enough CPU/RAM to run the NN/ZKFC. Consequently, the scheduler hangs in the START_NAME_NODES phase awaiting more resources to become available on 2 of the slaves where the JNs are.
In smaller Mesos clusters comprised of homogeneous machines, we might be able to avoid this by simply configuring mesos-site.xml CPU/RAM requirements carefully, but in bigger production Mesos clusters comprised of a heterogeneous mixture of machine types, we won't be able to plan ahead. There still exists the situation where an underpowered slave gets chosen for the JN and eventually NN/ZKFC tasks.
To fix this problem, I'd suggest that we modify the logic that validates an offer's resources during the JOURNAL_NODES phase to take into account the resources that will also be needed for the NN/ZKFC. If those resources aren't present, the offer should be declined and the JN never launched on that slave.
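Something along these lines is what I have in mind; this is only a sketch against the Mesos Java Protos API, with the CPU/memory figures taken from the mesos-site.xml above, the helper name made up for illustration, and the memory formula an assumption:

```java
import org.apache.mesos.Protos.Offer;
import org.apache.mesos.Protos.Resource;

public class JournalNodeOfferCheck {
    // Rough requirements for a JN plus a co-located NN/ZKFC on the same slave,
    // using the cpu/heap settings from the mesos-site.xml above and assuming
    // reserved memory ~= heap size * mesos.hdfs.jvm.overhead.
    static final double REQUIRED_CPUS = 0.4 /* JN */ + 0.4 /* NN */ + 0.5 /* executor */;
    static final double REQUIRED_MEM_MB = (256 /* JN */ + 512 /* NN */ + 256 /* executor */) * 1.35;

    // Only accept an offer for a JN if the slave could also host the NN/ZKFC later.
    static boolean fitsJournalAndNameNode(Offer offer) {
        double cpus = 0, mem = 0;
        for (Resource r : offer.getResourcesList()) {
            if (r.getName().equals("cpus")) cpus += r.getScalar().getValue();
            if (r.getName().equals("mem"))  mem  += r.getScalar().getValue();
        }
        return cpus >= REQUIRED_CPUS && mem >= REQUIRED_MEM_MB;
    }
}
```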
Hi @LLParse, I agree with your assessment. We should check resources before launching the JNs to account for this.
@elingg @LLParse I just encountered this on a testing cluster. The namenodes are being starved of offers on the Mesos agents that the journalnodes were placed upon. They'll have to be placed together within the context of the same offer, use static reservations, use dynamic reservations, or some other solution I'm unaware of. Which would be the preferred route? I'm going to either overprovision my testing cluster or just dive in and fix this problem. I'd prefer to write a solution that has a good chance of being merged.
Sounds awesome, @knuckolls. There should be a resource check when launching them to make sure that 2 of the 3 journal nodes have space for the NNs to colocate on them.
I have a 3 node mesos setup with 1 master and 2 slaves running on Ubuntu 14.10 64-bit.
I compiled the project and uploaded the archive to my master and extracted it.
I modified my mesos-site.xml as follows:

This is my hdfs-site.xml where I set it to only run 1 name node:

I then launch it by calling sudo ./bin/hdfs-mesos.

The journal node seems to launch fine and a namenode is also launched. However, I am not seeing any data nodes, and I get a message saying the namenode should be colocated with the journal node:
Here are the tasks running when viewing the Mesos web interface:
Why is it complaining about colocating the namenode with the journalnode? They all seem to be launched on mesos-master-01. Also, why is the datanode not being launched?

All Mesos slaves and masters have 2GB of RAM, 4 CPUs and 34.3GB of disk.