sequenceiq / docker-ambari

Docker image with Ambari

Ambari-Server 1.7.0-ea issues #19

Closed cyon666 closed 9 years ago

cyon666 commented 9 years ago

I'm following the steps described in the blog post for Ambari 1.7.0 early access, found here: http://blog.sequenceiq.com/blog/2014/09/05/apache-ambari-1-7-0-ea/

I've tried these steps on a couple of machines:

- Mac OS X Mavericks
- CentOS 6.5 (AMI on AWS)

All images fail when following the outlined steps at the aforementioned page.

On all machines, the only visible issue shows up in the Ambari-Shell progress bar. The ambari-server seems to start fine, and the 'clustering' seems to happen successfully. However, once Ambari-Shell is launched, the default blueprints are added, and the command "cluster build --blueprint multi-node-hdfs-yarn" is run, I immediately see this report:

    Welcome to Ambari Shell. For command and param completion press TAB, for assistance type 'hint'.
    ambari-shell>blueprint defaults
    Default blueprints added
    ambari-shell>cluster build --blueprint multi-node-hdfs-yarn
    HOSTNAME         STATE
    amb1.mycorp.kom  HEALTHY
    amb0.mycorp.kom  UNKNOWN
    amb3.mycorp.kom  HEALTHY
    amb2.mycorp.kom  HEALTHY

amb0 always shows state UNKNOWN rather than HEALTHY. This is occurring on CentOS 6.5. The next command, "cluster autoAssign", appears to succeed and assigns amb0 as master (even though it is still showing UNKNOWN status).
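For anyone scripting around this step, here is a minimal sketch of how the host listing above could be checked automatically. It assumes the listing has been captured as plain text with one "hostname state" pair per line; the function name `unhealthy_hosts` is made up for illustration and is not part of Ambari-Shell.

```python
def unhealthy_hosts(listing):
    """Return (hostname, state) pairs whose state is not HEALTHY.

    Assumes each data line holds exactly two whitespace-separated
    fields, as in the listing printed by "cluster build" above.
    """
    flagged = []
    for line in listing.splitlines():
        parts = line.split()
        # Skip blank lines, the header row, and anything not shaped like "host state"
        if len(parts) != 2 or parts == ["HOSTNAME", "STATE"]:
            continue
        host, state = parts
        if state != "HEALTHY":
            flagged.append((host, state))
    return flagged


# Sample text copied from the report above
sample = """HOSTNAME STATE
amb1.mycorp.kom HEALTHY
amb0.mycorp.kom UNKNOWN
amb3.mycorp.kom HEALTHY
amb2.mycorp.kom HEALTHY"""

print(unhealthy_hosts(sample))
```

With the listing from this report, only amb0.mycorp.kom is flagged, which matches what is seen on screen before "cluster autoAssign" is run.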

After that command, we move on to "cluster create --exitOnFinish true". This is when the "Installation" progress bar is displayed. After a short time, the progress bar shows "FAILED" status. I'm not exactly sure why:

[screenshot: screen shot 2014-11-13 at 10 58 07]

This is also the case on a colleague's machine (Mac OS X Yosemite):

[screenshot: ambari17failure2]

Any assistance on this is greatly appreciated! I'm doing nothing outside of what's defined in the blog as well.

keyki commented 9 years ago

Hi,

The Ambari repository points to a nightly build, so it is not stable. It is meant as a tryout version; until it is officially released, we have no control over what is published. I suggest you use the 1.6 version. 1.7 will come in late November.

cyon666 commented 9 years ago

Hi,

Unfortunately the project I'm working with requires 1.7 - hence my excitement at finding the blog entry. Do you have any recommendations on where to begin looking, in order to try to figure this out? As in... log locations to try to see initially, etc.

Thanks for the response!

keyki commented 9 years ago

Hi,

Could you please login to the UI to see which service's installation failed? I'll try to fix/update the container.

keyki commented 9 years ago

In the meantime, you can look around on the wiki: https://cwiki.apache.org/confluence/display/AMBARI/Ambari. They usually post the right repositories to use there.

cyon666 commented 9 years ago

Of course, see the below screenshot for overall service failures. If you need/want further detail please let me know! I'll dig around that wiki and see if I can find anything.

[screenshot: screen shot 2014-11-13 at 11 57 18]

cyon666 commented 9 years ago

Here is one exception seen from App Timeline service:

App Timeline Service (YARN)

    2014-11-13 12:05:17,181 - Execute['ls /var/run/hadoop-yarn/yarn/yarn-yarn-timelineserver.pid >/dev/null 2>&1 && ps `cat /var/run/hadoop-yarn/yarn/yarn-yarn-timelineserver.pid` >/dev/null 2>&1'] {'initial_wait': 5, 'not_if': 'ls /var/run/hadoop-yarn/yarn/yarn-yarn-timelineserver.pid >/dev/null 2>&1 && ps `cat /var/run/hadoop-yarn/yarn/yarn-yarn-timelineserver.pid` >/dev/null 2>&1', 'user': 'yarn'}
    2014-11-13 12:05:22,228 - Error while executing command 'start':
    Traceback (most recent call last):
      File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 122, in execute
        method(env)
      File "/var/lib/ambari-agent/cache/stacks/HDP/2.0.6/services/YARN/package/scripts/application_timeline_server.py", line 42, in start
        service('timelineserver', action='start')
      File "/var/lib/ambari-agent/cache/stacks/HDP/2.0.6/services/YARN/package/scripts/service.py", line 59, in service
        initial_wait=5
      File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 148, in __init__
        self.env.run()
      File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 149, in run
        self.run_action(resource, action)

keyki commented 9 years ago

I opened a JIRA issue for the timeline server https://issues.apache.org/jira/browse/AMBARI-7578 but it should be fixed by now. I'll take a look at it.

cyon666 commented 9 years ago

Thanks. As I said, I've done nothing beyond what was noted in the blog entry. If I can provide anything else, please just let me know (any log files from my side, any other attempts, etc.).

matyix commented 9 years ago

We can reproduce these. If we need your logs, etc., we will let you know. By any chance, if it can be shared publicly, can you let us know which 1.7 features you are building on? Maybe there is an alternative or workaround in the meantime.

cyon666 commented 9 years ago

The stack framework matured in 1.7, which is the most important feature for us. 1.5 introduced support for custom stacks, and 1.6 carried it over, but we continuously had problems in both versions. Obtaining Ambari 1.7 stand-alone and working with that seemed much more promising than its predecessors with regard to this need. 1.7 is more stable for this use and, while still not perfect, fits our needs well.

As a result, blueprint support for custom stacks is also important for us. HDP 2.2 plays a vital role with its support for Kafka as well.

That's the best overall info I can provide at the moment. Sorry for the delay in a response.

keyki commented 9 years ago

Hi Randy,

Unless you need the timeline server, you can remove it from the blueprint and install a cluster without it. You can use the same ambari-functions:

    amb-start-cluster 2
    amb-shell
    blueprint add --url https://raw.githubusercontent.com/sequenceiq/docker-ambari/1.7.0-ea/multi-node.json
    cluster build --blueprint multi-node-hdfs-yarn
    cluster autoAssign
    cluster create
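If you would rather strip the component from your own blueprint than use the prepared one, the idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it relies on the usual Ambari blueprint layout (a top-level `host_groups` list whose entries each have a `components` list of `{"name": ...}` objects) and on the timeline server being listed as `APP_TIMELINE_SERVER`. The helper name `strip_component` and the trimmed-down example blueprint are hypothetical and are not the contents of multi-node.json.

```python
import copy
import json


def strip_component(blueprint, component_name):
    """Return a copy of the blueprint with component_name removed from every host group."""
    result = copy.deepcopy(blueprint)  # leave the caller's blueprint untouched
    for group in result.get("host_groups", []):
        group["components"] = [
            component
            for component in group.get("components", [])
            if component.get("name") != component_name
        ]
    return result


# Hypothetical, heavily trimmed blueprint used only to exercise the helper
example = {
    "Blueprints": {"blueprint_name": "multi-node-hdfs-yarn", "stack_name": "HDP"},
    "host_groups": [
        {
            "name": "master",
            "cardinality": "1",
            "components": [
                {"name": "NAMENODE"},
                {"name": "RESOURCEMANAGER"},
                {"name": "APP_TIMELINE_SERVER"},
            ],
        },
        {
            "name": "slave_1",
            "cardinality": "2",
            "components": [{"name": "DATANODE"}, {"name": "NODEMANAGER"}],
        },
    ],
}

stripped = strip_component(example, "APP_TIMELINE_SERVER")
print(json.dumps(stripped, indent=2))
```

The resulting JSON would then need to be hosted somewhere reachable and loaded with "blueprint add --url" as in the commands above. Any configuration entries in a real blueprint that refer to the timeline server are not touched by this sketch.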

cyon666 commented 9 years ago

Stripping out the culprit service did allow for a full deployment of Ambari 1.7-ea. I appreciate the responses from both of you. Keyki, thanks for taking the extra step to help me move forward.

I really appreciate y'all's efforts!

cyon666 commented 9 years ago

Closing as a result of the work-around. Looking forward to the "official" release later this month / year.

keyki commented 9 years ago

1.7.0 is out, feel free to use it any way you want:

    docker pull sequenceiq/ambari:1.7.0