Open vathanlal opened 7 years ago
Hello @vathanlal, Mesos 1.0.0 should work fine with Myriad 0.2 and Hadoop 2.7.0 or later. Could you paste the part of the log that is relevant to resources being declined? I can only see the portion that is relevant to the Web.
Hi @yufeldman
Part of my Mesos log is shown below. As the log shows, after sending an offer Mesos gets a decline for that offer from Myriad. There is nothing related to this error in my yarn-root-resourcemanager-mesos.log. Do you need that as well?
I1017 17:03:52.125422 1605 master.cpp:5709] Sending 2 offers to framework 6215a35e-749e-4f27-bb50-f7c01650da80-0006 (MyriadAlpha) at scheduler-f83a7daa-15ec-402e-b004-18e88a9dc3b7@10.0.2.19:51076
I1017 17:03:52.126741 1606 master.cpp:3951] Processing DECLINE call for offers: [ 6215a35e-749e-4f27-bb50-f7c01650da80-O19737 ] for framework 6215a35e-749e-4f27-bb50-f7c01650da80-0006 (MyriadAlpha) at scheduler-f83a7daa-15ec-402e-b004-18e88a9dc3b7@10.0.2.19:51076
I1017 17:03:52.127111 1603 master.cpp:3951] Processing DECLINE call for offers: [ 6215a35e-749e-4f27-bb50-f7c01650da80-O19738 ] for framework 6215a35e-749e-4f27-bb50-f7c01650da80-0006 (MyriadAlpha) at scheduler-f83a7daa-15ec-402e-b004-18e88a9dc3b7@10.0.2.19:51076
I1017 17:03:53.132866 1608 master.cpp:5709] Sending 1 offers to framework d0921eb7-2bbf-4cf8-8ffd-a4c0b0146289-0000 (chronos-2.4.0) at scheduler-2ab2d850-3c91-47d1-aa3d-dcfc7bc420fe@10.0.2.19:40076
I1017 17:03:53.134438 1602 master.cpp:3951] Processing DECLINE call for offers: [ 6215a35e-749e-4f27-bb50-f7c01650da80-O19739 ] for framework d0921eb7-2bbf-4cf8-8ffd-a4c0b0146289-0000 (chronos-2.4.0) at scheduler-2ab2d850-3c91-47d1-aa3d-dcfc7bc420fe@10.0.2.19:40076
I1017 17:03:55.144173 1604 master.cpp:5709] Sending 1 offers to framework d0921eb7-2bbf-4cf8-8ffd-a4c0b0146289-0000 (chronos-2.4.0) at scheduler-2ab2d850-3c91-47d1-aa3d-dcfc7bc420fe@10.0.2.19:40076
I1017 17:03:55.145689 1603 master.cpp:3951] Processing DECLINE call for offers: [ 6215a35e-749e-4f27-bb50-f7c01650da80-O19740 ] for framework d0921eb7-2bbf-4cf8-8ffd-a4c0b0146289-0000 (chronos-2.4.0) at scheduler-2ab2d850-3c91-47d1-aa3d-dcfc7bc420fe@10.0.2.19:40076
I1017 17:03:55.369249 1608 http.cpp:381] HTTP GET for /master/state from 10.0.2.19:55869 with User-Agent='Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:31.0) Gecko/20100101 Firefox/31.0'
I1017 17:03:57.146108 1603 master.cpp:5709] Sending 2 offers to framework 6215a35e-749e-4f27-bb50-f7c01650da80-0006 (MyriadAlpha) at scheduler-f83a7daa-15ec-402e-b004-18e88a9dc3b7@10.0.2.19:51076
I1017 17:03:57.147450 1609 master.cpp:3951] Processing DECLINE call for offers: [ 6215a35e-749e-4f27-bb50-f7c01650da80-O19741 ] for framework 6215a35e-749e-4f27-bb50-f7c01650da80-0006 (MyriadAlpha) at scheduler-f83a7daa-15ec-402e-b004-18e88a9dc3b7@10.0.2.19:51076
I1017 17:03:57.147749 1608 master.cpp:3951] Processing DECLINE call for offers: [ 6215a35e-749e-4f27-bb50-f7c01650da80-O19742 ] for framework 6215a35e-749e-4f27-bb50-f7c01650da80-0006 (MyriadAlpha) at scheduler-f83a7daa-15ec-402e-b004-18e88a9dc3b7@10.0.2.19:51076
I1017 17:03:58.146752 1603 master.cpp:5709] Sending 1 offers to framework d0921eb7-2bbf-4cf8-8ffd-a4c0b0146289-0000 (chronos-2.4.0) at scheduler-2ab2d850-3c91-47d1-aa3d-dcfc7bc420fe@10.0.2.19:40076
I1017 17:03:58.148146 1609 master.cpp:3951] Processing DECLINE call for offers: [ 6215a35e-749e-4f27-bb50-f7c01650da80-O19743 ] for framework d0921eb7-2bbf-4cf8-8ffd-a4c0b0146289-0000 (chronos-2.4.0) at scheduler-2ab2d850-3c91-47d1-aa3d-dcfc7bc420fe@10.0.2.19:40076
I1017 17:04:00.152983 1605 master.cpp:5709] Sending 1 offers to framework d0921eb7-2bbf-4cf8-8ffd-a4c0b0146289-0000 (chronos-2.4.0) at scheduler-2ab2d850-3c91-47d1-aa3d-dcfc7bc420fe@10.0.2.19:40076
I1017 17:04:00.154595 1608 master.cpp:3951] Processing DECLINE call for offers: [ 6215a35e-749e-4f27-bb50-f7c01650da80-O19744 ] for framework d0921eb7-2bbf-4cf8-8ffd-a4c0b0146289-0000 (chronos-2.4.0) at scheduler-2ab2d850-3c91-47d1-aa3d-dcfc7bc420fe@10.0.2.19:40076
I1017 17:04:02.157223 1605 master.cpp:5709] Sending 2 offers to framework 6215a35e-749e-4f27-bb50-f7c01650da80-0006 (MyriadAlpha) at scheduler-f83a7daa-15ec-402e-b004-18e88a9dc3b7@10.0.2.19:51076
I1017 17:04:02.159744 1608 master.cpp:3951] Processing DECLINE call for offers: [ 6215a35e-749e-4f27-bb50-f7c01650da80-O19745 ] for framework 6215a35e-749e-4f27-bb50-f7c01650da80-0006 (MyriadAlpha) at scheduler-f83a7daa-15ec-402e-b004-18e88a9dc3b7@10.0.2.19:51076
I1017 17:04:02.160132 1604 master.cpp:3951] Processing DECLINE call for offers: [ 6215a35e-749e-4f27-bb50-f7c01650da80-O19746 ] for framework 6215a35e-749e-4f27-bb50-f7c01650da80-0006 (MyriadAlpha) at scheduler-f83a7daa-15ec-402e-b004-18e88a9dc3b7@10.0.2.19:51076
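To see at a glance how many offers each framework receives and declines, a log like the one above can be tallied with a short script. This is only a rough sketch: the two regular expressions are written against the `Sending N offers` and `Processing DECLINE call` messages exactly as they appear in this excerpt, and the function name `tally_offers` is made up for illustration. Other Mesos versions may word these messages differently.

```python
import re
from collections import defaultdict

# Patterns modelled on the two master.cpp messages in the excerpt above
# (assumption: the wording is the same throughout the log).
SENT_RE = re.compile(r"Sending (\d+) offers? to framework (\S+) \(([^)]*)\)")
DECLINE_RE = re.compile(
    r"Processing DECLINE call for offers: \[([^\]]*)\] for framework (\S+) \(([^)]*)\)"
)


def tally_offers(lines):
    """Return {framework_name: {"sent": n, "declined": m}} for the log lines."""
    stats = defaultdict(lambda: {"sent": 0, "declined": 0})
    for line in lines:
        # finditer also copes with several entries pasted onto one physical line.
        for m in SENT_RE.finditer(line):
            stats[m.group(3)]["sent"] += int(m.group(1))
        for m in DECLINE_RE.finditer(line):
            offer_ids = m.group(1).split()  # offer IDs listed between the brackets
            stats[m.group(3)]["declined"] += len(offer_ids)
    return dict(stats)


# Example usage (the file path is hypothetical):
# with open("/var/log/mesos/mesos-master.INFO") as fh:
#     for name, s in tally_offers(fh).items():
#         print(name, s["sent"], "sent,", s["declined"], "declined")
```

If the declined count matches the sent count for a framework, as it does for MyriadAlpha in the excerpt, that framework is turning down every offer it gets.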
@vathanlal
Do you see in the Mesos console that the NM tries to start and fails? The RM is usually very chatty about offers received. Did the RM even start properly? Can you see the RM UI (not just the Myriad UI)?
@yufeldman
No, the NM is not showing in the Mesos console. When I started the RM with
./yarn-daemon.sh start resourcemanager
the Myriad framework shows up in the Mesos console, and jps lists the ResourceManager on my command line. I can also reach the UI at http://10.0.2.19:8088, but no nodes are shown in the cluster. The cluster info in the UI looks like this:
`Cluster ID: 1476713515568
ResourceManager state: STARTED
ResourceManager HA state: active
ResourceManager HA zookeeper connection state: ResourceManager HA is not enabled.
ResourceManager RMStateStore: org.apache.hadoop.yarn.server.resourcemanager.recovery.NullRMStateStore
ResourceManager started on: Mon Oct 17 16:11:55 +0200 2016
ResourceManager version: 2.7.2 from b165c4fe8a74265c792ce23f546c64604acf0e41 by jenkins source checksum c63f7cc71b8f63249e35126f0f7492d on 2016-01-26T00:16Z
Hadoop version: 2.7.2 from b165c4fe8a74265c792ce23f546c64604acf0e41 by jenkins source checksum d0fda26633fa762bff87ec759ebe689c on 2016-01-26T00:08Z `
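The empty node list can also be checked without the UI, through the ResourceManager's Cluster Nodes REST endpoint (`/ws/v1/cluster/nodes`). The sketch below assumes the RM address from this thread and the documented JSON layout (`nodes` → `node` → entries with `nodeHostName` and `state`); the helper names `registered_nodes` and `fetch_nodes` are hypothetical. The shape of the response when no node is registered is an assumption, so the parser treats a missing or null `nodes` value as an empty list.

```python
import json
from urllib.request import Request, urlopen


def registered_nodes(payload):
    """Extract (host, state) pairs from a /ws/v1/cluster/nodes JSON payload."""
    data = json.loads(payload)
    # Assumption: with no NodeManagers registered, "nodes" may be null or absent.
    nodes = (data.get("nodes") or {}).get("node") or []
    return [(n.get("nodeHostName"), n.get("state")) for n in nodes]


def fetch_nodes(rm_url):
    """Query the RM REST API and return the registered (host, state) pairs."""
    req = Request(
        rm_url.rstrip("/") + "/ws/v1/cluster/nodes",
        headers={"Accept": "application/json"},  # ask for JSON explicitly
    )
    with urlopen(req, timeout=10) as resp:
        return registered_nodes(resp.read().decode("utf-8"))


# Example usage against the RM from this thread:
# print(fetch_nodes("http://10.0.2.19:8088"))
```

An empty result here would confirm that no NodeManager has registered with the RM, independent of what the Mesos console shows.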
@vathanlal Since you are starting the RM manually here, I expect its logs to be in the standard YARN logs directory, both .log and .out. Can you look through them?
@yufeldman
Yes, I have those two files in my YARN logs directory. I am getting the following exception in my yarn-root-resourcemanager-mesos.out file:
`INFO: Couldn't find JAX-B element for class org.apache.myriad.api.model.FlexDownClusterRequest
Oct 17, 2016 6:12:30 PM com.sun.jersey.server.wadl.generators.WadlGeneratorJAXBGrammarGenerator$8 resolve
SEVERE: null
java.lang.IllegalAccessException: Class com.sun.jersey.server.wadl.generators.WadlGeneratorJAXBGrammarGenerator$8 can not access a member of class javax.ws.rs.core.Response with modifiers "protected"
    at sun.reflect.Reflection.ensureMemberAccess(Reflection.java:102)
    at java.lang.Class.newInstance(Class.java:436)
    at com.sun.jersey.server.wadl.generators.WadlGeneratorJAXBGrammarGenerator$8.resolve(WadlGeneratorJAXBGrammarGenerator.java:467)
    at com.sun.jersey.server.wadl.WadlGenerator$ExternalGrammarDefinition.resolve(WadlGenerator.java:181)
    at com.sun.jersey.server.wadl.ApplicationDescription.resolve(ApplicationDescription.java:81)
    at com.sun.jersey.server.wadl.generators.WadlGeneratorJAXBGrammarGenerator.attachTypes(WadlGeneratorJAXBGrammarGenerator.java:518)
    at com.sun.jersey.server.wadl.WadlBuilder.generate(WadlBuilder.java:124)
    at com.sun.jersey.server.impl.wadl.WadlApplicationContextImpl.getApplication(WadlApplicationContextImpl.java:104)
    at com.sun.jersey.server.impl.wadl.WadlResource.getWadl(WadlResource.java:89)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at com.sun.jersey.spi.container.JavaMethodInvokerFactory$1.invoke(JavaMethodInvokerFactory.java:60)
    at com.sun.jersey.server.impl.model.method.dispatch.AbstractResourceMethodDispatchProvider$ResponseOutInvoker._dispatch(AbstractResourceMethodDispatchProvider.java:205)
    at com.sun.jersey.server.impl.model.method.dispatch.ResourceJavaMethodDispatcher.dispatch(ResourceJavaMethodDispatcher.java:75)
    at com.sun.jersey.server.impl.uri.rules.HttpMethodRule.accept(HttpMethodRule.java:288)
    at com.sun.jersey.server.impl.uri.rules.ResourceClassRule.accept(ResourceClassRule.java:108)
    at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
    at com.sun.jersey.server.impl.uri.rules.RootResourceClassesRule.accept(RootResourceClassesRule.java:84)
    at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1469)
    at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1400)
    at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1349)
    at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1339)
    at com.sun.jersey.spi.container.servlet.WebComponent.service(WebComponent.java:416)
    at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:537)
    at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:699)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
    at com.google.inject.servlet.ServletDefinition.doService(ServletDefinition.java:263)
    at com.google.inject.servlet.ServletDefinition.service(ServletDefinition.java:178)
    at com.google.inject.servlet.ManagedServletPipeline.service(ManagedServletPipeline.java:91)
    at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:62)
    at com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:118)
    at com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:113)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
    at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
    at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
    at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
    at org.mortbay.jetty.Server.handle(Server.java:326)
    at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
    at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
    at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
    at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
    at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
    at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)
    at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
Oct 17, 2016 6:12:30 PM com.sun.jersey.server.wadl.generators.WadlGeneratorJAXBGrammarGenerator attachTypes
INFO: Couldn't find JAX-B element for class javax.ws.rs.core.Response
Oct 17, 2016 6:12:30 PM com.sun.jersey.server.wadl.generators.WadlGeneratorJAXBGrammarGenerator attachTypes
INFO: Couldn't find JAX-B element for class org.apache.myriad.api.model.FlexDownServiceRequest
Oct 17, 2016 6:12:30 PM com.sun.jersey.server.wadl.generators.WadlGeneratorJAXBGrammarGenerator$8 resolve
SEVERE: null
java.lang.IllegalAccessException: Class com.sun.jersey.server.wadl.generators.WadlGeneratorJAXBGrammarGenerator$8 can not access a member of class javax.ws.rs.core.Response with modifiers "protected"
    at sun.reflect.Reflection.ensureMemberAccess(Reflection.java:102)
    at java.lang.Class.newInstance(Class.java:436)
    at com.sun.jersey.server.wadl.generators.WadlGeneratorJAXBGrammarGenerator$8.resolve(WadlGeneratorJAXBGrammarGenerator.java:467)
    at com.sun.jersey.server.wadl.WadlGenerator$ExternalGrammarDefinition.resolve(WadlGenerator.java:181)
    at com.sun.jersey.server.wadl.ApplicationDescription.resolve(ApplicationDescription.java:81)
    at com.sun.jersey.server.wadl.generators.WadlGeneratorJAXBGrammarGenerator.attachTypes(WadlGeneratorJAXBGrammarGenerator.java:518)
    at com.sun.jersey.server.wadl.WadlBuilder.generate(WadlBuilder.java:124)
    at com.sun.jersey.server.impl.wadl.WadlApplicationContextImpl.getApplication(WadlApplicationContextImpl.java:104)
    at com.sun.jersey.server.impl.wadl.WadlResource.getWadl(WadlResource.java:89)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at com.sun.jersey.spi.container.JavaMethodInvokerFactory$1.invoke(JavaMethodInvokerFactory.java:60)
    at com.sun.jersey.server.impl.model.method.dispatch.AbstractResourceMethodDispatchProvider$ResponseOutInvoker._dispatch(AbstractResourceMethodDispatchProvider.java:205)
    at com.sun.jersey.server.impl.model.method.dispatch.ResourceJavaMethodDispatcher.dispatch(ResourceJavaMethodDispatcher.java:75)
    at com.sun.jersey.server.impl.uri.rules.HttpMethodRule.accept(HttpMethodRule.java:288)
    at com.sun.jersey.server.impl.uri.rules.ResourceClassRule.accept(ResourceClassRule.java:108)
    at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
    at com.sun.jersey.server.impl.uri.rules.RootResourceClassesRule.accept(RootResourceClassesRule.java:84)
    at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1469)
    at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1400)
    at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1349)
    at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1339)
    at com.sun.jersey.spi.container.servlet.WebComponent.service(WebComponent.java:416)
    at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:537)
    at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:699)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
    at com.google.inject.servlet.ServletDefinition.doService(ServletDefinition.java:263)
    at com.google.inject.servlet.ServletDefinition.service(ServletDefinition.java:178)
    at com.google.inject.servlet.ManagedServletPipeline.service(ManagedServletPipeline.java:91)
    at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:62)
    at com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:118)
    at com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:113)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
    at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
    at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
    at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
    at org.mortbay.jetty.Server.handle(Server.java:326)
    at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
    at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
    at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
    at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
    at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
    at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)
    at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)`
@yufeldman
In my .log file I am getting the warning "2016-10-17 18:11:23,202 WARN org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService: fair-scheduler.xml not found on the classpath."
@vathanlal It is very strange that you don't have anything else in .log, not even INFO messages. Or do you think those are not relevant? Could you post the content of the .log file?
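For a quick overview before pasting a whole log, the entries can be counted by level and the WARN and above lines pulled out. The following is a rough sketch that assumes the `timestamp LEVEL logger: message` layout of the warning quoted earlier in this thread; the function name `summarize` and the file path in the usage comment are hypothetical, and lines that do not fit the layout (stack frames, startup banners) are skipped.

```python
import re
from collections import Counter

# Layout assumed from the quoted warning:
# "2016-10-17 18:11:23,202 WARN some.logger.Name: message"
LOG_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"(TRACE|DEBUG|INFO|WARN|ERROR|FATAL) ([\w.$]+): (.*)$"
)


def summarize(lines, keep=("WARN", "ERROR", "FATAL")):
    """Return (counts per level, list of entries whose level is in `keep`)."""
    counts = Counter()
    kept = []
    for line in lines:
        m = LOG_RE.match(line.strip())
        if not m:
            continue  # continuation lines such as stack frames
        ts, level, logger, msg = m.groups()
        counts[level] += 1
        if level in keep:
            kept.append((ts, level, logger, msg))
    return counts, kept


# Example usage (path is hypothetical):
# with open("logs/yarn-root-resourcemanager-mesos.log") as fh:
#     counts, problems = summarize(fh)
# print(dict(counts))
# for ts, level, logger, msg in problems:
#     print(ts, level, logger.rsplit(".", 1)[-1], msg)
```

A log with many INFO entries but only one WARN would suggest that the RM itself came up normally and that the declined offers need to be investigated elsewhere.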
@yufeldman Sorry, I actually posted only the WARNING here. My log file is as follows:
`STARTUP_MSG: build = https://git-wip-us.apache.org/repos/asf/hadoop.git -r b165c4fe8a74265c792ce23f546c64604acf0e41; compiled by 'jenkins' on 2016-01-26T00:08Z
STARTUP_MSG: java = 1.8.0_72 ****/
2016-10-17 18:11:18,549 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: registered UNIX signal handlers for [TERM, HUP, INT]
2016-10-17 18:11:19,614 INFO org.apache.hadoop.conf.Configuration: found resource core-site.xml at file:/usr/local/hadoop/etc/hadoop/core-site.xml
2016-10-17 18:11:19,889 INFO org.apache.hadoop.security.Groups: clearing userToGroupsMap cache
2016-10-17 18:11:20,236 INFO org.apache.hadoop.conf.Configuration: found resource yarn-site.xml at file:/usr/local/hadoop/etc/hadoop/yarn-site.xml
2016-10-17 18:11:21,447 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.yarn.server.resourcemanager.RMFatalEventType for class org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMFatalEventDispatcher
2016-10-17 18:11:22,312 INFO org.apache.hadoop.yarn.server.resourcemanager.security.NMTokenSecretManagerInRM: NMTokenKeyRollingInterval: 86400000ms and NMTokenKeyActivationDelay: 900000ms
2016-10-17 18:11:22,337 INFO org.apache.hadoop.yarn.server.resourcemanager.security.RMContainerTokenSecretManager: ContainerTokenKeyRollingInterval: 86400000ms and ContainerTokenKeyActivationDelay: 900000ms
2016-10-17 18:11:22,352 INFO org.apache.hadoop.yarn.server.resourcemanager.security.AMRMTokenSecretManager: AMRMTokenKeyRollingInterval: 86400000ms and AMRMTokenKeyActivationDelay: 900000 ms
2016-10-17 18:11:22,474 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStoreEventType for class org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$ForwardingEventHandler
2016-10-17 18:11:22,477 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.yarn.server.resourcemanager.NodesListManagerEventType for class org.apache.hadoop.yarn.server.resourcemanager.NodesListManager
2016-10-17 18:11:22,492 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Using Scheduler: org.apache.myriad.scheduler.yarn.MyriadFairScheduler
2016-10-17 18:11:22,562 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeEventType for class org.apache.myriad.scheduler.yarn.RMNodeEventHandler
2016-10-17 18:11:22,565 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.yarn.server.resourcemanager.scheduler.event.SchedulerEventType for class org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher
2016-10-17 18:11:22,566 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppEventType for class org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationEventDispatcher
2016-10-17 18:11:22,566 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptEventType for class org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher
2016-10-17 18:11:22,567 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeEventType for class org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$NodeEventDispatcher
2016-10-17 18:11:22,727 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
2016-10-17 18:11:23,021 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
2016-10-17 18:11:23,021 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: ResourceManager metrics system started
2016-10-17 18:11:23,068 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.yarn.server.resourcemanager.RMAppManagerEventType for class org.apache.hadoop.yarn.server.resourcemanager.RMAppManager
2016-10-17 18:11:23,081 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncherEventType for class org.apache.hadoop.yarn.server.resourcemanager.amlauncher.ApplicationMasterLauncher
2016-10-17 18:11:23,086 INFO org.apache.hadoop.yarn.server.resourcemanager.RMNMInfo: Registered RMNMInfo MBean
2016-10-17 18:11:23,097 INFO org.apache.hadoop.yarn.security.YarnAuthorizationProvider: org.apache.hadoop.yarn.security.ConfiguredYarnAuthorizer is instiantiated.
2016-10-17 18:11:23,099 INFO org.apache.hadoop.util.HostsFileReader: Refreshing hosts (include/exclude) list
2016-10-17 18:11:23,202 WARN org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService: fair-scheduler.xml not found on the classpath.
2016-10-17 18:11:23,244 INFO org.apache.hadoop.yarn.server.resourcemanager.metrics.SystemMetricsPublisher: YARN system metrics publishing service is not enabled
2016-10-17 18:11:23,244 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Transitioning to active state
2016-10-17 18:11:23,296 INFO org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: Updating AMRMToken
2016-10-17 18:11:23,297 INFO org.apache.hadoop.yarn.server.resourcemanager.security.RMContainerTokenSecretManager: Rolling master-key for container-tokens
2016-10-17 18:11:23,297 INFO org.apache.hadoop.yarn.server.resourcemanager.security.NMTokenSecretManagerInRM: Rolling master-key for nm-tokens
2016-10-17 18:11:23,297 INFO org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager: Updating the current master key for generating delegation tokens
2016-10-17 18:11:23,298 INFO org.apache.hadoop.yarn.server.resourcemanager.security.RMDelegationTokenSecretManager: storing master key with keyID 1
2016-10-17 18:11:23,299 INFO org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: Storing RMDTMasterKey.
2016-10-17 18:11:23,318 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.yarn.nodelabels.event.NodeLabelsStoreEventType for class org.apache.hadoop.yarn.nodelabels.CommonNodeLabelsManager$ForwardingEventHandler
2016-10-17 18:11:23,301 INFO org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager: Starting expired delegation token remover thread, tokenRemoverScanInterval=60 min(s)
2016-10-17 18:11:23,324 INFO org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager: Updating the current master key for generating delegation tokens
2016-10-17 18:11:23,324 INFO org.apache.hadoop.yarn.server.resourcemanager.security.RMDelegationTokenSecretManager: storing master key with keyID 2
2016-10-17 18:11:23,325 INFO org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: Storing RMDTMasterKey.
2016-10-17 18:11:26,516 INFO org.apache.myriad.scheduler.yarn.interceptor.CompositeInterceptor: Registered org.apache.myriad.scheduler.fgs.YarnNodeCapacityManager into the registry.
2016-10-17 18:11:26,516 INFO org.apache.myriad.scheduler.yarn.interceptor.CompositeInterceptor: Registered org.apache.myriad.scheduler.fgs.NMHeartBeatHandler into the registry.
2016-10-17 18:11:26,564 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
2016-10-17 18:11:26,650 INFO org.mortbay.log: jetty-6.1.26
2016-10-17 18:11:29,648 INFO org.mortbay.log: Started SelectChannelConnector@0.0.0.0:8192
2016-10-17 18:11:29,649 INFO org.apache.myriad.Main: Initializing HealthChecks
2016-10-17 18:11:29,703 INFO org.apache.myriad.Main: Initializing Profiles
2016-10-17 18:11:29,710 INFO org.apache.myriad.scheduler.ServiceProfileManager: Adding profile zero with CPU: 0.0 and Memory: 0.0
2016-10-17 18:11:29,710 INFO org.apache.myriad.scheduler.ServiceProfileManager: Adding profile small with CPU: 1.0 and Memory: 256.0
2016-10-17 18:11:29,710 INFO org.apache.myriad.scheduler.ServiceProfileManager: Adding profile medium with CPU: 1.0 and Memory: 256.0
2016-10-17 18:11:29,710 INFO org.apache.myriad.scheduler.ServiceProfileManager: Adding profile large with CPU: 10.0 and Memory: 12288.0
2016-10-17 18:11:29,710 INFO org.apache.myriad.Main: Validating nmInstances..
2016-10-17 18:11:29,710 INFO org.apache.myriad.Main: Initializing initServiceConfigurations
2016-10-17 18:11:29,710 INFO org.apache.myriad.Main: Initializing Disruptors
2016-10-17 18:11:29,886 INFO org.apache.myriad.Main: Rebalancer is not turned on
2016-10-17 18:11:29,887 INFO org.apache.myriad.Main: Initializing Terminator
2016-10-17 18:11:29,902 INFO org.apache.myriad.Main: starting mesosDriver..
2016-10-17 18:11:29,902 INFO org.apache.myriad.scheduler.MyriadDriverManager: Starting driver...
2016-10-17 18:11:29,902 INFO org.apache.myriad.scheduler.MyriadDriver: Starting driver
2016-10-17 18:11:29,908 INFO org.apache.myriad.scheduler.MyriadDriver: Driver started with status: DRIVER_RUNNING
2016-10-17 18:11:29,909 INFO org.apache.myriad.scheduler.MyriadDriverManager: Driver started with status: DRIVER_RUNNING
2016-10-17 18:11:29,909 INFO org.apache.myriad.Main: started mesosDriver..
2016-10-17 18:11:29,909 INFO org.apache.myriad.scheduler.yarn.interceptor.CompositeInterceptor: Registered org.apache.myriad.policy.LeastAMNodesFirstPolicy into the registry.
2016-10-17 18:11:29,927 INFO org.apache.myriad.Main: Launching 1 NM(s) with profile medium
2016-10-17 18:11:29,928 INFO org.apache.myriad.scheduler.MyriadOperations: Adding 1 NM instances to cluster
2016-10-17 18:11:30,499 INFO org.apache.myriad.scheduler.event.handlers.RegisteredEventHandler: Received event: org.apache.myriad.scheduler.event.RegisteredEvent@69aba99c with frameworkId: value: "6215a35e-749e-4f27-bb50-f7c01650da80-0007"
2016-10-17 18:11:30,500 INFO org.apache.myriad.state.SchedulerState: Marked taskId nm.medium.36a17234-3818-4d8e-840e-304014eda3d2 pending, size of pending queue for nm is: 0
2016-10-17 18:11:30,501 INFO org.apache.myriad.scheduler.yarn.interceptor.MyriadInitializationInterceptor: Initialized myriad.
2016-10-17 18:11:30,686 INFO org.apache.hadoop.ipc.CallQueueManager: Using callQueue class java.util.concurrent.LinkedBlockingQueue
2016-10-17 18:11:30,734 INFO org.apache.hadoop.ipc.Server: Starting Socket Reader #1 for port 8031
2016-10-17 18:11:30,776 INFO org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl: Adding protocol org.apache.hadoop.yarn.server.api.ResourceTrackerPB to the server
2016-10-17 18:11:30,792 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
2016-10-17 18:11:30,794 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 8031: starting
2016-10-17 18:11:30,999 INFO org.apache.hadoop.ipc.CallQueueManager: Using callQueue class java.util.concurrent.LinkedBlockingQueue
2016-10-17 18:11:31,018 INFO org.apache.hadoop.ipc.Server: Starting Socket Reader #1 for port 8030
2016-10-17 18:11:31,064 INFO org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl: Adding protocol org.apache.hadoop.yarn.api.ApplicationMasterProtocolPB to the server
2016-10-17 18:11:31,064 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
2016-10-17 18:11:31,064 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 8030: starting
2016-10-17 18:11:31,428 INFO org.apache.hadoop.ipc.CallQueueManager: Using callQueue class java.util.concurrent.LinkedBlockingQueue
2016-10-17 18:11:31,442 INFO org.apache.hadoop.ipc.Server: Starting Socket Reader #1 for port 8032
2016-10-17 18:11:31,454 INFO org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl: Adding protocol org.apache.hadoop.yarn.api.ApplicationClientProtocolPB to the server
2016-10-17 18:11:31,471 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 8032: starting
2016-10-17 18:11:31,595 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Transitioned to active state
2016-10-17 18:11:31,595 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
2016-10-17 18:11:31,774 INFO org.apache.hadoop.security.authentication.server.AuthenticationFilter: Unable to initialize FileSignerSecretProvider, falling back to use random secrets.
2016-10-17 18:11:31,792 INFO org.apache.hadoop.http.HttpRequestLog: Http request log for http.requests.resourcemanager is not defined
2016-10-17 18:11:31,795 INFO org.apache.hadoop.http.HttpServer2: Added global filter 'safety' (class=org.apache.hadoop.http.HttpServer2$QuotingInputFilter)
2016-10-17 18:11:31,827 INFO org.apache.hadoop.http.HttpServer2: Added filter RMAuthenticationFilter (class=org.apache.hadoop.yarn.server.security.http.RMAuthenticationFilter) to context cluster
2016-10-17 18:11:31,827 INFO org.apache.hadoop.http.HttpServer2: Added filter RMAuthenticationFilter (class=org.apache.hadoop.yarn.server.security.http.RMAuthenticationFilter) to context logs
2016-10-17 18:11:31,831 INFO org.apache.hadoop.http.HttpServer2: Added filter RMAuthenticationFilter (class=org.apache.hadoop.yarn.server.security.http.RMAuthenticationFilter) to context static
2016-10-17 18:11:31,832 INFO org.apache.hadoop.http.HttpServer2: Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context cluster
2016-10-17 18:11:31,832 INFO org.apache.hadoop.http.HttpServer2: Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context logs
2016-10-17 18:11:31,832 INFO org.apache.hadoop.http.HttpServer2: Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context static
2016-10-17 18:11:31,848 INFO org.apache.hadoop.http.HttpServer2: adding path spec: /cluster/*
2016-10-17 18:11:31,852 INFO org.apache.hadoop.http.HttpServer2: adding path spec: /ws/*
2016-10-17 18:11:32,053 INFO org.apache.hadoop.yarn.webapp.WebApps: Registered webapp guice modules
2016-10-17 18:11:32,055 INFO org.apache.hadoop.http.HttpServer2: Jetty bound to port 8088
2016-10-17 18:11:32,055 INFO org.mortbay.log: jetty-6.1.26
2016-10-17 18:11:32,098 INFO org.mortbay.log: Extract jar:file:/usr/local/hadoop/share/hadoop/yarn/hadoop-yarn-common-2.7.2.jar!/webapps/cluster to /tmp/Jetty_0_0_0_0_8088_cluster____u0rgz3/webapp
2016-10-17 18:11:32,699 INFO org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager: Updating the current master key for generating delegation tokens
2016-10-17 18:11:32,714 INFO org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager: Starting expired delegation token remover thread, tokenRemoverScanInterval=60 min(s)
2016-10-17 18:11:32,715 INFO org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager: Updating the current master key for generating delegation tokens
2016-10-17 18:11:34,352 INFO org.mortbay.log: Started HttpServer2$SelectChannelConnectorWithSafeStartup@0.0.0.0:8088
2016-10-17 18:11:34,352 INFO org.apache.hadoop.yarn.webapp.WebApps: Web app cluster started at 8088
2016-10-17 18:11:34,425 INFO org.apache.hadoop.ipc.CallQueueManager: Using callQueue class java.util.concurrent.LinkedBlockingQueue
2016-10-17 18:11:34,432 INFO org.apache.hadoop.ipc.Server: Starting Socket Reader #1 for port 8033
2016-10-17 18:11:34,433 INFO org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl: Adding protocol org.apache.hadoop.yarn.server.api.ResourceManagerAdministrationProtocolPB to the server
2016-10-17 18:11:34,435 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
2016-10-17 18:11:34,435 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 8033: starting
2016-10-17 18:21:23,205 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler: Release request cache is cleaned up
2016-10-17 18:40:30,231 ERROR org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: RECEIVED SIGNAL 15: SIGTERM
2016-10-17 18:40:30,349 ERROR org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager: ExpiredTokenRemover received java.lang.InterruptedException: sleep interrupted
2016-10-17 18:40:30,359 INFO org.mortbay.log: Stopped HttpServer2$SelectChannelConnectorWithSafeStartup@0.0.0.0:8088
2016-10-17 18:40:30,360 INFO org.apache.hadoop.ipc.Server: Stopping server on 8032
2016-10-17 18:40:30,365 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server listener on 8032
2016-10-17 18:40:30,369 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server Responder
2016-10-17 18:40:30,369 INFO org.apache.hadoop.ipc.Server: Stopping server on 8033
2016-10-17 18:40:30,370 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server listener on 8033
2016-10-17 18:40:30,371 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Transitioning to standby state
2016-10-17 18:40:30,371 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server Responder
2016-10-17 18:40:30,372 WARN org.apache.hadoop.yarn.server.resourcemanager.amlauncher.ApplicationMasterLauncher: org.apache.hadoop.yarn.server.resourcemanager.amlauncher.ApplicationMasterLauncher$LauncherThread interrupted. Returning.
2016-10-17 18:40:30,372 INFO org.apache.hadoop.ipc.Server: Stopping server on 8030
2016-10-17 18:40:30,377 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server listener on 8030
2016-10-17 18:40:30,381 INFO org.apache.hadoop.ipc.Server: Stopping server on 8031
2016-10-17 18:40:30,382 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server Responder
2016-10-17 18:40:30,413 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server listener on 8031
2016-10-17 18:40:30,414 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server Responder
2016-10-17 18:40:30,414 INFO org.apache.hadoop.yarn.util.AbstractLivelinessMonitor: NMLivelinessMonitor thread interrupted
2016-10-17 18:40:30,414 ERROR org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Returning, interrupted : java.lang.InterruptedException
2016-10-17 18:40:30,415 WARN org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Update thread interrupted. Exiting.
2016-10-17 18:40:30,415 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: AsyncDispatcher is draining to stop, igonring any new events.
2016-10-17 18:40:30,415 ERROR org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager: ExpiredTokenRemover received java.lang.InterruptedException: sleep interrupted
2016-10-17 18:40:30,415 INFO org.apache.hadoop.yarn.util.AbstractLivelinessMonitor: AMLivelinessMonitor thread interrupted
2016-10-17 18:40:30,415 INFO org.apache.hadoop.yarn.util.AbstractLivelinessMonitor: AMLivelinessMonitor thread interrupted
2016-10-17 18:40:30,415 INFO org.apache.hadoop.yarn.util.AbstractLivelinessMonitor: org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.ContainerAllocationExpirer thread interrupted
2016-10-17 18:40:30,416 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Stopping ResourceManager metrics system...
2016-10-17 18:40:30,417 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: ResourceManager metrics system stopped.
2016-10-17 18:40:30,417 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: ResourceManager metrics system shutdown complete.
2016-10-17 18:40:30,417 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: AsyncDispatcher is draining to stop, igonring any new events.
2016-10-17 18:40:30,417 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Transitioned to standby state
2016-10-17 18:40:30,418 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: SHUTDOWN_MSG: /`
@yufeldman I tried changing the configuration in myriad-config-default.yml, but the NM is still stuck in the same pending state. Looking at my yarn-root-resourcemanager-mesos.log, I can't tell what is causing Myriad to decline the resources, as seen in the Mesos master log. The only warning in the log is the one related to FairScheduler mentioned above. Regarding the pending state of the NM, this is the only information I get in the log:
`2016-10-17 18:11:29,927 INFO org.apache.myriad.Main: Launching 1 NM(s) with profile medium
2016-10-17 18:11:29,928 INFO org.apache.myriad.scheduler.MyriadOperations: Adding 1 NM instances to cluster
2016-10-17 18:11:30,499 INFO org.apache.myriad.scheduler.event.handlers.RegisteredEventHandler: Received event: org.apache.myriad.scheduler.event.RegisteredEvent@69aba99 with frameworkId: value: "6215a35e-749e-4f27-bb50-f7c01650da80-0007"
2016-10-17 18:11:30,500 INFO org.apache.myriad.state.SchedulerState: Marked taskId nm.medium.36a17234-3818-4d8e-840e-304014eda3d2 pending, size of pending queue for nm is: 0
2016-10-17 18:11:30,501 INFO org.apache.myriad.scheduler.yarn.interceptor.MyriadInitializationInterceptor: Initialized myriad.
2016-10-17 18:11:30,686 INFO org.apache.hadoop.ipc.CallQueueManager: Using callQueue class java.util.concurrent.LinkedBlockingQueue`
My myriad-config-default.yml is as shown below
`
mesosMaster: 10.0.2.19:5050
checkpoint: false
frameworkFailoverTimeout: 0
frameworkName: MyriadAlpha
frameworkRole: "*"
frameworkUser: root # User the Node Manager runs as, required if nodeManagerURI set, otherwise defaults to the user
frameworkSuperUser: root # To be deprecated, currently permissions need set by a superuser due to Mesos-1790. Must be
nativeLibrary: /usr/local/lib/libmesos.so
zkServers: 10.0.2.19:2181
zkTimeout: 20000
restApiPort: 8192
profiles:
  zero: # NMs launched with this profile dynamically obtain cpu/mem from Mesos
    cpu: 0
    mem: 0
  small:
    cpu: 1
    mem: 256
  medium:
    cpu: 1
    mem: 256
  large:
    cpu: 10
    mem: 12288
nmInstances: # NMs to start with. Requires at least 1 NM with a non-zero profile.
  medium: 1 #
yarnEnvironment:
  YARN_HOME: /usr/local/hadoop
`
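The comment on nmInstances ("Requires at least 1 NM with a non-zero profile") can be checked mechanically. Below is a minimal Python sketch of such a check, assuming the config has already been loaded into a plain dict. The function name `check_nm_instances` is hypothetical, and treating "non-zero" as both cpu and mem greater than 0 is an assumption; none of this is part of Myriad.

```python
def check_nm_instances(config):
    """Return a list of problems found in nmInstances (empty list if none)."""
    profiles = config.get("profiles", {})
    instances = config.get("nmInstances", {})
    problems = []

    # Every profile named under nmInstances should be defined under profiles.
    for name in instances:
        if name not in profiles:
            problems.append(f"nmInstances refers to unknown profile '{name}'")

    # Assumption: "non-zero profile" means both cpu and mem are above 0.
    non_zero = [
        name
        for name, count in instances.items()
        if count > 0
        and name in profiles
        and profiles[name].get("cpu", 0) > 0
        and profiles[name].get("mem", 0) > 0
    ]
    if not non_zero:
        problems.append("nmInstances needs at least one NM with a non-zero profile")
    return problems


# The relevant part of the config posted in this thread.
config = {
    "profiles": {
        "zero": {"cpu": 0, "mem": 0},
        "small": {"cpu": 1, "mem": 256},
        "medium": {"cpu": 1, "mem": 256},
        "large": {"cpu": 10, "mem": 12288},
    },
    "nmInstances": {"medium": 1},
}
print(check_nm_instances(config))
```

By this check the posted config is consistent, so the pending NM is more likely a matter of what the offers contain than of the profile definitions.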
Could you enable DEBUG at least for org.apache.myriad.scheduler package? Either offers don't go through or something else. You can add it to log4j.properties in etc/hadoop/
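For reference, in log4j 1.x syntax (which Hadoop 2.7 uses) this amounts to the single line `log4j.logger.org.apache.myriad.scheduler=DEBUG`. Here is a small Python sketch that appends it only if it is not already there. The helper name `enable_debug_logger` and the file path in the usage comment are illustrative assumptions, not something shipped with Myriad or Hadoop.

```python
from pathlib import Path


def enable_debug_logger(properties_path, package="org.apache.myriad.scheduler"):
    """Append a log4j 1.x DEBUG logger line for `package` unless it is already present.

    Returns True if the file was modified, False if the line already existed.
    """
    path = Path(properties_path)
    line = f"log4j.logger.{package}=DEBUG"
    text = path.read_text() if path.exists() else ""

    if line in (existing.strip() for existing in text.splitlines()):
        return False

    # Start on a fresh line if the file does not end with a newline.
    prefix = "" if text == "" or text.endswith("\n") else "\n"
    with path.open("a") as handle:
        handle.write(f"{prefix}{line}\n")
    return True


# Hypothetical usage; adjust the path to your Hadoop install:
# enable_debug_logger("/usr/local/hadoop/etc/hadoop/log4j.properties")
```

The ResourceManager needs a restart to pick up the change.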
This is what I am getting after enabling DEBUG for org.apache.myriad.scheduler:
`2016-10-19 10:45:45,521 INFO org.apache.myriad.api.ClustersResource: Received flexup request. Profile: zero, Instances: 1, Constraints: null
2016-10-19 10:45:45,525 INFO org.apache.myriad.scheduler.MyriadOperations: Adding 1 NM instances to cluster
2016-10-19 10:45:45,525 INFO org.apache.myriad.state.SchedulerState: Marked taskId nm.zero.063a54db-f00e-47dc-8551-159095e29872 pending, size of pending queue for nm is: 0
2016-10-19 10:45:49,642 DEBUG org.apache.myriad.scheduler.event.handlers.ResourceOffersEventHandler: Received offers 2
2016-10-19 10:45:49,642 DEBUG org.apache.myriad.scheduler.event.handlers.ResourceOffersEventHandler: Pending tasks: [value: "nm.zero.063a54db-f00e-47dc-8551-159095e29872" ]
2016-10-19 10:45:49,643 DEBUG org.apache.myriad.scheduler.SchedulerUtils: Offer's hostname hadoop1 is unique
2016-10-19 10:45:49,643 DEBUG org.apache.myriad.scheduler.SchedulerUtils: Offer's hostname mesos is unique
2016-10-19 10:45:49,643 DEBUG org.apache.myriad.scheduler.event.handlers.ResourceOffersEventHandler: Declining offer id { value: "ecdb076c-1cd6-4560-8a0c-1ec04a04ffef-O3235" } framework_id { value: "ecdb076c-1cd6-4560-8a0c-1ec04a04ffef-0002" } slave_id { value: "ecdb076c-1cd6-4560-8a0c-1ec04a04ffef-S0" } hostname: "hadoop1" resources { name: "cpus" type: SCALAR scalar { value: 1.0 } role: "*" } resources { name: "mem" type: SCALAR scalar { value: 1000.0 } role: "*" } resources { name: "disk" type: SCALAR scalar { value: 9091.0 } role: "*" } resources { name: "ports" type: RANGES ranges { range { begin: 31000 end: 32000 } } role: "*" } url { scheme: "http" address { hostname: "hadoop1" ip: "10.0.2.24" port: 5051 } path: "/slave(1)" } from slave hadoop1.
2016-10-19 10:45:49,645 DEBUG org.apache.myriad.scheduler.event.handlers.ResourceOffersEventHandler: Declining offer id { value: "ecdb076c-1cd6-4560-8a0c-1ec04a04ffef-O3236" } framework_id { value: "ecdb076c-1cd6-4560-8a0c-1ec04a04ffef-0002" } slave_id { value: "ecdb076c-1cd6-4560-8a0c-1ec04a04ffef-S1" } hostname: "mesos" resources { name: "ports" type: RANGES ranges { range { begin: 31000 end: 31122 } range { begin: 31124 end: 32000 } } role: "*" } resources { name: "mem" type: SCALAR scalar { value: 488.0 } role: "*" } resources { name: "disk" type: SCALAR scalar { value: 8491.0 } role: "*" } url { scheme: "http" address { hostname: "mesos" ip: "10.0.2.19" port: 5051 } path: "/slave(1)" } from slave mesos. `
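The "Declining offer" lines are hard to read by eye, so here is a rough Python sketch that pulls the scalar resources (cpus, mem, disk) out of each declined offer. It only assumes the text layout visible in the DEBUG output above; the function name `parse_declined_offers` and the regular expressions are made up for this illustration and are not a Myriad or Mesos API.

```python
import re

# Matches e.g.: resources { name: "mem" type: SCALAR scalar { value: 1000.0 }
SCALAR_RE = re.compile(
    r'resources \{ name: "(?P<name>\w+)" type: SCALAR scalar \{ value: (?P<value>[\d.]+) \}'
)
# The offer-level hostname is the one directly followed by the resources list.
HOST_RE = re.compile(r'hostname: "(?P<host>[^"]+)" resources')


def parse_declined_offers(log_text):
    """Return {hostname: {resource_name: value}} for each 'Declining offer' entry."""
    offers = {}
    for chunk in log_text.split("Declining offer")[1:]:
        host = HOST_RE.search(chunk)
        if host is None:
            continue
        offers[host.group("host")] = {
            m.group("name"): float(m.group("value"))
            for m in SCALAR_RE.finditer(chunk)
        }
    return offers
```

Run against the two entries above, this reports cpus 1.0, mem 1000.0 and disk 9091.0 for hadoop1, and only mem 488.0 and disk 8491.0 for mesos. The offer from the mesos host carries no cpus at all.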
I suspect you are not on Myriad 0.2 but on master. In any case, you may need to do remote debugging to see why the offers are declined. I am testing on master now as well and have hit a few issues.
I am also getting this in the DEBUG output:
2016-10-19 17:26:22,622 DEBUG org.apache.myriad.state.SchedulerState: Could not update state to state store as HA is disabled
I don't know whether this is causing the problem. There is nothing in the YARN logs, and nothing in the Mesos logs apart from the declined offers.
My two nodes are registered in Mesos, each with 1 core and 1000 MB of RAM. I don't know whether Myriad requires a minimum of 1024 MB. I set the small and medium profiles in myriad-config-default.yml to match, but the NM is still in the pending state.
You probably don't have enough resources. You can "cheat" and set the resources manually for Mesos with the agent's --resources parameter (e.g. --resources=cpus:12;mem:15000). That may at least get you past the NMs not being able to spin up. Just beware that the Mesos agent will not restart easily after its resources change.
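To make the "not enough resources" point concrete, here is a small Python sketch that compares an offer with what a profile would need. One assumption to flag loudly: Myriad may reserve extra cpu and memory for the NodeManager JVM and the executor on top of the profile. The overhead figures below are placeholders for illustration, not values confirmed in this thread, and the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resources:
    cpus: float = 0.0
    mem: float = 0.0  # MB

    def __add__(self, other):
        return Resources(self.cpus + other.cpus, self.mem + other.mem)


def shortfall(offer, profile, overhead=Resources()):
    """Return what `offer` is missing to run `profile` plus `overhead` (zeros if it fits)."""
    need = profile + overhead
    return Resources(
        max(0.0, need.cpus - offer.cpus),
        max(0.0, need.mem - offer.mem),
    )


def fits(offer, profile, overhead=Resources()):
    """True if the offer covers the profile plus overhead."""
    return shortfall(offer, profile, overhead) == Resources()


# Offers as seen in the DEBUG log, and the medium profile from the config.
hadoop1 = Resources(cpus=1.0, mem=1000.0)
mesos = Resources(cpus=0.0, mem=488.0)
medium = Resources(cpus=1, mem=256)

# Placeholder overhead (assumption): check the nodemanager/executor
# settings in your own myriad-config-default.yml for the real numbers.
assumed_overhead = Resources(cpus=0.2, mem=1024)

print(fits(hadoop1, medium))                         # profile alone
print(shortfall(hadoop1, medium, assumed_overhead))  # profile plus assumed overhead
print(shortfall(mesos, medium))                      # offer with no cpus
```

If any overhead applies, a 1 core / 1000 MB agent cannot satisfy a profile that already asks for a full core, which would be consistent with the declined offers.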
@yufeldman I increased the resources in Mesos, but I am still in the same state: the NM is not starting. This time, though, I got a new error:
ERROR org.apache.myriad.scheduler.event.handlers.ResourceOffersEventHandler: Exception thrown while trying to create a task for nm java.lang.IllegalArgumentException: bound must be positive
Hello,
I started Myriad successfully and it is nicely integrated with Mesos as a framework. But it always shows the Node Managers as pending tasks. When I checked the Mesos log, Mesos is offering resources to Myriad, but the Myriad framework declines them immediately. I reduced the size of the resources for Node Managers in myriad-config-default.yml, but it is still in the same state. I don't have many logs to look into to understand what is causing the issue. I am using Mesos 1.0.0, Hadoop 2.7.2 and Myriad executor 0.2.0. Is this a version compatibility issue between Mesos and Myriad? Any help with this issue is really appreciated.
My yarn-root-resourcemanager-mesos.out is as below
Oct 17, 2016 4:11:57 PM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory register INFO: Registering org.apache.myriad.api.ClustersResource as a root resource class
Oct 17, 2016 4:11:57 PM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory register INFO: Registering org.apache.myriad.api.ConfigurationResource as a root resource class
Oct 17, 2016 4:11:57 PM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory register INFO: Registering org.apache.myriad.api.SchedulerStateResource as a root resource class
Oct 17, 2016 4:11:57 PM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory register INFO: Registering org.apache.myriad.api.ControllerResource as a root resource class
Oct 17, 2016 4:11:57 PM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory register INFO: Registering org.apache.myriad.api.ArtifactsResource as a root resource class
Oct 17, 2016 4:11:57 PM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory register INFO: Registering org.codehaus.jackson.jaxrs.JacksonJaxbJsonProvider as a provider class
Oct 17, 2016 4:11:57 PM com.sun.jersey.server.impl.application.WebApplicationImpl _initiate INFO: Initiating Jersey application, version 'Jersey: 1.9 09/02/2011 11:17 AM'
Oct 17, 2016 4:11:58 PM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory getComponentProvider INFO: Binding org.codehaus.jackson.jaxrs.JacksonJaxbJsonProvider to GuiceManagedComponentProvider with the scope "Singleton"
Oct 17, 2016 4:11:59 PM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory getComponentProvider INFO: Binding org.apache.myriad.api.ClustersResource to GuiceManagedComponentProvider with the scope "PerRequest"
Oct 17, 2016 4:11:59 PM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory getComponentProvider INFO: Binding org.apache.myriad.api.ConfigurationResource to GuiceManagedComponentProvider with the scope "PerRequest"
Oct 17, 2016 4:11:59 PM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory getComponentProvider INFO: Binding org.apache.myriad.api.SchedulerStateResource to GuiceManagedComponentProvider with the scope "PerRequest"
Oct 17, 2016 4:11:59 PM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory getComponentProvider INFO: Binding org.apache.myriad.api.ControllerResource to GuiceManagedComponentProvider with the scope "PerRequest"
Oct 17, 2016 4:11:59 PM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory getComponentProvider INFO: Binding org.apache.myriad.api.ArtifactsResource to GuiceManagedComponentProvider with the scope "PerRequest"
I1017 16:11:59.989440 9720 sched.cpp:226] Version: 1.0.0
I1017 16:11:59.995721 9753 sched.cpp:330] New master detected at master@10.0.2.19:5050
I1017 16:11:59.996100 9753 sched.cpp:341] No credentials provided. Attempting to register without authentication
I1017 16:11:59.998183 9748 sched.cpp:743] Framework registered with 6215a35e-749e-4f27-bb50-f7c01650da80-0006
Oct 17, 2016 4:12:01 PM com.google.inject.servlet.GuiceFilter setPipeline WARNING: Multiple Servlet injectors detected. This is a warning indicating that you have more than one GuiceFilter running in your web application. If this is deliberate, you may safely ignore this message. If this is NOT deliberate however, your application may not work as expected.
Oct 17, 2016 4:12:02 PM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory register INFO: Registering org.apache.hadoop.yarn.server.resourcemanager.webapp.JAXBContextResolver as a provider class
Oct 17, 2016 4:12:02 PM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory register INFO: Registering org.apache.hadoop.yarn.server.resourcemanager.webapp.RMWebServices as a root resource class
Oct 17, 2016 4:12:02 PM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory register INFO: Registering org.apache.hadoop.yarn.webapp.GenericExceptionHandler as a provider class
Oct 17, 2016 4:12:02 PM com.sun.jersey.server.impl.application.WebApplicationImpl _initiate INFO: Initiating Jersey application, version 'Jersey: 1.9 09/02/2011 11:17 AM'
Oct 17, 2016 4:12:02 PM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory getComponentProvider INFO: Binding org.apache.hadoop.yarn.server.resourcemanager.webapp.JAXBContextResolver to GuiceManagedComponentProvider with the scope "Singleton"
Oct 17, 2016 4:12:02 PM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory getComponentProvider INFO: Binding org.apache.hadoop.yarn.webapp.GenericExceptionHandler to GuiceManagedComponentProvider with the scope "Singleton"
Oct 17, 2016 4:12:03 PM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory getComponentProvider INFO: Binding org.apache.hadoop.yarn.server.resourcemanager.webapp.RMWebServices to GuiceManagedComponentProvider with the scope "Singleton"