Closed petterik closed 11 years ago
Shuttl Test 1 - Nice Test:
1 Master / Hadoop namenode 1 Search head / Hadoop datanode 2 Search peers / Hadoop datanodes
Once Shuttl had been enabled on all the nodes, the search head displayed the following message: "Search results may be incomplete, peer ip-10-4-150-48-ec2-user's search ended prematurely. This may be caused by a variety of reason, please consult logs on peer for details!"
What logs should be consulted?
The web_service.log contained only two occurrences of ERROR, one of which appears only inside an INFO line. Both were found on all nodes:
-"2013-02-26 21:25:35,925 ERROR [512d284fe72418550] base:427 - unable to retrieve the EAI _new descriptor for entity: cluster/config"
-"2013-02-26 21:25:35,926 INFO [512d284fe72418550] base:512 - [HTTP 404] https://127.0.0.1:8089/services/cluster/config/_new; [{'code': None, 'text': "In handler 'clusterconfig': Invalid action for this internal handler (handler: clusterconfig, supported: disable|edit|members|_reload|remove|doc, wanted: new).", 'type': 'ERROR'}]"
:The ERROR is probably due to a connection issue with the master that resolved itself later. The INFO line is probably related.
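Because the string ERROR can also appear inside the message of an INFO line (as in the second entry above), a plain text search for ERROR over-counts. Below is a minimal Python sketch that counts entries by their level field instead. It assumes the log lines follow the `<timestamp> <LEVEL> [<id>] <source>:<line> - <message>` shape of the samples quoted above; the function name `count_levels` and the regular expression are illustrative and not part of Splunk.

```python
import re
from collections import Counter

# Assumed line shape, taken from the web_service.log samples in this issue:
# "<date> <time>,<ms> <LEVEL> [<id>] <source>:<line> - <message>"
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"(?P<level>[A-Z]+) "
    r"\[(?P<req>[^\]]*)\] "
    r"(?P<source>\S+) - (?P<msg>.*)$"
)


def count_levels(lines):
    """Count log entries per level, ignoring level names inside messages."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:  # lines that do not fit the assumed shape are skipped
            counts[match.group("level")] += 1
    return counts
```

Feeding it the lines of web_service.log (for example via `open(path)`) returns a `Counter` keyed by level, so only entries logged at ERROR level are counted as errors.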
:The master node contains a set of the following messages as well: "2013-02-26 21:36:49,311 DEBUG [512d2af1442f74a50] clustering:259 - Examining bucket _internal~3~8C58F69E-1047-4AFF-9E1D-9EB62B9B05D8: searchable_size=0, index_size=325368 - ERROR *****"
Indexed a 1 GB CSV file. First set of errors:
"2013-02-26 23:23:36,270 ERROR com.splunk.shuttl.archiver.archive.ArchiveRestHandler: did="Sent archive bucket request" happened="got IOException" expected="request to succeed" exception="org.apache.http.NoHttpResponseException: The target server failed to respond" bucket_name="db_1350861209_1312434924_12" cause="null""
"2013-02-26 23:23:40,012 ERROR com.splunk.shuttl.archiver.archive.ArchiveRestHandler: did="Sent archive bucket request" happened="got IOException" expected="request to succeed" exception="org.apache.http.conn.HttpHostConnectException: Connection to http://ip-10-4-150-48.ec2.internal:9090 refused" bucket_name="db_1350894458_1340954376_3" cause="java.net.ConnectException: Connection refused""
:Connection was refused since my script to update the name node for each node had not been updated properly. Once the name node was specified properly in the hdfs.properties on each node it worked like a charm.
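A "Connection refused" like the one above can be checked for independently of Shuttl by testing whether the host and port named in the error message accept TCP connections. The following is a minimal Python sketch under that assumption; `can_connect` and `check_nodes` are hypothetical helper names, and the check only shows that something is listening, not that the configuration is otherwise correct.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def check_nodes(nodes, timeout=3.0):
    """Map each (host, port) pair to whether it accepted a connection."""
    return {(host, port): can_connect(host, port, timeout) for host, port in nodes}
```

For example, `check_nodes([("ip-10-4-150-48.ec2.internal", 9090)])` uses the host and port from the second error above; running it from each node would show which ones cannot reach that endpoint.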
-Thawing works fine from the search head. Thawed buckets end up only on the node that originally indexed the data.
-Flushing also works fine from the search head.
Shuttl Test 2 - Massive Test:
1 Master / Hadoop Namenode 1 Search head / Hadoop datanode 10 Search peers / Hadoop datanodes
Error 500 Shutdown in progress
java.lang.IllegalStateException: Shutdown in progress
    at java.lang.ApplicationShutdownHooks.add(ApplicationShutdownHooks.java:57)
    at java.lang.Runtime.addShutdownHook(Runtime.java:209)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1439)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:254)
    at com.splunk.shuttl.archiver.filesystem.hadoop.HadoopArchiveFileSystemFactory.doCreate(HadoopArchiveFileSystemFactory.java:53)
    at com.splunk.shuttl.archiver.filesystem.hadoop.HadoopArchiveFileSystemFactory.createWithPropertyFile(HadoopArchiveFileSystemFactory.java:44)
    at com.splunk.shuttl.archiver.filesystem.hadoop.HadoopArchiveFileSystemFactory.create(HadoopArchiveFileSystemFactory.java:37)
    at com.splunk.shuttl.archiver.filesystem.ArchiveFileSystemFactory.supportedArchiveFileSystem(ArchiveFileSystemFactory.java:105)
    at com.splunk.shuttl.archiver.filesystem.ArchiveFileSystemFactory.getByNameAndLocalFileSystemPaths(ArchiveFileSystemFactory.java:96)
    at com.splunk.shuttl.archiver.filesystem.ArchiveFileSystemFactory.getWithConfiguration(ArchiveFileSystemFactory.java:78)
    at com.splunk.shuttl.archiver.thaw.BucketThawerFactory.createWithConfigAndSplunkSettingsAndLocalFileSystemPaths(BucketThawerFactory.java:52)
    at com.splunk.shuttl.archiver.thaw.BucketThawerFactory.createDefaultThawer(BucketThawerFactory.java:41)
    at com.splunk.shuttl.server.mbeans.rest.ThawBucketsEndpoint.thawBuckets(ThawBucketsEndpoint.java:92)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:616)
    at com.sun.jersey.spi.container.JavaMethodInvokerFactory$1.invoke(JavaMethodInvokerFactory.java:60)
    at com.sun.jersey.server.impl.model.method.dispatch.AbstractResourceMethodDispatchProvider$TypeOutInvoker._dispatch(AbstractResourceMethodDispatchProvider.java:185)
    at com.sun.jersey.server.impl.model.method.dispatch.ResourceJavaMethodDispatcher.dispatch(ResourceJavaMethodDispatcher.java:75)
    at com.sun.jersey.server.impl.uri.rules.HttpMethodRule.accept(HttpMethodRule.java:288)
    at com.sun.jersey.server.impl.uri.rules.ResourceClassRule.accept(ResourceClassRule.java:108)
    at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
    at com.sun.jersey.server.impl.uri.rules.RootResourceClassesRule.accept(RootResourceClassesRule.java:84)
    at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1469)
    at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1400)
    at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1349)
    at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1339)
    at com.sun.jersey.spi.container.servlet.WebComponent.service(WebComponent.java:416)
    at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:537)
    at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:708)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:802)
    at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:594)
    at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:485)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:119)
    at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:521)
    at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:233)
    at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1065)
    at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:412)
    at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:192)
    at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:999)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:117)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:111)
    at org.eclipse.jetty.server.Server.handle(Server.java:351)
    at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:451)
    at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:916)
    at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:634)
    at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:230)
    at org.eclipse.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:76)
    at org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:609)
    at org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:45)
    at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:599)
    at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:534)
    at java.lang.Thread.run(Thread.java:679)
{"buckets":[],"failed":[]}
curl: (52) Empty reply from server
curl: c(u5r2l): E(m5p2t)y Ermepptlyy rferpolmy sferrovme rs erver
curl: (52) Empty reply from server
curl: (52) Empty reply from server
curl: (52) Empty reply from server
curl: (52) Empty reply from server
curl: (52) Empty reply from server
curl: (52) Empty reply from server
curl: (52) Empty reply from server
curl: (52) Empty reply from server
curl: (52) Empty reply from server
curl: (52) Empty reply from server
curl: (52) Empty reply from server
:The following WARN messages were logged when running the script over and over in quick succession.
2013-02-28 18:37:39,637 WARN com.splunk.shuttl.server.distributed.RequestOnSearchPeers: warning="Executed request on distributed peer" happened="java.lang.RuntimeException: Too many open files" result="will add to exceptions, which can be retrieved with getExceptions()"
2013-02-28 18:37:39,701 WARN com.splunk.shuttl.server.distributed.RequestOnSearchPeers: warning="Executed request on distributed peer" happened="java.lang.RuntimeException: java.net.SocketException: Too many open files" result="will add to exceptions, which can be retrieved with getExceptions()"
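"Too many open files" means a process hit its limit on open file descriptors. As a starting point for investigating, the minimal Python sketch below reads the descriptor limit of the current process and counts the descriptors a process has open. It assumes a Linux host with `/proc` mounted, and that the limit seen by the script is comparable to the one the Shuttl JVM inherited, which is only true if both are started from a similar environment. The function names are illustrative.

```python
import os
import resource


def nofile_limits():
    """Return the (soft, hard) open-file limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def open_fd_count(pid="self"):
    """Count open file descriptors of a process via /proc (Linux only).

    Reading another process's fd directory needs matching permissions.
    """
    return len(os.listdir(f"/proc/{pid}/fd"))
```

Calling `open_fd_count(<pid of the Shuttl JVM>)` repeatedly while the script is running would show whether the count climbs toward the soft limit returned by `nofile_limits()`.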
"failed":["java.lang.RuntimeException: java.io.IOException: Unable to delete directory /mnt/KlevisTestArea/splunk/var/lib/splunk/cluster_shuttl_test/thaweddb/db_1350851422_1312494429_26."]}
Setup:
Run happy path tests. You should be able to control all the Shuttl instances from the UI of the search head: thawing, listing, and flushing. Run sad paths too: crash Splunk instances and Shuttl instances. Be evil.
Comment on anything that you find weird.