vimeo / graph-explorer

A graphite dashboard powered by structured metrics
http://vimeo.github.io/graph-explorer/
Apache License 2.0

Elasticsearch shard needs time to activate after running update_metrics.py #43

Closed dmac closed 11 years ago

dmac commented 11 years ago

Starting with an empty elasticsearch, if you run update_metrics.py it can fail with this error:

$ ./update_metrics.py
2013-08-16 02:30:46,132 - update_metrics - INFO - fetching/saving metrics from graphite...
2013-08-16 02:30:46,136 - update_metrics - INFO - generating structured metrics data...
2013-08-16 02:30:46,136 - update_metrics - DEBUG - loading metrics
2013-08-16 02:30:46,137 - update_metrics - DEBUG - removing outdated targets
2013-08-16 02:30:47,305 - update_metrics - ERROR - sorry, something went wrong: (ElasticException(...), 'ElasticSearch Error: {"error":"SearchPhaseExecutionException[Failed to execute phase [init_scan], total failure; shardFailures {[_na_][graphite_metrics][0]: No active shards}]","status":500}')

However, waiting a minute and then rerunning update_metrics.py will succeed.

$ ./update_metrics.py
2013-08-16 02:38:49,983 - update_metrics - INFO - fetching/saving metrics from graphite...
2013-08-16 02:38:49,990 - update_metrics - INFO - generating structured metrics data...
2013-08-16 02:38:49,990 - update_metrics - DEBUG - loading metrics
2013-08-16 02:38:49,991 - update_metrics - DEBUG - removing outdated targets
2013-08-16 02:38:50,052 - update_metrics - DEBUG - removed 0 metrics from elasticsearch
2013-08-16 02:38:50,053 - update_metrics - DEBUG - updating targets
2013-08-16 02:38:50,103 - update_metrics - DEBUG - indexed 16 metrics
2013-08-16 02:38:50,103 - update_metrics - INFO - success!

The theory is that the elasticsearch shard needs some time to activate, and graph-explorer should handle this gracefully.

fourk commented 11 years ago

Confirming that I also ran into this issue during first-time setup.

Dieterbe commented 11 years ago

hey, according to the elasticsearch folks, we should "just query the index state to make sure it's ready". (there's probably some shard active information in the json response for an index HEAD/GET). haven't gotten around to implementing this yet.
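A minimal sketch of what "query the index state" could look like, assuming Elasticsearch's cluster health endpoint (`GET /_cluster/health/<index>`) and treating a `status` of `yellow` or `green` as "primary shards active". The function names `fetch_health` and `wait_for_index`, the retry counts, and the base URL are hypothetical and are not taken from the graph-explorer code; the index name `graphite_metrics` comes from the error message above. This is not necessarily how the eventual fix was implemented.

```python
import json
import time
import urllib.request


def fetch_health(base_url, index, timeout=5):
    """GET /_cluster/health/<index> and return the decoded JSON body."""
    url = "%s/_cluster/health/%s" % (base_url.rstrip("/"), index)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def wait_for_index(base_url, index, retries=30, delay=1.0,
                   fetch=fetch_health, sleep=time.sleep):
    """Poll index health until primary shards are active.

    Returns True once the reported status is yellow or green,
    False if that never happens within `retries` attempts.
    `fetch` and `sleep` are injectable so the loop can be tested
    without a running Elasticsearch.
    """
    for _ in range(retries):
        try:
            health = fetch(base_url, index)
        except (OSError, ValueError):
            # Connection refused, HTTP error status, or a non-JSON body:
            # treat all of these as "not ready yet" and retry.
            health = None
        if health and health.get("status") in ("yellow", "green"):
            return True
        sleep(delay)
    return False


if __name__ == "__main__":
    # Hypothetical local setup; adjust host, port and index as needed.
    if wait_for_index("http://localhost:9200", "graphite_metrics"):
        print("index ready")
    else:
        print("index still not ready, giving up")
```

Calling something like `wait_for_index(...)` before the first search in update_metrics.py (or before the first query in the web app) would turn the "No active shards" failure into a short wait. Polling from the client keeps the timeout and back-off policy under the caller's control.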

Dieterbe commented 11 years ago

i also see this with a fresh ES install when just running GE and querying (skipping update_metrics):

  File "/home/dieter/workspaces/eclipse/graph-explorer/bottle.py", line 764, in _handle
    return route.call(**args)
  File "/home/dieter/workspaces/eclipse/graph-explorer/bottle.py", line 1575, in wrapper
    rv = callback(*a, **ka)
  File "/home/dieter/workspaces/eclipse/graph-explorer/app.py", line 383, in graphs
    return handle_graphs(query, False)
  File "/home/dieter/workspaces/eclipse/graph-explorer/app.py", line 405, in handle_graphs
    return render_graphs(query, deps=deps)
  File "/home/dieter/workspaces/eclipse/graph-explorer/app.py", line 448, in render_graphs
    targets_matching = s_metrics.matching(patterns)
  File "/home/dieter/workspaces/eclipse/graph-explorer/structured_metrics/__init__.py", line 305, in matching
    metrics = self.get_metrics(es_query)
  File "/home/dieter/workspaces/eclipse/graph-explorer/structured_metrics/__init__.py", line 274, in get_metrics
    "query": query,
  File "/home/dieter/workspaces/eclipse/graph-explorer/structured_metrics/rawes/rawes/elastic.py", line 58, in get
    return self.request('get', path, **kwargs)
  File "/home/dieter/workspaces/eclipse/graph-explorer/structured_metrics/rawes/rawes/elastic.py", line 83, in request
    return self.connection.request(method, new_path, **kwargs)
  File "/home/dieter/workspaces/eclipse/graph-explorer/structured_metrics/rawes/rawes/http_connection.py", line 41, in request
    return self._decode(response)
  File "/home/dieter/workspaces/eclipse/graph-explorer/structured_metrics/rawes/rawes/http_connection.py", line 54, in _decode
    result=decoded, status_code=response.status_code)
ElasticException: (ElasticException(...), 'ElasticSearch Error: {"error":"SearchPhaseExecutionException[Failed to execute phase [query_fetch], all shards failed]","status":503}')

Dieterbe commented 11 years ago

ok, just pushed a fix that should annihilate this. tested for both update_metrics and a fresh (unpopulated) graph-explorer run.

dmac commented 11 years ago

Awesome, thanks!

Dieterbe commented 11 years ago

let me know if you run into any issues

asifalisoomro commented 8 years ago

I am facing the following issue: when I open the Kibana console, the page shows only the options and the rest of it is blank.

[2015-12-03 12:17:05,108][DEBUG][action.search.type ] [Shaper of Worlds] All shards failed for phase: [query_fetch]
RemoteTransportException[[Shaper of Worlds][192.168.48.63:9300][indices:data/read/search[phase/query+fetch]]]; nested: QueryPhaseExecutionException[Result window is too large, from + size must be less than or equal to: [10000] but was [2147483647]. See the scroll api for a more efficient way to request large data sets. This limit can be set by changing the [index.max_result_window] index level parameter.];
Caused by: QueryPhaseExecutionException[Result window is too large, from + size must be less than or equal to: [10000] but was [2147483647]. See the scroll api for a more efficient way to request large data sets. This limit can be set by changing the [index.max_result_window] index level parameter.]
    at org.elasticsearch.search.internal.DefaultSearchContext.preProcess(DefaultSearchContext.java:198)
    at org.elasticsearch.search.query.QueryPhase.preProcess(QueryPhase.java:96)
    at org.elasticsearch.search.SearchService.createContext(SearchService.java:669)
    at org.elasticsearch.search.SearchService.createAndPutContext(SearchService.java:617)
    at org.elasticsearch.search.SearchService.executeFetchPhase(SearchService.java:460)
    at org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryFetchTransportHandler.messageReceived(SearchServiceTransportAction.java:392)
    at org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryFetchTransportHandler.messageReceived(SearchServiceTransportAction.java:389)
    at org.elasticsearch.transport.TransportService$4.doRun(TransportService.java:350)
    at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
[2015-12-03 12:17:05,109][INFO ][rest.suppressed ] /.kibana/index-pattern/_search Params: {index=.kibana, fields=, type=index-pattern}
Failed to execute phase [query_fetch], all shards failed; shardFailures {[2fC9Jb13TZeoV8y0gK1VPg][.kibana][0]: RemoteTransportException[[Shaper of Worlds][192.168.48.63:9300][indices:data/read/search[phase/query+fetch]]]; nested: QueryPhaseExecutionException[Result window is too large, from + size must be less than or equal to: [10000] but was [2147483647]. See the scroll api for a more efficient way to request large data sets. This limit can be set by changing the [index.max_result_window] index level parameter.]; }
    at org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.onFirstPhaseResult(TransportSearchTypeAction.java:228)
    at org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction$1.onFailure(TransportSearchTypeAction.java:174)
    at org.elasticsearch.action.ActionListenerResponseHandler.handleException(ActionListenerResponseHandler.java:46)
    at org.elasticsearch.transport.TransportService$DirectResponseChannel.processException(TransportService.java:821)
    at org.elasticsearch.transport.TransportService$DirectResponseChannel.sendResponse(TransportService.java:799)
    at org.elasticsearch.transport.TransportService$4.onFailure(TransportService.java:361)
    at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:42)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: QueryPhaseExecutionException[Result window is too large, from + size must be less than or equal to: [10000] but was [2147483647]. See the scroll api for a more efficient way to request large data sets. This limit can be set by changing the [index.max_result_window] index level parameter.]
    at org.elasticsearch.search.internal.DefaultSearchContext.preProcess(DefaultSearchContext.java:198)
    at org.elasticsearch.search.query.QueryPhase.preProcess(QueryPhase.java:96)
    at org.elasticsearch.search.SearchService.createContext(SearchService.java:669)
    at org.elasticsearch.search.SearchService.createAndPutContext(SearchService.java:617)
    at org.elasticsearch.search.SearchService.executeFetchPhase(SearchService.java:460)
    at org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryFetchTransportHandler.messageReceived(SearchServiceTransportAction.java:392)
    at org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryFetchTransportHandler.messageReceived(SearchServiceTransportAction.java:389)
    at org.elasticsearch.transport.TransportService$4.doRun(TransportService.java:350)
    at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)

Dieterbe commented 8 years ago

this is a different issue. you say you're using kibana? in that case it's definitely not related to graph-explorer.