elastic / elasticsearch

Free and Open Source, Distributed, RESTful Search Engine
https://www.elastic.co/products/elasticsearch

Handle snapshot restore failure on missing synonym and analysis modules better #22777

Closed ppf2 closed 2 years ago

ppf2 commented 7 years ago

Elasticsearch version: 5.1.1

Currently, if you restore a snapshot to a cluster that does not have the same synonym files or analysis modules as the cluster that holds the source index, index creation fails during the restore. For a missing synonym file, the exception is:

[2017-01-24T13:55:48,767][WARN ][o.e.i.c.IndicesClusterStateService] [sf17VvB] [[vsp.2017-01-17][2]] marking and sending shard failed due to [failed to create index]
java.lang.IllegalArgumentException: IOException while reading synonyms_path_path: /Users/pius/Elastic/ElasticStack_5_1/5.1.1/elasticsearch-5.1.1/config/address_synonyms.txt
    at org.elasticsearch.index.analysis.Analysis.getReaderFromFile(Analysis.java:296) ~[elasticsearch-5.1.1.jar:5.1.1]
    at org.elasticsearch.index.analysis.SynonymTokenFilterFactory.<init>(SynonymTokenFilterFactory.java:59) ~[elasticsearch-5.1.1.jar:5.1.1]
    at org.elasticsearch.index.analysis.AnalysisRegistry.lambda$buildTokenFilterFactories$1(AnalysisRegistry.java:165) ~[elasticsearch-5.1.1.jar:5.1.1]
    at org.elasticsearch.index.analysis.AnalysisRegistry$1.get(AnalysisRegistry.java:252) ~[elasticsearch-5.1.1.jar:5.1.1]
    at org.elasticsearch.index.analysis.AnalysisRegistry.buildMapping(AnalysisRegistry.java:290) ~[elasticsearch-5.1.1.jar:5.1.1]
    at org.elasticsearch.index.analysis.AnalysisRegistry.buildTokenFilterFactories(AnalysisRegistry.java:166) ~[elasticsearch-5.1.1.jar:5.1.1]
    at org.elasticsearch.index.analysis.AnalysisRegistry.build(AnalysisRegistry.java:152) ~[elasticsearch-5.1.1.jar:5.1.1]
    at org.elasticsearch.index.IndexService.<init>(IndexService.java:145) ~[elasticsearch-5.1.1.jar:5.1.1]
    at org.elasticsearch.index.IndexModule.newIndexService(IndexModule.java:363) ~[elasticsearch-5.1.1.jar:5.1.1]
    at org.elasticsearch.indices.IndicesService.createIndexService(IndicesService.java:429) ~[elasticsearch-5.1.1.jar:5.1.1]
    at org.elasticsearch.indices.IndicesService.createIndex(IndicesService.java:394) ~[elasticsearch-5.1.1.jar:5.1.1]
    at org.elasticsearch.indices.IndicesService.createIndex(IndicesService.java:148) ~[elasticsearch-5.1.1.jar:5.1.1]
    at org.elasticsearch.indices.cluster.IndicesClusterStateService.createIndices(IndicesClusterStateService.java:434) [elasticsearch-5.1.1.jar:5.1.1]
    at org.elasticsearch.indices.cluster.IndicesClusterStateService.clusterChanged(IndicesClusterStateService.java:196) [elasticsearch-5.1.1.jar:5.1.1]
    at org.elasticsearch.cluster.service.ClusterService.runTasksForExecutor(ClusterService.java:736) [elasticsearch-5.1.1.jar:5.1.1]
    at org.elasticsearch.cluster.service.ClusterService$UpdateTask.run(ClusterService.java:920) [elasticsearch-5.1.1.jar:5.1.1]
    at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:458) [elasticsearch-5.1.1.jar:5.1.1]
    at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:238) [elasticsearch-5.1.1.jar:5.1.1]
    at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:201) [elasticsearch-5.1.1.jar:5.1.1]
    at 

And for missing modules:

[2017-01-24T14:02:07,813][WARN ][o.e.c.a.s.ShardStateAction] [sf17VvB] [vsp.2017-01-17][4] received shard failed for shard id [[vsp.2017-01-17][4]], allocation id [8cx7YIs4S6-mFzjw0SS08Q], primary term [0], message [failed to create index], failure [IllegalArgumentException[Unknown tokenfilter type [phonetic] for [dbl_metaphone]]]
java.lang.IllegalArgumentException: Unknown tokenfilter type [phonetic] for [dbl_metaphone]
    at org.elasticsearch.index.analysis.AnalysisRegistry.getAnalysisProvider(AnalysisRegistry.java:336) ~[elasticsearch-5.1.1.jar:5.1.1]
    at org.elasticsearch.index.analysis.AnalysisRegistry.buildMapping(AnalysisRegistry.java:289) ~[elasticsearch-5.1.1.jar:5.1.1]
    at org.elasticsearch.index.analysis.AnalysisRegistry.buildTokenFilterFactories(AnalysisRegistry.java:166) ~[elasticsearch-5.1.1.jar:5.1.1]
    at org.elasticsearch.index.analysis.AnalysisRegistry.build(AnalysisRegistry.java:152) ~[elasticsearch-5.1.1.jar:5.1.1]
    at org.elasticsearch.index.IndexService.<init>(IndexService.java:145) ~[elasticsearch-5.1.1.jar:5.1.1]
    at org.elasticsearch.index.IndexModule.newIndexService(IndexModule.java:363) ~[elasticsearch-5.1.1.jar:5.1.1]

Current behaviors

A problem this fundamental makes the snapshot restore fail right away and leaves red shards in the cluster. It would be nice to handle it more gracefully: fail fast, return an error response to the end user, and not create the red shards (since the restore never even reaches the recovery stage, there is nothing in the recovery APIs). It would also be helpful to document that the target cluster must have the same custom analysis setup for the snapshot restore to work.
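
As a rough illustration of the kind of check this asks for, here is a minimal Python sketch of a client-side pre-check. It walks the `analysis` section of an index's settings and reports token filter types that the target does not provide and synonym files that are absent. The function name `find_restore_blockers`, the `available_filter_types` set, and the `config_dir` argument are all hypothetical. This is not how Elasticsearch validates a restore; it assumes you already have the source index's settings and know which filter types the target nodes offer.

```python
import os
import tempfile


def find_restore_blockers(analysis_settings, available_filter_types, config_dir):
    """Return human-readable problems that would likely make index creation fail.

    analysis_settings: the "analysis" section of the index settings, as a nested dict.
    available_filter_types: token filter types the target nodes provide (assumed known).
    config_dir: the target node's config directory (hypothetical input).
    """
    problems = []
    filters = analysis_settings.get("filter", {})
    for name, cfg in sorted(filters.items()):
        filter_type = cfg.get("type")
        if filter_type not in available_filter_types:
            problems.append(f"unknown tokenfilter type [{filter_type}] for [{name}]")
        synonyms_path = cfg.get("synonyms_path")
        if synonyms_path is not None:
            # Assumption: relative paths are resolved against the config directory;
            # os.path.join leaves absolute paths unchanged.
            full_path = os.path.join(config_dir, synonyms_path)
            if not os.path.isfile(full_path):
                problems.append(f"missing synonyms file for [{name}]: {full_path}")
    return problems


if __name__ == "__main__":
    # Example settings modeled on the two failures in this issue.
    example = {
        "filter": {
            "address_synonyms": {"type": "synonym", "synonyms_path": "address_synonyms.txt"},
            "dbl_metaphone": {"type": "phonetic", "encoder": "double_metaphone"},
        }
    }
    with tempfile.TemporaryDirectory() as empty_config_dir:
        for problem in find_restore_blockers(example, {"synonym"}, empty_config_dir):
            print(problem)
```

A check like this would have to run on (or have file access to) every node that may host a shard, which is part of why doing it inside the restore process itself would be the nicer experience.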

Note that the cluster allocation explain API helps here. But it would be a nicer experience if the restore process validated the setup upfront (before the recovery stage) and reported what is missing in the target cluster in order to restore the snapshot. We can start with documentation: currently, the monitoring snapshot/restore status section only mentions the _cat/recovery API as a way to track recovery status. This is an opportunity to advertise the new cluster allocation explain API as a way to determine the cause of snapshot/restore failures that occur before the restore even reaches the recovery stage.
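
For reference, the two APIs mentioned above can be queried as in the following Python sketch, which only builds the HTTP requests and does not send them. The helper names and the base URL `http://localhost:9200` are assumptions for illustration; the index name and shard number are taken from the log output earlier in this issue.

```python
import json
import urllib.request


def build_allocation_explain_request(base_url, index, shard, primary=True):
    """Build (but do not send) a cluster allocation explain request for one shard."""
    body = json.dumps({"index": index, "shard": shard, "primary": primary}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/_cluster/allocation/explain",
        data=body,
        headers={"Content-Type": "application/json"},
        method="GET",  # the API takes a JSON body on a GET request
    )


def build_cat_recovery_request(base_url, index):
    """Build (but do not send) a _cat/recovery request with column headers enabled."""
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/_cat/recovery/{index}?v",
        method="GET",
    )


if __name__ == "__main__":
    # Assumed local node address; adjust for your cluster.
    request = build_allocation_explain_request("http://localhost:9200", "vsp.2017-01-17", 2)
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
```

To run a request against a live cluster, pass it to `urllib.request.urlopen`. For a shard that failed before recovery started, `_cat/recovery` shows nothing useful, whereas the allocation explain response should describe why the shard is unassigned.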

original-brownbear commented 2 years ago

Closing this one; it doesn't look like this is something we are going to do anytime soon. Also, this might be mitigated by changes to the way these modules work, e.g. https://github.com/elastic/elasticsearch/issues/38523