Elasticsearch Merge Issues

noamt / elasticsearch-grails-plugin

The revived ElasticSearch grails plugin

Based on Graeme Rocher initial stub. Note that it is still in early stage.

Other

63 stars 83 forks source link

Elasticsearch Merge Issues #64

Closed motleycrew closed 9 years ago

motleycrew commented 9 years ago

Firstly, thank you for developing this plugin. It is very useful.

After upgrading to Elasticsearch plugin 0.0.3.6. We are getting merge exception, every-time we do run-app. This happens even when have disabled elasticsearch.index is disabled in bootstrap.groovy.

This is the case with plugin version 0.0.3.3

We are using: Elastic Search Server 1.4.0 Grails 2.4.3 Tomcat Plugin build ":tomcat:8.0.14"

Settings in config.groovy elasticSearch.datastoreImpl = 'mongoDatastore' elasticSearch.client.mode = 'transport' elasticSearch.bulkIndexOnStartup = false

Any pointers will be very helpful.

sircelsius commented 9 years ago

Hi, Can you provide the exact Elasticsearch error log, your domain classes and your indices' mapping?

sircelsius commented 9 years ago

Have you tried using the Get Mapping functionality to compare your domains' static searchable definitions to the mapping of your indices?

Sometimes it is necessary to clean your cluster and recreate the mapping (and therefore reindex your data) on startup.

motleycrew commented 9 years ago

This is the error we are getting

Error | 2014-11-18 13:08:15,370 [localhost-startStop-1] ERROR context.GrailsContextLoaderListener - Error initializing the application: Error creating bean with name 'searchableClassMappingConfigurator': Invocation of init method failed; nested exception is org.elasticsearch.index.mapper.MergeMappingException: Merge failed with failures {[mapper [firstName] has different store_term_vector values, mapper [firstName] has different store_term_vector_offsets values, mapper [firstName] has different store_term_vector_positions values, mapper [lastName] has different store_term_vector values, mapper [lastName] has different store_term_vector_offsets values, mapper [lastName] has different store_term_vector_positions values, mapper [rating] of different type, current_type [long], merged_type [integer], object mapping [hospital] can't be changed from non-nested to nested, mapper [yearsExperience] of different type, current_type [long], merged_type [integer], mapper [speciality] has different store_term_vector values, mapper [speciality] has different store_term_vector_offsets values, mapper [speciality] has different store_term_vector_positions values]} Message: Error creating bean with name 'searchableClassMappingConfigurator': Invocation of init method failed; nested exception is org.elasticsearch.index.mapper.MergeMappingException: Merge failed with failures {[mapper [firstName] has different store_term_vector values, mapper [firstName] has different store_term_vector_offsets values, mapper [firstName] has different store_term_vector_positions values, mapper [lastName] has different store_term_vector values, mapper [lastName] has different store_term_vector_offsets values, mapper [lastName] has different store_term_vector_positions values, mapper [rating] of different type, current_type [long], merged_type [integer], object mapping [hospital] can't be changed from non-nested to nested, mapper [yearsExperience] of different type, current_type [long], merged_type [integer], mapper [speciality] has different store_term_vector values, mapper [speciality] has different store_term_vector_offsets values, mapper [speciality] has different store_term_vector_positions values]} Line | Method ->> 262 | run in java.util.concurrent.FutureTask

Caused by MergeMappingException: Merge failed with failures {[mapper [firstName] has different store_term_vector values, mapper [firstName] has different store_term_vector_offsets values, mapper [firstName] has different store_term_vector_positions values, mapper [lastName] has different store_term_vector values, mapper [lastName] has different store_term_vector_offsets values, mapper [lastName] has different store_term_vector_positions values, mapper [rating] of different type, current_type [long], merged_type [integer], object mapping [hospital] can't be changed from non-nested to nested, mapper [yearsExperience] of different type, current_type [long], merged_type [integer], mapper [speciality] has different store_term_vector values, mapper [speciality] has different store_term_vector_offsets values, mapper [speciality] has different store_term_vector_positions values]} ->> 509 | execute in org.elasticsearch.cluster.metadata.MetaDataMappingService$4

| 328 | run in org.elasticsearch.cluster.service.InternalClusterService$UpdateTask | 153 | run . . . in org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable | 1142 | runWorker in java.util.concurrent.ThreadPoolExecutor | 617 | run . . . in java.util.concurrent.ThreadPoolExecutor$Worker ^ 745 | run in java.lang.Thread

Process finished with exit code 1

I deleted all mappings and indices file using ElasticHQ and deleted all indicis files. But everytime I start grails and it keeps giving this error.

sircelsius commented 9 years ago

Looking at the logs, it seems that you do have an existing index whose mapping is different from what the plugin is trying to create when starting the app. Are you certain that you did delete all the mappings? I'm not familiar enough with ElasticHQ to know how it works exactly, but maybe you could try deleting each index manually with curl:

curl -XDELETE localhost:9200/your.index.name

Then check that they were actually deleted using the following command:

curl -XGET localhost:9200/_cat/indices?v

Try this (I assume that losing your indexed data isn't too much of a problem, if it is we need to find another solution), and tell me if the problem persists after you have made sure that the indices are correctly deleted.

If this doesn't work, I think I will need to know a bit more about your domains to try to recreate the issue.

All that being said, we (meaning at my workplace) have made a routine of manually scraping our indices clean and reindexing all our data every time we deploy our app in production. It goes against our will to have a quick and easy deployment procedure but better safe than sorry.

P.-S. Now that I think of it, there are a couple of mistakes that I made in the past that made me lose quite some time (they may seem like silly questions, but it's always worth checking):

Are you certain you are using the transport mode? I once spent a few hours trying to understand why I had a similar bug until I realized that in the dev environment my client.mode was set to local, which meant I had two different ES servers running and therefore two different ports to which my curl commands could be sent.
Assuming you are indeed in transport mode, are you certain that you have set the elasticSearch.client.hosts configuration option correctly? It defaults to localhost:9300, but for some reason your ES server might be running on another port.

peh commented 9 years ago

hey @sircelsiussexy,

i am on the elasticsearch:0.0.3.7 and grails-2.4.3 and experience the described exception on every deploy

i don't even have to change anything domain related, after a "grails war" i cannot deploy as i always get "... has different store_term_vector_offsets values..." exceptions.

deleting the index fixes the issue but this feels so wrong as this also makes it impossible to deploy without a downtime (the running appserver either crashes, or recreated the index while the other one is starting up)

noamt commented 9 years ago

Have you tried comparing the mapping before and after the delete & re-write?

peh commented 9 years ago

i found a workaround: doing a grails clean-all, and deleting the target dir before running grails war creates a deployable war file. still a weird bug in my eyes.

marcoscarceles commented 9 years ago

+1 I'm finding this error as well, although in my case it is with genuine mapping updates. Is there any plan to deal with mappings updates? It feels like a missing functionality for the plugin to be Production-ready. I'm happy to help with the implementation if there is any preferred approach.

Probably something like this (choosing drop/alias option by config?): http://www.elasticsearch.org/blog/changing-mapping-with-zero-downtime/

sircelsius commented 9 years ago

I added a configuration option to enable deleting the whole index when a MergeMappingException is caught during the SearchableClassMappingConfigurator.installMappings method.

This solves the issue in our case, but it still needs a bit of testing. Use it at your own risk for the moment.

The exact way it works is the following:

In Config.groovy, I created a new option named deleteOnMergeMappingException (default false).
If that option is set to true, should a MergeMappingException be thrown by the elasticSearchClient.admin().indices().putMapping() for any of the searchable domains' mapping, the entire index is deleted and recreated.

Like I said, I haven't tested this fix extensively enough to say it will work everytime.

marcoscarceles commented 9 years ago

Hi @noamt,

as @sircelsiussexy I have been working on this and (as him) I got an implementation that based on configuration recreates the mapping if there was some sort of conflict. However, as I mentioned I think that a more future proof alternative would be to follow the default approach recommnded by elasticsearch itself here:

http://www.elasticsearch.org/blog/changing-mapping-with-zero-downtime/

I'm working on an implementation (fork) for our project that allows you to decide between different migration alternatives: none : current behaviour - throws exception delete : if an merging error happens, recreate the mapping, reindex based on config alias : version indexes with merge exceptions, reindex based on config.

It is getting quite too complex to create a PR without previous discussion, so I wanted to share the pseudocode that we were implementing to know your thoughts. We have some integrations tests around it and it is looking good so far (althought it's still WIP and requires some refinement). Also, if you believe there's a better channel for this, please let me know (I know other teams use waffle.io or gitter.im).

On SearchableClassMappingConfigurator.installMappings: Lines marked with + are existing code/path

    +For each mapping
        +if  !installedIndices.contains(scm.indexName)
            If scm.indexName does not exist
                If esConfig.migration.strategy is alias
                    Create index scm.indexName_v0
                    Create alias scm.indexName -> scm.indexName_v0
                Else
                    +Create index scm.indexName
        +Create mapping
        If there was an error
            Add mapping to list of failed mappings

    For each failed mapping
        If esConfig.migration.strategy is none
            +throw exception => Current behaviour
        If esConfig.migration.strategy is delete
            delete mapping
            If esConfig.migration.reindex is true
                Async - Reindex the docs on the mapping
                Mark the mapping as not to be indexed by bulkIndexOnStartup
        If strategy is alias
            If scm.indexName not marked as migrated
                mark as migrated (as there could be multiple failed mappings on the same index)
                If scm.indexName is an alias OR esConfig.migration.alias.ifIndexExists == delete
                    Create new version index (ie. scm.indexName_v6 if current is scm.indexName_v5 or scm.indexName_v0 if no versions exists)
                    If scm.indexName is an index (not an alias)
                        Delete scm.indexName, needed to create the correct alias

                If esConfig.migration.reindex is true
                    Async - Reindex the docs on the new index
                        -> On completion Point alias scm.indexName to scm.indexName_v*
                    Mark the mapping as not to be indexed by bulkIndexOnStartup
                Else
                    point alias scm.indexName to scm.indexName_v*
    ...

On ElasticsearchBootstrap: Lines marked with + are existing code/path

    +if bulkIndexOnStartup is true
        if no mapping excluded
            +index everything
        else
            index only non excluded indexes

This of course can be enhanced, as for example the current alias approach would require reindexing all documents on the index for all mappings whether it's mapping failed or not. Ie. index store and mappings book (~10000 documents) and author (~100 documents). An error on author would require an unnecessary reindexing 10000 books... This problem is also resolved on the link before and requires further use of aliases and storing multiple mappings on different indexes. Since this would require infra considerations (sharding) I left this for a future version and let the developer choose by config.

sircelsius commented 9 years ago

I'm impressed by the work you put into that, well played sir. :+1:

I'll look at your pseudo code in details tonight and let you know if I have any ideas / comments / questions.

sircelsius commented 9 years ago

Hi @marcos-carceles

I've read your work, it looks interesting. Here's a question that came to my mind while I was thinking about how to implement that:

Does the following code treat in any way the case were a failed mapping was needed (as a component for example) in another mapping?

     If there was an error
            Add mapping to list of failed mappings

My understanding is that should class B have an attribute of class A and the class A fail to be mapped due to a MergeMappingException, another exception will occur when trying to map class B. This might lead to other problems when creating this list of failed mappings.

So I'm not sure about treating each failed mapping after treating all the successful mappings. My implementation deleted the entire index and restarted when the first error was found.

Maybe the way to go should be the following:

try to create the mapping
upon catching the first MergeMappingException, restart mapping the entire set of mappings in a single alias index.
when the alias index is entirely mapped (and - if necessary - filled), replace the actual index with it.

My point is (and it may very well be that I did not entirely understand your idea) that I don't see the reason why we should have multiple alias indices. The MergeMappingException can only occur when the index already exists. Therefore the creation of the alias should only occur once, and should definitely delete any existing alias (although there should be none at that point).

How about this way of working? Once again, I may totally be missing the point .

+For each mapping
    +if  !installedIndices.contains(scm.indexName)
            +Create index scm.indexName
   try 
       +Create mapping
   catch MergeMappingException
        if esConfig.migration.strategy is none
             throw MergeMappingException
        if esConfig.migration.strategy is delete
            delete index
            restart mapping action
       if esConfig.migration.strategy is alias
           scm.indexName = scm.indexName  + "_alias"
           restart mapping action on alias index

if alias index exists
    replace original index with alias
...

This is missing the indexation of the data on the new index, I still don't know how to take that into consideration.

marcoscarceles commented 9 years ago

Hi @sircelsiussexy

My understanding is that should class B have an attribute of class A and the class A fail to be mapped due to a MergeMappingException, another exception will occur when trying to map class B.

Thank you for highlighting this. I wasn't aware of this behaviour. I thought that since A already existed on the index (although with an old inconsistent mapping) it would succeed. In that case it's true that errors would be better dealt with ASAP. In a way, since the whole index and related mappings are going to be recreated, the fact that B succeeds or not does not influence the end results at the end of the day, a new version of the index containing A and B is going to be created and both mappings rebuilt. It would be quicker to just skip mapping B on the old index completely (as in your example).

The reason why I postponed the migrations on a later loop was because an application could have multiple mappings on multiple indices and you could hit multiple mapping errors on the same index and I didn't wan't to create unused versions of the index... I think I'd explain myself better with an example:

I wanted to avoid

Index mypackage_v0 exists
Alias mypackage points to index mypackage_v0
Creating mapping for A (index mypackage, type a)
-> Genuine Error, not related to any other type
-> Migration creates mypackage_v1 and points alias to it
Creatng mapping for B (index mypackage, type a)
-> Genuine Error, not related to any other type
-> Migration creates mypackage_v2 and points alias to it

In this case we ended up with one too many indices.

Also, I intentionally didn't delete any indices or information, unless strictly necessary (ie. converting an index into an alias). Some teams may want to use this information in some other way. In the future we could have a config cleanup flag, so users can choose whether to delete old data. Regardless of the implementation I'd like to keep this behaviour.

I now see that my original approach needs refinement to save extra unnecessary mapping creations and yours would be quicker. I'll improve it asap. For the time being I have been creating some integration tests for all the cases I could think of. Not all cases are covered (still WIP) but all the existing ones are passing. I could keep them and use your algorithm.

Please have a look at them and let me know what you think:

https://github.com/marcos-carceles/elasticsearch-grails-plugin/blob/migration-alternatives/test/integration/org/grails/plugins/elasticsearch/mapping/MappingMigrationSpec.groovy

I'll keep working on repopulating the content on the newly created indices/mappings (the test case I'm working on right now) to have a truly zero downtime approach and as soon as I achieve that, include your optimisation.

Thank you very much for the heads up!

Cheers

marcoscarceles commented 9 years ago

I got a version passing all the tests here: https://github.com/marcos-carceles/elasticsearch-grails-plugin/tree/migration-alternatives

I'm going to test it in our app and apply @sircelsiussexy's improvement. Once happy I'll send a PR

Feel free to send me comments on it ;)

marcoscarceles commented 9 years ago

I believe #81 should fix this issue and be good to merge. Please have a look and drop me any comments