jprante / elasticsearch-knapsack

Knapsack plugin is an import/export tool for Elasticsearch
Apache License 2.0

Failed when exporting #1

Closed mdojwa closed 11 years ago

mdojwa commented 11 years ago

Hi,

I use ES 0.20.1. I installed this plugin. While exporting (curl -XPOST localhost:9200/users/chat/_export) I get this error:

[2012-12-10 14:41:15,427][INFO ][rest.action ] [Cameron Hodge] starting export to users_chat
Exception in thread "[Exporter Thread users_chat]" java.lang.NoClassDefFoundError: Could not initialize class org.xbib.io.StreamCodecService
    at org.elasticsearch.plugin.knapsack.io.tar.TarSession.<init>(TarSession.java:38)
    at org.elasticsearch.plugin.knapsack.io.tar.TarConnection.createSession(TarConnection.java:50)
    at org.elasticsearch.plugin.knapsack.io.tar.TarConnection.createSession(TarConnection.java:32)
    at org.elasticsearch.rest.action.RestExportAction$1.run(RestExportAction.java:119)

And nothing more happens.

jprante commented 11 years ago

Are you using Java 6?

mdojwa commented 11 years ago

It looks like: java version "1.6.0_18"

Will upgrading Java version solve this problem?

jprante commented 11 years ago

Yes, xbib stream codec jar is Java 7. I have to backport to the Java 6 plugin code. Will fix it.

Sorry for the inconvenience, I forgot to mention it in the README.
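
The Java requirement discussed above can be checked before installing. The following is a minimal sketch, not part of the plugin: it assumes the input is a version line in the style printed by `java -version`, and the helper names `java_major_version` and `meets_minimum` are hypothetical.

```python
import re


def java_major_version(version_output: str) -> int:
    """Return the major Java version found in a `java -version` style line."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        raise ValueError(f"no version string found in: {version_output!r}")
    first, second = int(match.group(1)), match.group(2)
    # Legacy scheme: "1.6.0_18" means Java 6. Newer scheme: "11.0.2" means Java 11.
    if first == 1 and second is not None:
        return int(second)
    return first


def meets_minimum(version_output: str, minimum: int = 7) -> bool:
    """True if the reported Java version is at least `minimum` (7, per this thread)."""
    return java_major_version(version_output) >= minimum
```

With the version reported in this thread, `meets_minimum('java version "1.6.0_18"')` is false, which matches the failure seen on Java 6.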

mdojwa commented 11 years ago

No problem, I am only testing it in a test environment for now. Thanks.

mdojwa commented 11 years ago

I upgraded Java to 7 and now export works fine, but I have a problem with importing. I get the following error while importing:

[2012-12-10 15:33:38,723][INFO ][rest.action ] [Ch'od] starting import of userstest
[2012-12-10 15:33:38,724][INFO ][rest.action ] [Ch'od] creating mapping userstest/login_history from import
[2012-12-10 15:33:38,724][ERROR][rest.action ] [Ch'od] [userstest] missing
org.elasticsearch.indices.IndexMissingException: [userstest] missing
    at org.elasticsearch.cluster.metadata.MetaData.convertFromWildcards(MetaData.java:602)
    at org.elasticsearch.cluster.metadata.MetaData.concreteIndices(MetaData.java:514)
    at org.elasticsearch.cluster.metadata.MetaData.concreteIndices(MetaData.java:497)
    at org.elasticsearch.action.admin.indices.mapping.put.TransportPutMappingAction.doExecute(TransportPutMappingAction.java:74)
    at org.elasticsearch.action.admin.indices.mapping.put.TransportPutMappingAction.doExecute(TransportPutMappingAction.java:41)
    at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:61)
    at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:47)
    at org.elasticsearch.client.node.NodeIndicesAdminClient.execute(NodeIndicesAdminClient.java:65)
    at org.elasticsearch.client.support.AbstractIndicesAdminClient.putMapping(AbstractIndicesAdminClient.java:285)
    at org.elasticsearch.rest.action.RestImportAction.createMapping(RestImportAction.java:218)
    at org.elasticsearch.rest.action.RestImportAction.access$800(RestImportAction.java:55)
    at org.elasticsearch.rest.action.RestImportAction$1.run(RestImportAction.java:135)

I should add that I have template mappings that automatically create mappings for newly created indices. Could this cause the problem?

Thank you.

jprante commented 11 years ago

Hmmm, something is wrong... there should be a line in the log "creating index userstest" before "creating mapping".

Yes, I need to check if template mappings work differently.

Can you check if the tar archive contains a userstest entry with _settings? It should be at the beginning. Like:

tar ztvf userstest_login_history.tar.gz  | head | grep userstest/_settings
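
The same check can be sketched in Python for cases where shell tools are not available. This is an illustration only: it assumes the archive is a gzip-compressed tar whose settings entry name starts with `<index>/_settings`, as suggested in this thread, and the function name `settings_entry_near_start` is hypothetical.

```python
import tarfile
from itertools import islice


def settings_entry_near_start(archive_path: str, index: str, limit: int = 10) -> bool:
    """True if an entry starting with '<index>/_settings' is among the first `limit` entries."""
    wanted = f"{index}/_settings"
    with tarfile.open(archive_path, "r:gz") as archive:
        # Iterating a TarFile reads members in archive order, so islice
        # stops after the first `limit` headers, like `head` in the shell command.
        return any(member.name.startswith(wanted) for member in islice(archive, limit))
```

For example, `settings_entry_near_start("userstest_login_history.tar.gz", "userstest")` would report whether the renamed archive carries settings under the new index name.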

mdojwa commented 11 years ago

I renamed the index, so I exported index /users and wanted to import it to /userstest.

The archive has users/_settings {"users":{"settings":{"index.number_of_replicas":"0","index.version.created":"190899","index.number_of_shards":"5"}}}

I was trying to do it like it is described in GitHub Readme: "You can import the file to a new index with renaming your file to test2_test2.tar.gz and executing the import command: ..."

Is this OK, or should I also rename the directory users to userstest inside the tar.gz archive?

jprante commented 11 years ago

No change in the tar is needed, just a rename. The idea is the index name from the import REST command overrides the index name in the tar. This command is supposed to work:

mv users_login_history.tar.gz userstest_login_history.tar.gz
curl -XPOST localhost:9200/userstest/login_history/_import

I will doublecheck if I missed something.

mdojwa commented 11 years ago

A simple mv did not work, but extracting the data, renaming users to userstest, and packing it again into a tar.gz fixed the problem. Now I have another error :)

[2012-12-10 15:56:24,525][INFO ][rest.action ] [MacPherran, Mary "Skeeter"] cluster 'yellow' check before import of userstest
[2012-12-10 15:56:24,542][INFO ][rest.action ] [MacPherran, Mary "Skeeter"] starting import of userstest
[2012-12-10 15:56:24,544][WARN ][rest.action ] [MacPherran, Mary "Skeeter"] skipping entry userstest/
[2012-12-10 15:56:24,544][INFO ][rest.action ] [MacPherran, Mary "Skeeter"] creating index userstest from import
[2012-12-10 15:56:24,594][INFO ][cluster.metadata ] [MacPherran, Mary "Skeeter"] [userstest] creating index, cause [api], shards [5]/[1], mappings [login_history, chat, left_msg, availability_stats]
[2012-12-10 15:56:24,724][INFO ][rest.action ] [MacPherran, Mary "Skeeter"] creating mapping userstest/availability_stats from import
[2012-12-10 15:56:24,725][ERROR][rest.action ] [MacPherran, Mary "Skeeter"] Validation Failed: 1: mapping source is missing;
org.elasticsearch.action.ActionRequestValidationException: Validation Failed: 1: mapping source is missing;
    at org.elasticsearch.action.ValidateActions.addValidationError(ValidateActions.java:29)
    at org.elasticsearch.action.admin.indices.mapping.put.PutMappingRequest.validate(PutMappingRequest.java:83)
    at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:55)
    at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:47)
    at org.elasticsearch.client.node.NodeIndicesAdminClient.execute(NodeIndicesAdminClient.java:65)
    at org.elasticsearch.client.support.AbstractIndicesAdminClient.putMapping(AbstractIndicesAdminClient.java:285)
    at org.elasticsearch.rest.action.RestImportAction.createMapping(RestImportAction.java:218)
    at org.elasticsearch.rest.action.RestImportAction.access$800(RestImportAction.java:55)
    at org.elasticsearch.rest.action.RestImportAction$1.run(RestImportAction.java:135)

jprante commented 11 years ago

This error is due to the repackaging. I did not run tests with tars that are not created with _export yet. There are directory entries and other things I need to handle more carefully.
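
If an archive does have to be rewritten by hand, one way to avoid introducing directory entries is to copy it entry by entry instead of extracting and repacking. The sketch below is an assumption-laden illustration, not the plugin's own method: the function name `rename_index_in_archive` is hypothetical, it only renames the leading path component, it does not touch index names inside entry contents such as `_settings`, and whether the plugin accepts the result has not been verified.

```python
import tarfile


def rename_index_in_archive(src_path: str, dst_path: str, old_index: str, new_index: str) -> int:
    """Copy a .tar.gz, renaming the leading '<old_index>/' path component.

    Directory entries are skipped and the order of file entries is preserved.
    Returns the number of entries written.
    """
    written = 0
    old_prefix = old_index + "/"
    with tarfile.open(src_path, "r:gz") as src, tarfile.open(dst_path, "w:gz") as dst:
        for member in src:
            if not member.isfile():
                continue  # skip directory entries and other non-regular members
            if member.name.startswith(old_prefix):
                member.name = new_index + "/" + member.name[len(old_prefix):]
                # A stored pax 'path' header would otherwise override the new name.
                member.pax_headers.pop("path", None)
            dst.addfile(member, src.extractfile(member))
            written += 1
    return written
```

Called as `rename_index_in_archive("users.tar.gz", "userstest.tar.gz", "users", "userstest")`, it writes a new archive and leaves the original untouched.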

mdojwa commented 11 years ago

I wanted to use this tool to copy index to another one :) Please let me know when it is possible. Thank you.

jprante commented 11 years ago

Just released 1.0.1 with a fix to get things going :)

mdojwa commented 11 years ago

Thank you :) It looks like it works now, but still not correctly. It imported only one type (chat) of 4 types (chat, login_history, left_msg, availability_stats) and imported only 682 chat documents of 15102. This is the log:

[2012-12-11 11:29:42,566][INFO ][rest.action ] [MacPherran, Mary "Skeeter"] cluster 'yellow' check before exporting to users
[2012-12-11 11:29:42,567][INFO ][rest.action ] [MacPherran, Mary "Skeeter"] starting export to users
[2012-12-11 11:29:44,436][INFO ][rest.action ] [MacPherran, Mary "Skeeter"] export to users completed
[2012-12-11 11:30:30,282][INFO ][rest.action ] [MacPherran, Mary "Skeeter"] cluster 'yellow' check before import of userstest
[2012-12-11 11:30:30,331][INFO ][rest.action ] [MacPherran, Mary "Skeeter"] starting import of userstest
[2012-12-11 11:30:30,337][INFO ][rest.action ] [MacPherran, Mary "Skeeter"] submitting new bulk request (100 docs, 1 requests currently active)
[2012-12-11 11:30:30,343][INFO ][rest.action ] [MacPherran, Mary "Skeeter"] submitting new bulk request (100 docs, 2 requests currently active)
[2012-12-11 11:30:30,348][INFO ][rest.action ] [MacPherran, Mary "Skeeter"] submitting new bulk request (100 docs, 3 requests currently active)
[2012-12-11 11:30:30,355][INFO ][rest.action ] [MacPherran, Mary "Skeeter"] submitting new bulk request (100 docs, 4 requests currently active)
[2012-12-11 11:30:30,360][INFO ][rest.action ] [MacPherran, Mary "Skeeter"] submitting new bulk request (100 docs, 5 requests currently active)
[2012-12-11 11:30:30,363][INFO ][rest.action ] [MacPherran, Mary "Skeeter"] submitting new bulk request (100 docs, 6 requests currently active)
[2012-12-11 11:30:30,366][INFO ][rest.action ] [MacPherran, Mary "Skeeter"] creating mapping userstest/left_msg from import
[2012-12-11 11:30:30,367][ERROR][rest.action ] [MacPherran, Mary "Skeeter"] [userstest] missing
org.elasticsearch.indices.IndexMissingException: [userstest] missing
    at org.elasticsearch.cluster.metadata.MetaData.convertFromWildcards(MetaData.java:602)
    at org.elasticsearch.cluster.metadata.MetaData.concreteIndices(MetaData.java:514)
    at org.elasticsearch.cluster.metadata.MetaData.concreteIndices(MetaData.java:497)
    at org.elasticsearch.action.admin.indices.mapping.put.TransportPutMappingAction.doExecute(TransportPutMappingAction.java:74)
    at org.elasticsearch.action.admin.indices.mapping.put.TransportPutMappingAction.doExecute(TransportPutMappingAction.java:41)
    at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:61)
    at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:47)
    at org.elasticsearch.client.node.NodeIndicesAdminClient.execute(NodeIndicesAdminClient.java:65)
    at org.elasticsearch.client.support.AbstractIndicesAdminClient.putMapping(AbstractIndicesAdminClient.java:285)
    at org.elasticsearch.rest.action.RestImportAction.createMapping(RestImportAction.java:218)
    at org.elasticsearch.rest.action.RestImportAction.access$800(RestImportAction.java:55)
    at org.elasticsearch.rest.action.RestImportAction$1.run(RestImportAction.java:135)
[2012-12-11 11:30:30,368][INFO ][rest.action ] [MacPherran, Mary "Skeeter"] submitting new bulk request (82 docs, 7 requests currently active)
[2012-12-11 11:30:30,368][INFO ][rest.action ] [MacPherran, Mary "Skeeter"] waiting for 7 active bulk requests
[2012-12-11 11:30:30,421][INFO ][cluster.metadata ] [MacPherran, Mary "Skeeter"] [userstest] creating index, cause [auto(bulk api)], shards [5]/[1], mappings [login_history, chat, left_msg, availability_stats]
[2012-12-11 11:30:31,064][INFO ][rest.action ] [MacPherran, Mary "Skeeter"] bulk request success (700 millis, 100 docs, total of 100 docs)
[2012-12-11 11:30:31,076][INFO ][rest.action ] [MacPherran, Mary "Skeeter"] waiting for 6 active bulk requests
[2012-12-11 11:30:31,078][INFO ][rest.action ] [MacPherran, Mary "Skeeter"] bulk request success (694 millis, 82 docs, total of 182 docs)
[2012-12-11 11:30:31,078][INFO ][rest.action ] [MacPherran, Mary "Skeeter"] waiting for 5 active bulk requests
[2012-12-11 11:30:31,211][INFO ][rest.action ] [MacPherran, Mary "Skeeter"] bulk request success (851 millis, 100 docs, total of 282 docs)
[2012-12-11 11:30:31,216][INFO ][rest.action ] [MacPherran, Mary "Skeeter"] bulk request success (878 millis, 100 docs, total of 382 docs)
[2012-12-11 11:30:31,217][INFO ][rest.action ] [MacPherran, Mary "Skeeter"] waiting for 4 active bulk requests
[2012-12-11 11:30:31,228][INFO ][rest.action ] [MacPherran, Mary "Skeeter"] bulk request success (872 millis, 100 docs, total of 482 docs)
[2012-12-11 11:30:31,232][INFO ][rest.action ] [MacPherran, Mary "Skeeter"] waiting for 2 active bulk requests
[2012-12-11 11:30:31,235][INFO ][rest.action ] [MacPherran, Mary "Skeeter"] bulk request success (887 millis, 100 docs, total of 582 docs)
[2012-12-11 11:30:31,240][INFO ][rest.action ] [MacPherran, Mary "Skeeter"] bulk request success (897 millis, 100 docs, total of 682 docs)
[2012-12-11 11:30:31,241][INFO ][rest.action ] [MacPherran, Mary "Skeeter"] waiting for 1 active bulk requests
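
A log like the one above can be tallied to see whether documents were lost inside bulk requests or never submitted at all. The sketch below assumes only the two message shapes that appear in this log ("submitting new bulk request (N docs, ..." and "bulk request success (T millis, N docs, ..."); the function name `summarize_bulk_log` is hypothetical.

```python
import re

# Message shapes taken from the log lines quoted in this thread.
SUBMIT = re.compile(r"submitting new bulk request \((\d+) docs")
SUCCESS = re.compile(r"bulk request success \(\d+ millis, (\d+) docs")


def summarize_bulk_log(lines):
    """Return (docs_submitted, docs_acknowledged) summed over the given log lines."""
    submitted = 0
    acknowledged = 0
    for line in lines:
        submit_match = SUBMIT.search(line)
        if submit_match:
            submitted += int(submit_match.group(1))
            continue
        success_match = SUCCESS.search(line)
        if success_match:
            acknowledged += int(success_match.group(1))
    return submitted, acknowledged
```

In the log above, the submitted batches add up to 682 documents (six batches of 100 plus one of 82), the same as the final "total of 682 docs". That suggests the shortfall against 15102 happened before submission, plausibly because the import stopped reading the archive after the exception, rather than inside the bulk requests.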

I appreciate your help with this. I think this tool is the best way I currently know of to change the mappings of an index (e.g. changing a list to nested). I create a new mapping template and copy the data from the old index to a new one (whose new mapping is compatible with the old data), in this case from users to userstest. I think this tool would solve this problem :)

jprante commented 11 years ago

Thanks for the feedback!

Slowly it evolves.

I hope 1.0.2 is more useful now.

mdojwa commented 11 years ago

Hi,

This time it imported all types but still not all documents. It imported 52223 of 97351 documents. The only error I found is:

[2012-12-12 10:08:07,593][ERROR][rest.action ] [Nebulo] Merge failed with failures {[mapper [time_to] has different index values, mapper [time_to] has different store values]}
org.elasticsearch.index.mapper.MergeMappingException: Merge failed with failures {[mapper [time_to] has different index values, mapper [time_to] has different store values]}
    at org.elasticsearch.cluster.metadata.MetaDataMappingService$4.execute(MetaDataMappingService.java:317)
    at org.elasticsearch.cluster.service.InternalClusterService$2.run(InternalClusterService.java:223)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
    at java.lang.Thread.run(Thread.java:722)

This error appeared only once.

But it gets better every time :) Keep going :)

Best regards.

jprante commented 11 years ago

Dynamic mapping is going on, that is kind of strange. Something is going on I have not been considering yet.

In the tar archive, the order of exported documents is not necessarily the order of insertion, so "merge failed with failures" could be caused by different mapping interpretations.

Bulk insertions are submitted asynchronously, which implies the possibility of another document ordering when they are imported.

Could this be related to a remapping?

If you import more than one index but remap the import to a single new index with the same type, all docs with the same ID will collapse into one doc in that new index. That could explain why there are fewer docs.
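
The collapse described above can be illustrated with a toy model. This is not Elasticsearch code: it only assumes that a document in the target index is identified by its type and ID, so a later document with the same pair overwrites the earlier one. The function name `merge_into_single_index` and the sample data are hypothetical.

```python
def merge_into_single_index(sources):
    """Simulate importing several indices into one target index.

    `sources` maps an index name to a list of (type, id, document) tuples.
    Returns the target index as a dict keyed by (type, id); a later document
    with the same key replaces the earlier one.
    """
    target = {}
    for _index_name, docs in sources.items():
        for doc_type, doc_id, body in docs:
            target[(doc_type, doc_id)] = body
    return target


# Two source indices that reuse the same IDs for the same type.
sources = {
    "users_a": [("chat", "1", {"msg": "a1"}), ("chat", "2", {"msg": "a2"})],
    "users_b": [("chat", "1", {"msg": "b1"}), ("chat", "2", {"msg": "b2"})],
}
merged = merge_into_single_index(sources)
print(len(merged))  # fewer documents than the four that were imported
```

Four documents go in, but only two distinct (type, ID) pairs remain in the target, which is the kind of shortfall the explanation refers to.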

mdojwa commented 11 years ago

I import only one index (/users) into a new index (/userstest). The only difference in the mappings is that in /userstest the 'operators' list is nested and in /users it is not. All the data is compatible with the new mapping. There is no document ID collision because I am importing from a single index.

jprante commented 11 years ago

I exported the settings in the wrong format. I think 1.0.3 works better.

mdojwa commented 11 years ago

Hi,

Thank you for your help. I found an easier way for reindexing (copying data between indices): http://blogs.perl.org/users/clinton_gormley/2011/04/elasticsearchpm-v036-now-with-extra-sugar.html

But I will check your version today and let you know if it works fine. If not I still can help with testing :)

mdojwa commented 11 years ago

It seems that plugin installation does not work now:

plugin -install jprante/elasticsearch-knapsack/1.0.3
-> Installing jprante/elasticsearch-knapsack/1.0.3...
Trying https://github.com/downloads/jprante/elasticsearch-knapsack/elasticsearch-knapsack-1.0.3.zip...
Trying https://github.com/jprante/elasticsearch-knapsack/zipball/v1.0.3...
Failed to install jprante/elasticsearch-knapsack/1.0.3, reason: failed to download

jprante commented 11 years ago

Yes. GitHub disabled uploads.

https://github.com/blog/1302-goodbye-uploads

I offered a workaround to the PluginManager to re-enable auto downloads

https://github.com/elasticsearch/elasticsearch/pull/2477

In the meantime, please use the long format

./bin/plugin -url https://raw.github.com/jprante/elasticsearch-knapsack/master/downloads/elasticsearch-knapsack-1.0.3.zip -install knapsack

mdojwa commented 11 years ago

Now it's working fine :) Thanks.