manticoresoftware / es2ms

Elasticsearch -> Manticore Search data migration tool
Apache License 2.0

Can create index but not able to migrate data #7

Closed wangchaoforever closed 7 months ago

wangchaoforever commented 1 year ago

When I run php migrator.php with --dryrun, it reports 1165439 docs in the index:

$ php migrator.php  --elasticsearch.host=192.168.0.195 --elasticsearch.port=9200 --indexes=csv_test --manticoresearch.host=192.168.0.191 --manticoresearch.port=9308 --manticoresearch.batch_size=10000 --threads=1 --dryrun
[2023-08-10T14:24:28]   Thread 0: Index csv_test: Getting index mapping
[2023-08-10T14:24:28]   Thread 0: Index csv_test: {"health":"yellow","status":"open","index":"csv_test","uuid":"L8FNgcPrTqW-GiwurHJnMQ","pri":"1","rep":"1","docs.count":"1165439","docs.deleted":"0","store.size":"926.8mb","pri.store.size":"926.8mb","mapping":{"properties":{"author_comment_count":{"type":"integer"},"comment_author":{"type":"text","fields":{"raw":{"type":"keyword"}}},"comment_id":{"type":"integer"},"comment_ranking":{"type":"integer"},"comment_text":{"type":"text"},"story_author":{"type":"text","fields":{"raw":{"type":"keyword"}}},"story_comment_count":{"type":"integer"},"story_id":{"type":"integer"},"story_text":{"type":"text"},"story_time":{"type":"integer"},"story_url":{"type":"text"}}},"type_mapping":{"author_comment_count":{"type":"integer"},"comment_author":{"type":"text"},"comment_id":{"type":"integer"},"comment_ranking":{"type":"integer"},"comment_text":{"type":"text"},"story_author":{"type":"text"},"story_comment_count":{"type":"integer"},"story_id":{"type":"integer"},"story_text":{"type":"text"},"story_time":{"type":"integer"},"story_url":{"type":"text"}}}

When I run elasticdump against the same index, it works fine:

$ NODE_OPTIONS=--max_old_space_size=4096  elasticdump --input=http://192.168.0.195:9200/ nfcorpus --limit 10 --noRefresh --type=data --output=$
(node:9042) NOTE: We are formalizing our plans to enter AWS SDK for JavaScript (v2) into maintenance mode in 2023.

Please migrate your code to use AWS SDK for JavaScript (v3).
For more information, check the migration guide at https://a.co/7PzMCcy
(Use `node --trace-warnings ...` to show where the warning was created)
{"_index":"csv_test","_type":"_doc","_id":"0","_score":1,"_source":{"story_id":3985069,"story_time":1337221782,"story_url":"http://androidcommunity.com/verizon-killing-off-grandfathered-unlimited-data-plans-this-summer-20120516/","story_text":"","story_author":"joedev","comment_id":3985756,"comment_text":"Makes sense for Verizon, and if they are being greedy, others will step in to fill the void.","comment_author":"Quizzy","comment_ranking":10,"author_comment_count":11,"story_comment_count":13}}
{"_index":"csv_test","_type":"_doc","_id":"1","_score":1,"_source":{"story_id":2481190,"story_time":1303738766,"story_url":"http://www.bbc.co.uk/news/magazine-13140772","story_text":"","story_author":"soitgoes","comment_id":2481521,"comment_text":"\"Made to play\" is a contradiction.","comment_author":"petervandijck","comment_ranking":9,"author_comment_count":1125,"story_comment_count":16}}
... ...

But when I run the actual migration, it creates the index and then imports no docs:

$ php migrator.php  --elasticsearch.host=192.168.0.195 --elasticsearch.port=9200 --indexes=csv_test --manticoresearch.host=192.168.0.191 --manticoresearch.port=9308 --manticoresearch.batch_size=10000 --threads=2
[2023-08-10T14:13:57]   Thread 0: Index csv_test: Getting index mapping
[2023-08-10T14:13:57]   Thread 0: Index csv_test: Creating index
[2023-08-10T14:13:57]   Thread 0: Index csv_test: Importing data
[2023-08-10T14:13:57]   Thread 0: Index csv_test: Imported 0 docs
[2023-08-10T14:13:57]   Thread 0: Index csv_test: Finished

sanikolaev commented 1 year ago

Hi @wangchaoforever

Thanks for letting us know about the issue. While we are looking into it, can you check whether you can sync your data into Manticore without es2ms? Manticore 6.0.4 can ingest data from Logstash, Beats, Fluent Bit and vector.dev. Hopefully, elasticdump can write into Manticore too.

sanikolaev commented 1 year ago

This issue is blocked by https://github.com/manticoresoftware/manticoresearch/issues/1368 where we want to look deeper into how Manticore can integrate with elasticdump.

sanikolaev commented 10 months ago

> This issue is blocked by https://github.com/manticoresoftware/manticoresearch/issues/1368

https://github.com/manticoresoftware/manticoresearch/issues/1368 is done.

Nick-S-2018 commented 10 months ago

Data migration from Elasticsearch to Manticore can now be done with the elasticdump tool. This feature is currently available in the dev version of Manticore (https://mnt.cr/nightly). We also recommend running the migration in two stages: migrate the schema (mapping) first, then the data itself. E.g.:

elasticdump --input=http://localhost:9200/your_index   --output=http://localhost:9308/your_index --type=mapping
elasticdump --input=http://localhost:9200/your_index   --output=http://localhost:9308/your_index --type=data

Otherwise, some data types may be converted incorrectly, as described here: https://github.com/manticoresoftware/buddy-plugin-insert/issues/6
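
If you have several indexes to move, the two stages can be wrapped in a small shell function so that the data stage never runs when the mapping stage fails. This is only a sketch: the function name migrate_index and the variables ES_URL, MS_URL and DRY_RUN are made up for illustration and are not part of elasticdump or Manticore; the elasticdump flags are the same ones used in the commands above.

```shell
#!/bin/sh
# Hypothetical helper: migrate one index in two stages, mapping first, then data.
# ES_URL, MS_URL and DRY_RUN are illustrative names (assumptions), not tool options.
ES_URL="${ES_URL:-http://localhost:9200}"
MS_URL="${MS_URL:-http://localhost:9308}"

migrate_index() {
    index="$1"
    for stage in mapping data; do
        if [ -n "${DRY_RUN:-}" ]; then
            # Dry run: print the command instead of executing it.
            echo "elasticdump --input=$ES_URL/$index --output=$MS_URL/$index --type=$stage"
        else
            # Stop at the first failing stage so data is never sent before the schema exists.
            elasticdump --input="$ES_URL/$index" --output="$MS_URL/$index" --type="$stage" || return 1
        fi
    done
}
```

After sourcing the file, setting DRY_RUN=1 and calling migrate_index your_index prints the two commands without running them, which is a quick way to check the URLs before starting a real migration.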

sanikolaev commented 7 months ago

Closing as done.