wxf4150 / esdump

Elasticsearch dump import export

Re-import of same index exported via esdump fails with error #3

Closed mrcdb closed 2 years ago

mrcdb commented 2 years ago

Hi,

I am trying to use the esdump tool for both exporting my ES indexes and re-importing them if needed.

The export process works just as expected for index_name, with the command:

./esdump export --index index_name  -o /var/archive/index_name.json.gz

but the re-import in new_index_name via

./esdump import --index new_index_name  -i /var/archive/index_name.json.gz

fails with the following error in stdout:

2021/11/09 15:14:20 row count 1034
2021/11/09 15:14:20 elastic: Error 400 (Bad Request): Validation Failed: 1: type is missing;2: type is missing;3: type is missing;4: type is missing;5: type is missing;6: type is missing;7: type is missing;8: type is missing;9: type is missing;10: type is missing;11: type is missing;12: type is missing;13: type is missing;14: type is missing;15: type is missing;16: type is missing;17: type is missing;18: type is missing;19: type is missing;20: type is missing;21: type is missing;22: type is missing;23: type is missing;24: type is missing;25: type is missing;26: type is missing;27: type is missing;28: type is missing;29: type is missing;30: type is missing;31: type is missing;32: type is missing;33: type is missing;34: type is missing;35: type is missing;36: type is missing;37: type is missing;38: type is missing;39: type is missing;40: type is missing;41: type is missing;42: type is missing;43: type is missing;44: type is missing;45: type is missing;46: type is missing;47: type is missing;48: type is missing;49: type is missing;50: type is missing;51: type is missing;52: type is missing;53: type is missing;54: type is missing;55: type is missing;56: type is missing;57: type is missing;58: type is missing;59: type is missing;60: type is missing;61: type is missing;62: type is missing;63: type is missing;

The error repeats indefinitely, so I have to stop the script altogether. I am not sure whether I should perform any preliminary configuration on the index new_index_name (e.g. for the index mapping). Can you help with this?

wxf4150 commented 2 years ago

You should create the mapping for new_index_name first. In that mapping you can customize the per-field configuration: which fields are indexed or disabled, which fields can be used for sorting, which fields are of keyword type, and so on.

A customized index is more useful: indexing is faster, it saves space, and search results are more exact.

mrcdb commented 2 years ago

Thanks for the clarification. Is it possible to use esdump to extract the index mapping from the (old) index_name?

vfiset commented 2 years ago

Hi @wxf4150! Sorry to hijack @mrcdb's issue, but I stumbled upon the same problem even though the mapping is there.

My goal is to use your super fast program to copy over my indexes from a source cluster to a destination cluster.

here are my steps:

  1. Create the index on the destination ES
  2. Copy the index mapping from source cluster to destination cluster
  3. ./esdump export --index my_index1 --es http://source.es:9200 -o ./my_index1.gz
  4. ./esdump import --index my_index1 --es http://destination.es:9200 -i ./my_index1.gz

When running the ./esdump import command I get what @mrcdb is reporting:

elastic: Error 400 (Bad Request): Validation Failed: 1: type is missing;2: type is missing;3...

I would have expected this to work, since I have the same mapping on the destination cluster.

vfiset commented 2 years ago

most likely my problem is that I am using an old ES version (2.3) and you are using github.com/olivere/elastic/v7

edit: I've forked your code and used olivere/elastic/v3 with the same result. I am puzzled !

wxf4150 commented 2 years ago

> most likely my problem is that I am using an old ES version (2.3) and you are using github.com/olivere/elastic/v7
>
> edit: I've forked your code and used olivere/elastic/v3 with the same result. I am puzzled !

In ES v7, documents are indexed under "index_name/{type_name}", where type_name is fixed to the default "_doc".
In ES v2.3, documents are also indexed under "index_name/{type_name}", but type_name is a variable value; you can set it equal to index_name.

When indexing, the elastic v3 client must be given two parameters: index_name and type_name.

When you use esdump with the elastic v3 client to import, the code at BulkIndexRequest should set type_name too: https://github.com/wxf4150/esdump/blob/c2cc0d0a831b882faaf668c5319af98c9e5bfb72/cmds/cmds.go#L97

It should be: req.Id(item.ID).Doc(item.RawData).Type(xxx)

@vfiset
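
To make the difference concrete, here is a minimal sketch of the action line that precedes each document in an Elasticsearch `_bulk` request body. The helper name `bulk_action_line` is hypothetical and not part of esdump, and it assumes the index, type, and id values contain no characters that need JSON escaping. On ES 2.x the `_type` field is expected in this line, which is most likely what "type is missing" refers to; on ES 7 it is left out.

```shell
#!/bin/sh
# Hypothetical helper (not part of esdump): print the action line of a
# bulk "index" operation. Pass TYPE for ES 2.x; omit it for ES 7.
# Assumes the arguments need no JSON escaping.
bulk_action_line() {
  index=$1
  id=$2
  doc_type=${3:-}
  if [ -n "$doc_type" ]; then
    printf '{"index":{"_index":"%s","_type":"%s","_id":"%s"}}\n' "$index" "$doc_type" "$id"
  else
    printf '{"index":{"_index":"%s","_id":"%s"}}\n' "$index" "$id"
  fi
}

# ES 2.x style, using the index name as the type name
bulk_action_line my_index1 42 my_index1
# ES 7 style, no type
bulk_action_line my_index1 42
```

Calling `.Type(...)` on the bulk request in the Go client should have the same effect as passing the third argument here.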

wxf4150 commented 2 years ago

> Thanks for the clarification. Is it possible to use esdump to extract the index mapping from the (old) index_name?

Later, I will add an option to copy the index mapping config to the destination.

vfiset commented 2 years ago

Thanks @wxf4150, your program is so fast compared to the nodejs implementation, it blows my mind!

wxf4150 commented 2 years ago

> Thanks for the clarification. Is it possible to use esdump to extract the index mapping from the (old) index_name?

Later, I will add an option to copy the index mapping config to the destination.

Creating an index uses a simple RESTful API, but the request can carry a lot of settings. https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-create-index.html

Example:

#get index mapping
curl  "localhost:9200/test/_mapping?pretty"

response:
{
  "tokens1234": {
    "mappings": {
      "properties": {
        "field1": {
          "type": "text"
        }
      }
    }
  }
}

# create a new index with default settings
curl -X PUT "localhost:9200/test"

# or create the index with explicit settings and field mappings in one request
curl -X PUT "localhost:9200/test?pretty" -H 'Content-Type: application/json' -d'
{
  "settings": {
    "number_of_shards": 1
  },
  "mappings": {
    "properties": {
      "field1": { "type": "text" }
    }
  }
}
'

This shell script copies only the index mapping from source_host to dest_host, under a new index name:

source_host=localhost
dest_host=localhost
source_index_name=tokens
dest_index_name=tokens1234
# jq can be installed with "yum install jq" or "apt install jq"
# --arg with .[$idx] also handles index names containing "-" or "."
mapping_json=$(curl -s "http://$source_host:9200/$source_index_name/_mapping" | jq --arg idx "$source_index_name" '.[$idx].mappings')

# create the index on dest_host
curl -X PUT "http://$dest_host:9200/$dest_index_name"
# update the index mapping
curl -X PUT "http://$dest_host:9200/$dest_index_name/_mapping?pretty" -H 'Content-Type: application/json' -d @- <<EOF
$mapping_json
EOF

# check dest_index_name
curl "http://$dest_host:9200/$dest_index_name"

mrcdb commented 2 years ago

Hi @wxf4150 , can your tool work with ES6 or does it support ES7 only ? I am trying to run the shell script above to copy index mapping but it fails due to the changes in API.

wxf4150 commented 2 years ago

I have only tested against an ES7 server. ES6 would need more research.
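
For ES6, one likely difference is that mappings are nested under a type name, both in the `_mapping` response and in the update URL. This is an assumption based on the 6.x mapping APIs and is untested with esdump. A sketch of how the mapping-copy script above might branch on the major version follows; the function names and the `ES_MAJOR` and `TYPE` parameters are hypothetical, `_doc` is only a guessed default type, and the destination index is assumed to exist already.

```shell
#!/bin/sh
# Hypothetical helpers; ES6 behaviour is assumed, not tested with esdump.

# jq filter that extracts the mapping body from a GET <index>/_mapping
# response. The filter expects jq to be run with --arg idx and --arg doctype.
# Usage: mapping_filter ES_MAJOR
mapping_filter() {
  if [ "$1" -le 6 ]; then
    printf '%s\n' '.[$idx].mappings[$doctype]'
  else
    printf '%s\n' '.[$idx].mappings'
  fi
}

# URL for updating the mapping of an existing index.
# Usage: put_mapping_url ES_MAJOR HOST INDEX [TYPE]
put_mapping_url() {
  if [ "$1" -le 6 ]; then
    printf 'http://%s:9200/%s/_mapping/%s\n' "$2" "$3" "${4:-_doc}"
  else
    printf 'http://%s:9200/%s/_mapping\n' "$2" "$3"
  fi
}

# Copy a mapping between clusters (needs curl and jq; defined but not run here).
# Usage: copy_mapping ES_MAJOR SRC_HOST SRC_INDEX DEST_HOST DEST_INDEX [TYPE]
copy_mapping() {
  doc_type=${6:-_doc}
  curl -s "http://$2:9200/$3/_mapping" |
    jq --arg idx "$3" --arg doctype "$doc_type" "$(mapping_filter "$1")" |
    curl -s -X PUT "$(put_mapping_url "$1" "$4" "$5" "$doc_type")" \
      -H 'Content-Type: application/json' -d @-
}

# Show the two URL shapes without contacting any server
put_mapping_url 6 localhost tokens1234 _doc
put_mapping_url 7 localhost tokens1234
```

With real hosts, something like `copy_mapping 6 localhost tokens localhost tokens1234 my_type` would be the ES6 counterpart of the ES7 script, where `my_type` stands for whatever type name the source index actually uses.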

mrcdb commented 2 years ago

Ok, thanks! Nice work with the tool, btw.