moshe / elasticsearch_loader

A tool for batch loading data files (json, parquet, csv, tsv) into ElasticSearch
MIT License

Rejecting mapping update to [audits] as the final mapping would have more than 1 type #82

Closed molinto closed 4 years ago

molinto commented 4 years ago

Command:

elasticsearch_loader --index-settings-file audit_mapping.json --index audits --http-auth elastic:PASSWORD --type audit csv audit.csv

MAPPING FILE (audit_mapping.json):

{
  "settings" : {
  },
  "mappings": {
    "properties": {
      "@timestamp": {
        "type": "date"
      },
      "module": {
        "type": "keyword"
      },
      "action": {
        "type": "keyword"
      },
      "occured_at": {
        "type": "date",
        "format": "yyyy-MM-dd HH:mm:ss"
      }
    }
  }
}

CSV file (audit.csv):

"id";"module";"action";"description";"level";"occured_at";"created_user";"createdAt";"updatedAt"
"001677f8-6e03-4ea2-bc5c-442e19477189";"firewall";"activated";"Local activated on lan6";0;"2019-03-08 13:24:11";"";"2019-03-08 13:24:11";"2019-03-08 13:24:11"
"001e28e3-8f3a-40a1-9d74-2f1720f9612d";"analysis";"initialising";"Ready.";0;"2019-03-20 16:15:46";"";"2019-03-20 16:15:46";"2019-03-20 16:15:46"
"0022d1da-398e-4026-ac58-899201fb36e9";"AuthorisationService";"Login";"Logging in ggreen user";0;"2019-03-19 14:52:54";"";"2019-03-19 14:52:54";"2019-03-19 14:52:54"
"0044084a-a749-4aa6-af05-4deba6a38432";"UserService";"UnBlock";"UnBlock user: ppurple";0;"2019-03-19 15:37:31";"ad223ccd-b6f9-4bd8-868d-4fae2e95d9a2";"2019-03-19 15:37:31";"2019-03-19 15:37:31"

Errors:

elasticsearch_loader --index-settings-file integra_audit_mapping.json --index audits --http-auth elastic:PASSWORD --type audit csv integra_audit2.csv
{'index_settings_file': <_io.BufferedReader name='integra_audit_mapping.json'>, 'index': 'audits', 'http_auth': 'elastic:PASSWORD', 'type': 'audit', 'bulk_size': 500, 'es_host': ('http://localhost:9200',), 'verify_certs': False, 'use_ssl': False, 'ca_certs': None, 'delete': False, 'update': False, 'progress': False, 'id_field': None, 'as_child': False, 'with_retry': False, 'timeout': 10.0, 'encoding': 'utf-8', 'keys': [], 'es_conn': <Elasticsearch([{'host': 'localhost', 'port': 9200}])>}
2020-01-30 15:07:40.873155 ERROR attempt [1/1] got exception, it is a permanent data loss, no retry any more
2020-01-30 15:07:40.873255 WARN Chunk 0 got exception (('4 document(s) failed to index.', [{'index': {'_index': 'audits', '_type': 'audit', '_id': 'lXT-9m8BOz4WhPcWVe2O', 'status': 400, 'error': {'type': 'illegal_argument_exception', 'reason': 'Rejecting mapping update to [audits] as the final mapping would have more than 1 type: [_doc, audit]'}, 'data': OrderedDict([('id;"module";"action";"description";"level";"occured_at";"created_user";"createdAt";"updatedAt"', '001677f8-6e03-4ea2-bc5c-442e19477189;"firewall";"activated";"Local activated on lan6";0;"2019-03-08 13:24:11";"";"2019-03-08 13:24:11";"2019-03-08 13:24:11"')])}}, {'index': {'_index': 'audits', '_type': 'audit', '_id': 'lnT-9m8BOz4WhPcWVe2O', 'status': 400, 'error': {'type': 'illegal_argument_exception', 'reason': 'Rejecting mapping update to [audits] as the final mapping would have more than 1 type: [_doc, audit]'}, 'data': OrderedDict([('id;"module";"action";"description";"level";"occured_at";"created_user";"createdAt";"updatedAt"', '001e28e3-8f3a-40a1-9d74-2f1720f9612d;"analysis";"initialising";"Ready.";0;"2019-03-20 16:15:46";"";"2019-03-20 16:15:46";"2019-03-20 16:15:46"')])}}, {'index': {'_index': 'audits', '_type': 'audit', '_id': 'l3T-9m8BOz4WhPcWVe2O', 'status': 400, 'error': {'type': 'illegal_argument_exception', 'reason': 'Rejecting mapping update to [audits] as the final mapping would have more than 1 type: [_doc, audit]'}, 'data': OrderedDict([('id;"module";"action";"description";"level";"occured_at";"created_user";"createdAt";"updatedAt"', '0022d1da-398e-4026-ac58-899201fb36e9;"AuthorisationService";"Login";"Logging in ggreen user";0;"2019-03-19 14:52:54";"";"2019-03-19 14:52:54";"2019-03-19 14:52:54"')])}}, {'index': {'_index': 'audits', '_type': 'audit', '_id': 'mHT-9m8BOz4WhPcWVe2O', 'status': 400, 'error': {'type': 'illegal_argument_exception', 'reason': 'Rejecting mapping update to [audits] as the final mapping would have more than 1 type: [_doc, audit]'}, 'data': 
OrderedDict([('id;"module";"action";"description";"level";"occured_at";"created_user";"createdAt";"updatedAt"', '0044084a-a749-4aa6-af05-4deba6a38432;"UserService";"UnBlock";"UnBlock user: ppurple";0;"2019-03-19 15:37:31";"ad223ccd-b6f9-4bd8-868d-4fae2e95d9a2";"2019-03-19 15:37:31";"2019-03-19 15:37:31"')])}}])) while processing
Traceback (most recent call last):
  File "/home/sharry/.local/bin/elasticsearch_loader", line 11, in <module>
    load_entry_point('elasticsearch-loader==0.6.0', 'console_scripts', 'elasticsearch_loader')()
  File "/home/sharry/.local/lib/python3.7/site-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "/home/sharry/.local/lib/python3.7/site-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "/home/sharry/.local/lib/python3.7/site-packages/click/core.py", line 1137, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/sharry/.local/lib/python3.7/site-packages/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/sharry/.local/lib/python3.7/site-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/home/sharry/.local/lib/python3.7/site-packages/click/decorators.py", line 17, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/home/sharry/.local/lib/python3.7/site-packages/elasticsearch_loader/__init__.py", line 134, in _csv
    load(lines, ctx.obj)
  File "/home/sharry/.local/lib/python3.7/site-packages/elasticsearch_loader/__init__.py", line 53, in load
    single_bulk_to_es(bulk, config, config['with_retry'])
  File "/home/sharry/.local/lib/python3.7/site-packages/elasticsearch_loader/__init__.py", line 37, in single_bulk_to_es
    raise e
  File "/home/sharry/.local/lib/python3.7/site-packages/elasticsearch_loader/__init__.py", line 28, in single_bulk_to_es
    helpers.bulk(config['es_conn'], bulk, chunk_size=config['bulk_size'])
  File "/home/sharry/.local/lib/python3.7/site-packages/elasticsearch/helpers/actions.py", line 304, in bulk
    for ok, item in streaming_bulk(client, actions, *args, **kwargs):
  File "/home/sharry/.local/lib/python3.7/site-packages/elasticsearch/helpers/actions.py", line 234, in streaming_bulk
    **kwargs
  File "/home/sharry/.local/lib/python3.7/site-packages/elasticsearch/helpers/actions.py", line 162, in _process_bulk_chunk
    raise BulkIndexError("%i document(s) failed to index." % len(errors), errors)
elasticsearch.helpers.errors.BulkIndexError: ('4 document(s) failed to index.', [{'index': {'_index': 'audits', '_type': 'audit', '_id': 'lXT-9m8BOz4WhPcWVe2O', 'status': 400, 'error': {'type': 'illegal_argument_exception', 'reason': 'Rejecting mapping update to [audits] as the final mapping would have more than 1 type: [_doc, audit]'}, 'data': OrderedDict([('id;"module";"action";"description";"level";"occured_at";"created_user";"createdAt";"updatedAt"', '001677f8-6e03-4ea2-bc5c-442e19477189;"firewall";"activated";"Local activated on lan6";0;"2019-03-08 13:24:11";"";"2019-03-08 13:24:11";"2019-03-08 13:24:11"')])}}, {'index': {'_index': 'audits', '_type': 'audit', '_id': 'lnT-9m8BOz4WhPcWVe2O', 'status': 400, 'error': {'type': 'illegal_argument_exception', 'reason': 'Rejecting mapping update to [audits] as the final mapping would have more than 1 type: [_doc, audit]'}, 'data': OrderedDict([('id;"module";"action";"description";"level";"occured_at";"created_user";"createdAt";"updatedAt"', '001e28e3-8f3a-40a1-9d74-2f1720f9612d;"analysis";"initialising";"Ready.";0;"2019-03-20 16:15:46";"";"2019-03-20 16:15:46";"2019-03-20 16:15:46"')])}}, {'index': {'_index': 'audits', '_type': 'audit', '_id': 'l3T-9m8BOz4WhPcWVe2O', 'status': 400, 'error': {'type': 'illegal_argument_exception', 'reason': 'Rejecting mapping update to [audits] as the final mapping would have more than 1 type: [_doc, audit]'}, 'data': OrderedDict([('id;"module";"action";"description";"level";"occured_at";"created_user";"createdAt";"updatedAt"', '0022d1da-398e-4026-ac58-899201fb36e9;"AuthorisationService";"Login";"Logging in ggreen user";0;"2019-03-19 14:52:54";"";"2019-03-19 14:52:54";"2019-03-19 14:52:54"')])}}, {'index': {'_index': 'audits', '_type': 'audit', '_id': 'mHT-9m8BOz4WhPcWVe2O', 'status': 400, 'error': {'type': 'illegal_argument_exception', 'reason': 'Rejecting mapping update to [audits] as the final mapping would have more than 1 type: [_doc, audit]'}, 'data': 
OrderedDict([('id;"module";"action";"description";"level";"occured_at";"created_user";"createdAt";"updatedAt"', '0044084a-a749-4aa6-af05-4deba6a38432;"UserService";"UnBlock";"UnBlock user: ppurple";0;"2019-03-19 15:37:31";"ad223ccd-b6f9-4bd8-868d-4fae2e95d9a2";"2019-03-19 15:37:31";"2019-03-19 15:37:31"')])}}])
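
The bulk error above is hard to scan because every failed document repeats the same reason. As a rough sketch (not part of elasticsearch_loader), the failed items could be tallied by reason. The `summarize_bulk_errors` helper and the trimmed `errors` sample are illustrative assumptions modeled on the structure printed in the log. Every item here reports the same conflict between the `_doc` type already on the index and the `audit` type passed via --type.

```python
from collections import Counter


def summarize_bulk_errors(errors):
    """Count failed bulk items by (error type, reason)."""
    counts = Counter()
    for item in errors:
        # Each item is keyed by the bulk action name, e.g. 'index'.
        for details in item.values():
            error = details.get("error", {})
            counts[(error.get("type"), error.get("reason"))] += 1
    return counts


# Trimmed sample shaped like the list in the BulkIndexError ('data' omitted).
REASON = ("Rejecting mapping update to [audits] as the final mapping "
          "would have more than 1 type: [_doc, audit]")
errors = [
    {"index": {"_index": "audits", "_type": "audit", "_id": doc_id, "status": 400,
               "error": {"type": "illegal_argument_exception", "reason": REASON}}}
    for doc_id in ("lXT-9m8BOz4WhPcWVe2O", "lnT-9m8BOz4WhPcWVe2O")
]

summary = summarize_bulk_errors(errors)
for (error_type, reason), count in summary.items():
    print(f"{count} x {error_type}: {reason}")
```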

molinto commented 4 years ago

COMMAND:

elasticsearch_loader --index-settings-file audit_mapping.json --index audits --http-auth elastic:PASSWORD --type _doc csv audit.csv

MAPPING FILE (audit_mapping.json):

{
  "settings" : {
  },
  "mappings": {
    "properties": {
      "id": {
        "type": "keyword"
      },
      "module": {
        "type": "keyword"
      },
      "action": {
        "type": "keyword"
      },
      "occured_at": {
        "type": "date",
        "format": "yyyy-MM-dd HH:mm:ss"
      }
    }
  }
}

This seems to produce the following mapping for the audits index, as shown under Index Management in Kibana:

{
  "mapping": {
    "properties": {
      "action": {
        "type": "keyword"
      },
      "id": {
        "type": "keyword"
      },
      "id;\"module\";\"action\";\"description\";\"level\";\"occured_at\";\"created_user\";\"createdAt\";\"updatedAt\"": {
        "type": "text",
        "fields": {
          "keyword": {
            "type": "keyword",
            "ignore_above": 256
          }
        }
      },
      "module": {
        "type": "keyword"
      },
      "occured_at": {
        "type": "date",
        "format": "yyyy-MM-dd HH:mm:ss"
      }
    }
  }
}

I was hoping --keys would get around this, but it doesn't seem to do anything. Not sure if that's normal.

molinto commented 4 years ago

This works though :)

elasticsearch_loader --index-settings-file audit_mapping.json --index audits --http-auth elastic:PASSWORD --keys id,action,module,occured_at --type _doc csv audit.csv

molinto commented 4 years ago

Maybe not: it only populates the id in ElasticSearch, and all other fields are empty :(
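
One possible explanation, sketched under the assumption that --keys keeps only fields whose names exactly match the listed keys (the `pick_keys` helper below is hypothetical, not the loader's actual code): each row in the error output above has a single field whose name is the entire header line, so none of the requested keys match and the document body ends up empty.

```python
def pick_keys(row, keys):
    """Keep only the fields of `row` whose names appear in `keys`."""
    return {name: value for name, value in row.items() if name in keys}


# Row shaped like the 'data' entry in the error output:
# a single field, named after the whole header line.
row = {
    'id;"module";"action";"description";"level";"occured_at";"created_user";"createdAt";"updatedAt"':
    '001677f8-6e03-4ea2-bc5c-442e19477189;"firewall";"activated";"Local activated on lan6";0;'
    '"2019-03-08 13:24:11";"";"2019-03-08 13:24:11";"2019-03-08 13:24:11"',
}

keys = "id,action,module,occured_at".split(",")
filtered = pick_keys(row, keys)
print(filtered)  # no field name matches any key, so nothing is kept
```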

molinto commented 4 years ago

Any ideas, please, @moshe?

molinto commented 4 years ago

Tried with a different dataset:

elasticsearch_loader --index signals --http-auth elastic:PASSWORD csv signals.csv

RESULT: Imports with id column & ts'iface;snr;bar columns


elasticsearch_loader --index signals --http-auth elastic:PASSWORD --keys ts,snr,iface,bars csv signals.csv

RESULT: Imports only the id column


elasticsearch_loader --index-settings-file signals_mapping.json --index signals --http-auth elastic:PASSWORD csv signals.csv

RESULT: Imports with id column & ts'iface;snr;bar column, plus the other fields, but they are empty


signals.csv

"ts";"iface";"snr";"bars"
"2019-03-01 02:16:49";"wwan0";-62.0;5
"2019-03-01 02:25:22";"wwan0";-56.0;5
"2019-03-01 02:26:09";"wwan0";-62.0;5
"2019-03-01 02:36:46";"wwan0";-56.0;5
"2019-03-01 02:36:52";"wwan0";-62.0;5
"2019-03-01 02:37:02";"wwan0";-55.0;5
"2019-03-01 02:37:07";"wwan0";-61.0;5
"2019-03-01 02:44:52";"wwan0";-55.0;5
"2019-03-01 02:44:57";"wwan0";-61.0;5
"2019-03-01 02:46:47";"wwan0";-55.0;5
"2019-03-01 02:46:58";"wwan0";-61.0;5

signal_mapping.json:

{
  "settings" : {
  },
  "mappings": {
    "properties": {
     "@timestamp": {
        "type": "date"
      },
      "snr": {
        "type": "float"
      },
      "iface": {
        "type": "keyword"
      },
      "bars": {
        "type": "byte"
      },
      "ts": {
        "type": "date",
        "format": "yyyy-MM-dd HH:mm:ss"
      }
    }
  }
}
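
As a quick sanity check before loading, the `ts` values can be compared with the date format declared in the mapping. This is only a minimal sketch: it assumes that, for values like these, the mapping's `yyyy-MM-dd HH:mm:ss` pattern corresponds to Python's `%Y-%m-%d %H:%M:%S`, and the `matches_mapping_format` helper is made up for illustration.

```python
from datetime import datetime

# Assumed Python equivalent of the mapping's "yyyy-MM-dd HH:mm:ss" pattern.
TS_FORMAT = "%Y-%m-%d %H:%M:%S"


def matches_mapping_format(value, fmt=TS_FORMAT):
    """Return True if `value` parses with `fmt`, else False."""
    try:
        datetime.strptime(value, fmt)
        return True
    except ValueError:
        return False


print(matches_mapping_format("2019-03-01 02:16:49"))   # value taken from signals.csv
print(matches_mapping_format("2019-03-01T02:16:49Z"))  # ISO-style value, different layout
```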

moshe commented 4 years ago

I see that your data is delimited by ;, so you need to add --delimiter ';'
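
To see why the delimiter matters, here is a minimal sketch using Python's standard csv module on two lines from signals.csv. It assumes the loader parses CSV in a way similar to csv.DictReader, which is consistent with the OrderedDict rows in the error output but is not confirmed in this thread; the `parse` helper is illustrative.

```python
import csv
import io

# Header and first data row copied from signals.csv in this thread.
SAMPLE = '"ts";"iface";"snr";"bars"\n"2019-03-01 02:16:49";"wwan0";-62.0;5\n'


def parse(text, delimiter=","):
    """Parse CSV text into a list of dicts, one per data row."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    return [dict(row) for row in reader]


comma_rows = parse(SAMPLE)                      # default delimiter
semicolon_rows = parse(SAMPLE, delimiter=";")   # delimiter matching the file

# With ',' the whole header line collapses into a single field name.
print(list(comma_rows[0].keys()))
# With ';' each column gets its own field.
print(list(semicolon_rows[0].keys()))
```

With the default delimiter the entire header becomes one field name, which matches the odd combined field seen in the Kibana mapping earlier in the thread.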

molinto commented 4 years ago

Thanks @moshe.

I couldn't find that in the documentation. I tried it and got 'Error: no such option: --delimiter'.

moshe commented 4 years ago

Can you paste the exact command? --delimiter should come AFTER the csv command:

elasticsearch_loader --index ... csv --delimiter ';' cake.csv
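
The ordering rule can be illustrated with a small stand-in written with Python's argparse. The traceback earlier in the thread shows the real tool is built on click, so this is an analogy rather than its actual code. The parser layout is an assumption: --index belongs to the top-level command, while --delimiter belongs to the csv subcommand and is therefore only recognised after it.

```python
import argparse


def build_parser():
    """Top-level options plus a 'csv' subcommand with its own options."""
    parser = argparse.ArgumentParser(prog="loader_sketch")
    parser.add_argument("--index")
    subcommands = parser.add_subparsers(dest="command", required=True)
    csv_parser = subcommands.add_parser("csv")
    csv_parser.add_argument("--delimiter", default=",")
    csv_parser.add_argument("file")
    return parser


parser = build_parser()

# Subcommand option placed after 'csv': accepted.
ok = parser.parse_args(["--index", "signals", "csv", "--delimiter", ";", "signals.csv"])
print(ok.delimiter, ok.file)

# Same option placed before 'csv': the top-level parser does not know it.
try:
    parser.parse_args(["--index", "signals", "--delimiter", ";", "csv", "signals.csv"])
    rejected = False
except SystemExit:
    rejected = True
print(rejected)
```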

molinto commented 4 years ago

This works!

elasticsearch_loader --index-settings-file signal_mapping.json --index signals --http-auth elastic:PASSWORD --keys ts,snr,iface,bars --type _doc csv --delimiter ';' signal.csv

Thank you for your help