elastic / elasticsearch

Free and Open Source, Distributed, RESTful Search Engine
https://www.elastic.co/products/elasticsearch
Other
69.57k stars 24.62k forks source link

MapperParsingException and field mapping conflict all out of the sudden #18917

Closed defkev closed 8 years ago

defkev commented 8 years ago

Elasticsearch version:

{ "name" : "Alpha Ray", "cluster_name" : "elasticsearch", "version" : { "number" : "2.3.3", "build_hash" : "218bdf10790eef486ff2c41a3df5cfa32dadcfde", "build_timestamp" : "2016-05-17T15:40:04Z", "build_snapshot" : false, "lucene_version" : "5.5.0" }, "tagline" : "You Know, for Search" }

JVM version:

openjdk version "1.8.0_91" OpenJDK Runtime Environment (build 1.8.0_91-b14) OpenJDK 64-Bit Server VM (build 25.91-b14, mixed mode)

OS version:

CentOS Linux release 7.2.1511 (Core)

Description of the problem including expected versus actual behavior: One of my syslog inputs from logstash stopped working this morning exactly at 2:00 am Elasticsearch didn't add any new shards for this very type past this point but started throwing a MapperParsingException in elasticsearch.log for every new shard like:

MapperParsingException[failed to parse [timestamp]]; nested: IllegalArgumentException[Invalid format: "Jun 16 02:00:28"];
        at org.elasticsearch.index.mapper.FieldMapper.parse(FieldMapper.java:329)
        at org.elasticsearch.index.mapper.DocumentParser.parseObjectOrField(DocumentParser.java:309)
        at org.elasticsearch.index.mapper.DocumentParser.parseAndMergeUpdate(DocumentParser.java:738)
        at org.elasticsearch.index.mapper.DocumentParser.parseDynamicValue(DocumentParser.java:625)
        at org.elasticsearch.index.mapper.DocumentParser.parseValue(DocumentParser.java:442)
        at org.elasticsearch.index.mapper.DocumentParser.parseObject(DocumentParser.java:262)
        at org.elasticsearch.index.mapper.DocumentParser.parseDocument(DocumentParser.java:122)
        at org.elasticsearch.index.mapper.DocumentMapper.parse(DocumentMapper.java:309)
        at org.elasticsearch.index.shard.IndexShard.prepareCreate(IndexShard.java:529)
        at org.elasticsearch.index.shard.IndexShard.prepareCreateOnPrimary(IndexShard.java:506)
        at org.elasticsearch.action.index.TransportIndexAction.prepareIndexOperationOnPrimary(TransportIndexAction.java:215)
        at org.elasticsearch.action.index.TransportIndexAction.executeIndexRequestOnPrimary(TransportIndexAction.java:224)
        at org.elasticsearch.action.bulk.TransportShardBulkAction.shardIndexOperation(TransportShardBulkAction.java:326)
        at org.elasticsearch.action.bulk.TransportShardBulkAction.shardOperationOnPrimary(TransportShardBulkAction.java:119)
        at org.elasticsearch.action.bulk.TransportShardBulkAction.shardOperationOnPrimary(TransportShardBulkAction.java:68)
        at org.elasticsearch.action.support.replication.TransportReplicationAction$PrimaryPhase.doRun(TransportReplicationAction.java:639)
        at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
        at org.elasticsearch.action.support.replication.TransportReplicationAction$PrimaryOperationTransportHandler.messageReceived(TransportReplicationAction.java:279)
        at org.elasticsearch.action.support.replication.TransportReplicationAction$PrimaryOperationTransportHandler.messageReceived(TransportReplicationAction.java:271)
        at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:75)
        at org.elasticsearch.transport.TransportService$4.doRun(TransportService.java:376)
        at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.IllegalArgumentException: Invalid format: "Jun 16 02:00:28"
        at org.joda.time.format.DateTimeParserBucket.doParseMillis(DateTimeParserBucket.java:187)
        at org.joda.time.format.DateTimeFormatter.parseMillis(DateTimeFormatter.java:780)
        at org.elasticsearch.index.mapper.core.DateFieldMapper$DateFieldType.parseStringValue(DateFieldMapper.java:362)
        at org.elasticsearch.index.mapper.core.DateFieldMapper.innerParseCreateField(DateFieldMapper.java:528)
        at org.elasticsearch.index.mapper.core.NumberFieldMapper.parseCreateField(NumberFieldMapper.java:241)
        at org.elasticsearch.index.mapper.FieldMapper.parse(FieldMapper.java:321)
        ... 24 more

Note that this is the timestamp field from syslog, not the @timestamp added by elasticsearch. Furthermore the field in question now shows a mapping conflict in kibana, it was previously mapped as a string and the last shard created was at 1:59:36 am

Now as i see it elasticsearch is suddenly trying to parse the said timestamp field as an actual timestamp instead of a string (which cannot be parsed as such since it is missing the year part i suppose) which in return is creating the mapping conflict.

The setup has been running for a while by now and nothing on the configuration has been changed recently, hence i am curious what has caused elasticsearch to thread "Jun 16 01:59:36" as a string and "Jun 16 02:00:28" as a timestamp.

defkev commented 8 years ago

Never mind. After some further digging it looks this is to blame on one of my log sources which is sending its json formatted message with a timestamp (instead of a @timestamp) field populated in ISO8601

2016-06-16T14:59:41.828214+0200

If such a message is the first to hit logstash after elasticsearch creates a new index at 2:00 am the timestamp field will of course get mapped as a date instead of a string. This in return is creating the MapperParsingException and field mapping conflict once one of my other log sources, which are all either not sending a timestamp field in their json formated messages (or are sending a @timestamp in ISO8601) causing elastic to use the syslog time for the timestamp field which is not in ISO8601 but syslogs rfc3164 default format

Jun 16 01:59:36

tl;dr: Mapping a date as a string is not a problem, mapping a string as a date is.

Guess i was just lucky in the past this never happened before during the daily index rotate :hankey: