Graylog2 / graylog2-server

Free and open log management
https://www.graylog.org

MapperParsingException #2645

Closed JulioQc closed 8 years ago

JulioQc commented 8 years ago

Expected Behavior

The field is parsed as a string rather than a number.

Current Behavior

Graylog treats the string as a number, and the whole system is broken afterwards. MapperParsingException[failed to parse [ThreadID]]; nested: NumberFormatException[For input string: "0D30"];

Possible Solution

The type should be correctly recognized as a string, and this kind of error should not corrupt the journal so badly. Workaround: force the type in the GROK pattern for the problematic extractor: {WINDNS_THREADID:ThreadID;string}

Steps to Reproduce (for bugs)

  1. Install and set up this content pack: https://marketplace.graylog.org/addons/b11664ca-1cf5-40bb-a3c1-d230ae9d950d
  2. Wait for the input to catch a non-numerical DNS ThreadID field.
  3. The disk journal fails and everything stays broken until the journal is cleared and the deflector is cycled.

Context

The GROK pattern for the DNS ThreadID is this: [a-zA-Z0-9]{4}. From my understanding, Graylog tried to store the 0D30 value as a number rather than a string. I am reporting this as a bug because it doesn't make sense for the system to see it as a number (mistakenly seen as hex, maybe?).

[2016-08-09 09:11:00,702][DEBUG][action.bulk              ] [Sabreclaw] [graylog_31][0] failed to execute bulk item (index) index {[graylog_deflector][message][d55cbcb2-5e2e-11e6-9010-005056ae6767], source[{"InternalID":"00000000046C9D30","SndRcv":"Rcv","FileName":"C:\\Windows\\Sysnative\\dns\\dns.log","Opcode":"Q","Time":"8/9/2016 4:43:20 AM","source":"dc2014-01-m.domain.ca","gl2_source_input":"578e59ad0ae2f10b113861de","Name":".bloomberg-12-m.domain.ca.","gl2_source_node":"b0b6b61e-aaab-42a2-af6e-2aabfdb57370","Protocol":"UDP","timestamp":"2016-08-09 08:43:26.000","Context":"PACKET","gl2_source_collector":"ae1187a3-48ae-42bc-a820-7033d7438dbd","SourceModuleType":"im_file","level":6,"IP":"192.168.20.142","streams":[],"message":"8/9/2016 4:43:20 AM 0D30 PACKET  00000000046C9D30 UDP Rcv 192.168.20.142  0eec   Q [0001   D   NOERROR] AAAA   (14)bloomberg-12-m(8)domain(2)ca(0)","EventReceivedTime":"2016-08-09 04:43:26","FlagsHex":"0001","SourceModuleName":"578e3f830ae2f10b113845fe","Response":"NOERROR","XID":"0eec","FlagsChar":"D","ThreadID":"0D30","QType":"AAAA"}]}
2016-08-09_13:11:01.22674 MapperParsingException[failed to parse [ThreadID]]; nested: NumberFormatException[For input string: "0D30"];
2016-08-09_13:11:01.22815       at org.elasticsearch.index.mapper.FieldMapper.parse(FieldMapper.java:329)
2016-08-09_13:11:01.22943       at org.elasticsearch.index.mapper.DocumentParser.parseObjectOrField(DocumentParser.java:309)
2016-08-09_13:11:01.23077       at org.elasticsearch.index.mapper.DocumentParser.parseValue(DocumentParser.java:436)
2016-08-09_13:11:01.23219       at org.elasticsearch.index.mapper.DocumentParser.parseObject(DocumentParser.java:262)
2016-08-09_13:11:01.23351       at org.elasticsearch.index.mapper.DocumentParser.parseDocument(DocumentParser.java:122)
2016-08-09_13:11:01.23593       at org.elasticsearch.index.mapper.DocumentMapper.parse(DocumentMapper.java:309)
2016-08-09_13:11:01.23791       at org.elasticsearch.index.shard.IndexShard.prepareIndex(IndexShard.java:580)
2016-08-09_13:11:01.23850       at org.elasticsearch.index.shard.IndexShard.prepareIndexOnPrimary(IndexShard.java:559)
2016-08-09_13:11:01.23965       at org.elasticsearch.action.index.TransportIndexAction.prepareIndexOperationOnPrimary(TransportIndexAction.java:212)
2016-08-09_13:11:01.24144       at org.elasticsearch.action.index.TransportIndexAction.executeIndexRequestOnPrimary(TransportIndexAction.java:224)
2016-08-09_13:11:01.24228       at org.elasticsearch.action.bulk.TransportShardBulkAction.shardIndexOperation(TransportShardBulkAction.java:326)
2016-08-09_13:11:01.24398       at org.elasticsearch.action.bulk.TransportShardBulkAction.shardOperationOnPrimary(TransportShardBulkAction.java:119)
2016-08-09_13:11:01.24540       at org.elasticsearch.action.bulk.TransportShardBulkAction.shardOperationOnPrimary(TransportShardBulkAction.java:68)
2016-08-09_13:11:01.24692       at org.elasticsearch.action.support.replication.TransportReplicationAction$PrimaryPhase.doRun(TransportReplicationAction.java:639)
2016-08-09_13:11:01.24774       at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
2016-08-09_13:11:01.24982       at org.elasticsearch.action.support.replication.TransportReplicationAction$PrimaryOperationTransportHandler.messageReceived(TransportReplicationAction.java:279)
2016-08-09_13:11:01.25025       at org.elasticsearch.action.support.replication.TransportReplicationAction$PrimaryOperationTransportHandler.messageReceived(TransportReplicationAction.java:271)
2016-08-09_13:11:01.25147       at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:75)
2016-08-09_13:11:01.25363       at org.elasticsearch.transport.TransportService$4.doRun(TransportService.java:376)
2016-08-09_13:11:01.25440       at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
2016-08-09_13:11:01.25613       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
2016-08-09_13:11:01.25720       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
2016-08-09_13:11:01.25877       at java.lang.Thread.run(Thread.java:745)
2016-08-09_13:11:01.26142 Caused by: java.lang.NumberFormatException: For input string: "0D30"
2016-08-09_13:11:01.26210       at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
2016-08-09_13:11:01.26314       at java.lang.Long.parseLong(Long.java:589)
2016-08-09_13:11:01.26521       at java.lang.Long.parseLong(Long.java:631)
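
The trace ends in Long.parseLong, which suggests the index already had ThreadID mapped as a long when "0D30" arrived. The following is a toy Python sketch of that "first value fixes the type" behavior, not Elasticsearch or Graylog code. `ToyIndex` and `extract` are hypothetical names, and the assumption that an all-digit token reaches the index as a number is mine, made only to illustrate the scenario.

```python
import re

# Pattern quoted in the issue for the DNS ThreadID field.
THREAD_ID = re.compile(r"[a-zA-Z0-9]{4}")


def extract(token, force_string=False):
    """Hypothetical extractor step (assumption): all-digit tokens become
    integers unless the caller forces the string type."""
    if not THREAD_ID.fullmatch(token):
        raise ValueError(f"not a ThreadID: {token!r}")
    if not force_string and token.isdigit():
        return int(token)
    return token


class ToyIndex:
    """Toy stand-in for dynamic mapping: the first value seen for a field
    decides its type, and later values must parse as that type."""

    def __init__(self):
        self.mapping = {}

    def index(self, field, value):
        if field not in self.mapping:
            self.mapping[field] = "long" if isinstance(value, int) else "string"
        if self.mapping[field] == "long":
            # Analogous to Long.parseLong in the trace: fails on "0D30".
            return int(value)
        return str(value)


if __name__ == "__main__":
    idx = ToyIndex()
    idx.index("ThreadID", extract("1234"))      # field is now mapped as long
    try:
        idx.index("ThreadID", extract("0D30"))  # alphanumeric value arrives later
    except ValueError as err:
        print("indexing failed:", err)
```

Calling `extract(token, force_string=True)` from the first message onward keeps the toy mapping at "string", which mirrors the intent of forcing the type in the GROK pattern.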

Your Environment

joschi commented 8 years ago

@JulioQc The problem is the dynamic mapping used by Elasticsearch if no explicit mapping is available.

See http://docs.graylog.org/en/2.0/pages/configuration/elasticsearch.html#custom-index-mappings for details.

JulioQc commented 8 years ago

Sure, it is related to that, but I don't get why this would happen here. Dynamic settings are set as strings (as I understand it), and the problematic field is not conflicting.

joschi commented 8 years ago

@JulioQc If the ThreadID field initially contained a numeric value, the Elasticsearch dynamic mapping expects it to stay this way.

Create a custom index mapping for your fields up-front to circumvent this problem.
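
As a rough illustration of such a mapping, the Python sketch below builds an index template body that pins ThreadID to a string type. It assumes the Elasticsearch 2.x-era template syntax used with Graylog 2.0, the `graylog_*` index pattern (the log above shows an index named graylog_31), and the `message` document type seen in the log. The function name is hypothetical, and the script only prints the JSON; it does not contact Elasticsearch.

```python
import json


def thread_id_template(index_pattern="graylog_*", field="ThreadID"):
    """Build a template body that maps `field` as a non-analyzed string.

    Assumptions: ES 2.x template syntax, "message" as the document type,
    and an index pattern matching the Graylog indices.
    """
    return {
        "template": index_pattern,
        "mappings": {
            "message": {
                "properties": {
                    field: {"type": "string", "index": "not_analyzed"}
                }
            }
        },
    }


if __name__ == "__main__":
    # Print the body so it can be saved to a file and uploaded separately.
    print(json.dumps(thread_id_template(), indent=2))
```

The resulting JSON would be sent to the Elasticsearch `_template` endpoint under a name of your choosing. A template only applies to indices created after it is installed, so the deflector still has to be cycled before the new mapping takes effect.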

JulioQc commented 8 years ago

Ok! Crystal clear! Forcing the datatype in the Grok pattern seems to do it for now (plus, of course, cycling the deflector). I'll look into custom index mappings since this is by design, but that seems like a lot of work with the number of inputs we have... Thanks for the guidance!

reighnman commented 8 years ago

This issue is specifically addressed in the readme for the content pack:

https://marketplace.graylog.org/addons/b11664ca-1cf5-40bb-a3c1-d230ae9d950d

"Create an ES template to force the ThreadID field type to "String", otherwise ES may dynamically map the field type as INT which would cause indexing errors later on when an alphanumeric ThreadID comes around."

I have also included an example template to use with your ES indices.

Thanks for filling in @joschi :+1:

JulioQc commented 8 years ago

Yes, I notified them of the scenario

Julien L.
