vectordotdev / vector

A high-performance observability data pipeline.
https://vector.dev

Logdna Error Parsing Logs: Line parsing failed for this line: [_meta.image] not object. #3149

Open surajthakur opened 4 years ago

surajthakur commented 4 years ago

I am getting an error in LogDNA when I send logs through Vector.

I am running Vector as a system job on Nomad (an orchestration tool); it collects logs from specific containers, selected by label, on each host.

The logs are sent to LogDNA, but LogDNA does not parse them (possibly due to an invalid format or content). After running for a while, Vector also crashes, reporting the parsing error.

Error on LogDNA:

(screenshot of the LogDNA error attached)

Error reported by Vector:

Jul 21 23:54:09.695 DEBUG tokio_reactor::registration: scheduling Read for: 5
Jul 21 23:54:09.695 DEBUG hyper::proto::h1::io: flushed 3149 bytes
Jul 21 23:54:09.706 DEBUG hyper::proto::h1::io: read 0 bytes
Jul 21 23:54:09.706 DEBUG hyper::proto::h1::conn: parse error (connection closed before message completed) with 0 bytes
Jul 21 23:54:09.706 DEBUG hyper::proto::h1::dispatch: read_head error: connection closed before message completed
Jul 21 23:54:09.707 DEBUG tokio_reactor: dropping I/O source: 5
Jul 21 23:54:09.707 ERROR sink{name=my_sink_id_2 type=logdna}:request{request_id=1}: vector::sinks::util::retries: encountered non-retriable error. error=connection closed before message completed
Jul 21 23:54:09.707 ERROR sink{name=my_sink_id_2 type=logdna}: vector::sinks::util::sink: Request failed. error=connection closed before message completed
Jul 21 23:54:09.707 ERROR sink{name=my_sink_id_2 type=logdna}: vector::sinks::logdna: Fatal logdna sink error: connection closed before message completed
Jul 21 23:54:09.713 ERROR vector::topology: Unhandled error
Jul 21 23:54:09.713 DEBUG tokio_reactor::registration: scheduling Read for: 3
Jul 21 23:54:09.713 DEBUG tokio_reactor::registration: scheduling Read for: 2
Jul 21 23:54:09.713 DEBUG tokio_reactor::registration: scheduling Read for: 1
Jul 21 23:54:09.713 DEBUG tokio_reactor::registration: scheduling Read for: 0
Jul 21 23:54:09.713 INFO vector: Shutting down

Vector config:

data_dir = "/var/lib/vector"

[sources.my_source_id]
  type = "docker"
  #auto_partial_merge = true # optional, default
  include_labels = ["logdna=true"] # optional, no default
  #partial_event_marker_field = "_partial" # optional, default

[transforms.my_transform_id]
  # General
  type = "add_fields" # required
  inputs = ["my_source_id"] # required
  fields.app = "{{ label.job }}"

[sinks.my_sink_id_2]
  type = "logdna" # required
  inputs = ["my_transform_id"] # required
  api_key = "1234567890" # required
  healthcheck = true # optional, default
  hostname = "local" # required
  tags = ["staging"]

ktff commented 4 years ago

but LogDNA does not parse it

This is probably the main issue.

@surajthakur could you upload a higher-resolution picture of the LogDNA error, or just retype it here? This one is too small to make out what it says.

wang502 commented 4 years ago

I think it's very likely due to an inconsistent value type for the _meta.image field. Per LogDNA's HTTP API spec (https://docs.logdna.com/reference#logsingest):

If inconsistent value types are used, that line's metadata will not be parsed. For example, if a line is passed with a meta object, such as meta.myfield of type String, any subsequent lines with meta.myfield must have a String as the value type for meta.myfield.

That would explain why some lines are parsed properly by LogDNA and some are not: the value type of "_meta.image" has changed somehow.
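
To make that concrete, here is a minimal sketch of the kind of flip the spec forbids, written as TOML for readability (LogDNA actually receives JSON). The object form of image in the first line is invented purely for illustration:

# Hypothetical sketch: the same meta.image field sent with two different value types.
[line_1.meta]
  image = { name = "haproxy", tag = "2.1.4" }  # first line: meta.image is an object

[line_2.meta]
  image = "myrepo/test:haproxy_r2.1.4_0.0.2"   # later line: meta.image is a string,
                                               # so this line's metadata is not parsed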

binarylogic commented 4 years ago

Thanks @wang502, that makes sense. We are exploring ways for Vector to solve this problem, such as keeping track of field types and attempting to coerce them before transmission (e.g. 2 -> "2"). It's likely we'll solve this manually with read/write schemas before we attempt that.
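
For anyone hitting this in the meantime, the manual route amounts to adding a type-coercion step in front of the sink. This is only a sketch, assuming Vector's coercer transform and that image arrives as a top-level field on docker source events; it only helps if the type flip originates on the Vector side, and the transform name coerce_types is made up here:

[transforms.coerce_types]         # hypothetical name for the extra step
  type = "coercer"                # assumes Vector's coercer transform
  inputs = ["my_transform_id"]
  types.image = "string"          # pin the image field (sent as meta.image) to a string

# then point the logdna sink at it instead:
#   inputs = ["coerce_types"]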

wang502 commented 4 years ago

Why did the type change happen in the first place? The format of Docker logs should already be consistent.

binarylogic commented 4 years ago

The image field itself hasn't changed types to my knowledge. Do you have an example payload where it is not a string? That would help us understand what's going on.

surajthakur commented 4 years ago

but LogDNA does not parse it

This is probably the main issue.

@surajthakur could you upload a higher-resolution picture of the LogDNA error, or just retype it here? This one is too small to make out what it says.

@ktff Please find the attached image.

These are logs from HAProxy, but I was getting the same issue with other applications as well.

I don't understand why LogDNA is parsing it as META and not JSON. I have many other logs that come from a file, and those are treated as JSON.

surajthakur commented 4 years ago

I have tried printing out the body of the request Vector sends to LogDNA; it looks like this:

{
   "app":"haproxy-e2353fe",
   "line":"172.17.0.1:34186 [23/Jul/2020:11:32:57.953] lb_01 lb_01/<NOSRV> -1/-1/-1/-1/0 503 221 - - SC-- 1/1/0/0/0 0/0 \"GET / HTTP/1.1\"",
   "meta":{
      "container_created_at":"2020-07-23T01:31:34.602836767Z",
      "container_id":"a7b43c444f8e1b9954f3a8d5ebccfd9f419fedcdf6e0c7ecb3d529f38a7952a0",
      "container_name":"haproxy-e2353fe",
      "image":"myrepo/test:haproxy_r2.1.4_0.0.2",
      "label":{
         "com":{
            "hashicorp":{
               "nomad":{
                  "alloc_id":"e2353fe"
               }
            }
         },
         "job":"haproxy",
         "logdna":"true"
      },
      "source_type":"docker",
      "stream":"stdout"
   },
   "timestamp":"2020-07-23T01:32:57.954784321Z"
}

ktff commented 4 years ago

Thanks @surajthakur, this fits the issue @wang502 mentioned.