logstash-plugins / logstash-codec-cloudfront

This codec may be used to decode (via inputs) CloudFront gzipped files
Apache License 2.0

Error: Object: #Version: 1.0 is not a legal argument to this wrapper, cause it doesn't respond to "read". #2

Open · muddman opened this issue 9 years ago

muddman commented 9 years ago

Logstash version 1.5.0.1

I am trying to use the Logstash S3 input plugin to download CloudFront logs and the cloudfront codec plugin to decode the stream.

I installed the cloudfront codec with bin/plugin install logstash-codec-cloudfront.

I am getting the following: Error: Object: #Version: 1.0 is not a legal argument to this wrapper, cause it doesn't respond to "read".

Here is the full error message from /var/logs/logstash/logstash.log (redacted)

 {:timestamp=>"2015-08-05T13:35:20.809000-0400", :message=>"A plugin had an unrecoverable error. Will restart this plugin.\n  Plugin: <LogStash::Inputs::S3 bucket=>\"[BUCKETNAME]\", prefix=>\"cloudfront/\", region=>\"us-east-1\", type=>\"cloudfront\", secret_access_key=>\"[SECRETKEY]/1\", access_key_id=>\"[KEYID]\", sincedb_path=>\"/opt/logstash_input/s3/cloudfront/sincedb\", backup_to_dir=>\"/opt/logstash_input/s3/cloudfront/backup\", temporary_directory=>\"/var/lib/logstash/logstash\">\n  Error: Object: #Version: 1.0\n is not a legal argument to this wrapper, cause it doesn't respond to \"read\".", :level=>:error}

My logstash config file: /etc/logstash/conf.d/cloudfront.conf (redacted)

input {
  s3 {
    bucket => "[BUCKETNAME]"
    delete => false
    interval => 60 # seconds
    prefix => "cloudfront/"
    region => "us-east-1"
    type => "cloudfront"
    codec => "cloudfront"
    secret_access_key => "[SECRETKEY]"
    access_key_id => "[KEYID]"
    sincedb_path => "/opt/logstash_input/s3/cloudfront/sincedb"
    backup_to_dir => "/opt/logstash_input/s3/cloudfront/backup"
    use_ssl => true
  }
}

CloudFront log file from S3 (I've only included the header from the file):

 #Version: 1.0
 #Fields: date time x-edge-location sc-bytes c-ip cs-method cs(Host) cs-uri-stem sc-status cs(Referer) cs(User-Agent) cs-uri-query cs(Cookie) x-edge-result-type x-edge-request-id x-host-header cs-protocol cs-bytes time-taken x-forwarded-for ssl-protocol ssl-cipher x-edge-response-result-type

Any ideas? Thanks!!

tomferreira commented 8 years ago

+1

jcoaio commented 8 years ago

+1

DiegoZurita commented 8 years ago

+1

davidhiebert commented 8 years ago

I managed to get this working. It turns out the "cloudfront" codec should be configured on the OUTPUT, not the INPUT. The input codec should be "plain".
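In outline, something like this (a sketch only; the bucket details and the stdout output are placeholders):

input {
  s3 {
    bucket => "[BUCKETNAME]"
    prefix => "cloudfront/"
    region => "us-east-1"
    codec => "plain"         # plain codec on the INPUT
  }
}

output {
  stdout {
    codec => "cloudfront"    # cloudfront codec on the OUTPUT
  }
}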

For reference, my plugin crashing error was:

{:timestamp=>"2015-12-01T23:53:07.005000+0000", :message=>"A plugin had an unrecoverable error. Will restart this plugin.\n Plugin: \"YYY\", secret_access_key=>\"XXX\", bucket=>\"causely-logs\", prefix=>\"cdn-stage-admin\", add_field=>{\"Environment\"=>\"stage\", \"appname\"=>\"admin\"}, type=>\"causely-admin\", codec=>\"UTF-8\">, debug=>false, region=>\"us-east-1\", use_ssl=>true, delete=>false, interval=>60, temporary_directory=>\"/var/lib/logstash/logstash\">\n Error: Object: #Version: 1.0\n is not a legal argument to this wrapper, cause it doesn't respond to \"read\".", :level=>:error}

kepstein commented 8 years ago

+1

muddman commented 8 years ago

@davidhiebert -- how did you configure it on the output? Can you give us an example from your conf file? thanks!

I was able to get cloudfront logs into logstash but I'm not using the codec and this seems like a complete hack:

input {
  s3 {
    bucket => "[BUCKET NAME]"
    delete => false
    interval => 60 # seconds
    prefix => "CloudFront/"
    region => "us-east-1"
    type => "cloudfront"
    codec => "plain"
    secret_access_key => "[SECRETKEY]"
    access_key_id => "[KEYID]"
    sincedb_path => "/opt/logstash_input/s3/cloudfront/sincedb"
    backup_to_dir => "/opt/logstash_input/s3/cloudfront/backup"
    use_ssl => true
  }
}

filter {
        if [type] == "cloudfront" {
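                # CloudFront log files begin with #Version and #Fields header lines; drop them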
                if ( ("#Version: 1.0" in [message]) or ("#Fields: date" in [message])) {
                        drop {}
                }

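                # One capture per tab-separated column, following the #Fields header above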
                grok {
                        match => { "message" => "%{DATE_EU:date}\t%{TIME:time}\t%{WORD:x_edge_location}\t(?:%{NUMBER:sc_bytes}|-)\t%{IPORHOST:c_ip}\t%{WORD:cs_method}\t%{HOSTNAME:cs_host}\t%{NOTSPACE:cs_uri_stem}\t%{NUMBER:sc_status}\t%{GREEDYDATA:referrer}\t%{GREEDYDATA:User_Agent}\t%{GREEDYDATA:cs_uri_query}\t%{GREEDYDATA:cookies}\t%{WORD:x_edge_result_type}\t%{NOTSPACE:x_edge_request_id}\t%{HOSTNAME:x_host_header}\t%{URIPROTO:cs_protocol}\t%{INT:cs_bytes}\t%{GREEDYDATA:time_taken}\t%{GREEDYDATA:x_forwarded_for}\t%{GREEDYDATA:ssl_protocol}\t%{GREEDYDATA:ssl_cipher}\t%{GREEDYDATA:x_edge_response_result_type}" }
                }

                mutate {
                        add_field => [ "received_at", "%{@timestamp}" ]
                        add_field => [ "listener_timestamp", "%{date} %{time}" ]
                }

                date {
                        match => [ "listener_timestamp", "yy-MM-dd HH:mm:ss" ]
                }

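                # CloudFront log timestamps are UTC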
                date {
                        locale => "en"
                        timezone => "UTC"
                        match => [ "listener_timestamp", "yy-MM-dd HH:mm:ss" ]
                        target => "@timestamp"
                        add_field => { "debug" => "timestampMatched"}
                }
        }
}

florentb commented 8 years ago

Thank you @muddman. You saved my day!

keshavkaul commented 8 years ago

+1 Still not fixed

jrgns commented 8 years ago

@keshavkaul and @kepstein Do you see this error when using the codec on the input or the output?

jrgns commented 8 years ago

The plugin assumes that the incoming data is gzipped; when it's handed already-decompressed text instead, the gzip wrapper rejects an argument that doesn't respond to "read", which is the error you're seeing. This should be fixable.

jrgns commented 8 years ago

I'll merge the proposed fix #3 as soon as anyone experiencing the error can confirm that it solves their problem.

jrgns commented 8 years ago

Ok, here's a full writeup of what's happening.

The error occurs because the codec is receiving a decompressed stream instead of a gzipped one. Specifically, in @muddman's case, it's because the S3 input automatically decompresses gzipped files.

This codec does two things:

  1. Decompresses the gzipped files
  2. Adds the CloudFront metadata: version and extracted fields

If you're not worried about number 2, just use the S3 input with the plain codec.
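A minimal sketch of that first option (the bucket name is a placeholder; it's essentially @muddman's config above, minus the grok filter):

input {
  s3 {
    bucket => "[BUCKETNAME]"
    prefix => "cloudfront/"
    region => "us-east-1"
    type => "cloudfront"
    codec => "plain"   # the S3 input already gunzips the files
  }
}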

If you are worried about number 2, please check if #3 solves your problem.

keshavkaul commented 8 years ago

@jrgns I was using this codec with a file input. I initially tried this setup with gzipped files, but saw an error. Then I unzipped the files and saw this error. I'll try it again next week.

BTW, @davidhiebert, thanks for the initial push :+1:

luizcab commented 7 years ago

Does this codec work at all?

Sorry if I sound dumb, but I just couldn't build a config file that reads a CloudFront gzipped file and outputs JSON with field_name: value pairs.

Cheers!

saez0pub commented 2 years ago

Seriously, it's broken. I can't use this codec on the s3 input.

soumyajk commented 1 year ago

+1

wangjia007bond commented 1 year ago

I can't use this codec on the s3 input