logstash-plugins / logstash-input-rss

RSS input for Logstash
Apache License 2.0

rss input is creating objects that can't be serialized #21

Open rashmivkulkarni opened 8 years ago

rashmivkulkarni commented 8 years ago
input {
  rss {
    url => "http://stackoverflow.com/feeds/tag/elasticsearch+or+logstash+or+beats+or+kibana"
    interval => 3600
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    user => "elastic"
    password => "changeme"
  }
  stdout { }
}

1) When I run Logstash with the above config file, I get the error shown in the stack trace below. This seems to be a problem specific to the rss input.

An unknown error occurred sending a bulk request to Elasticsearch. We will retry indefinitely {:error_message=>"Failed to load class 'org.joda.time.DateTime$Access4JacksonDeserializer6e1d0360': com.fasterxml.jackson.module.afterburner.ser.BeanPropertyAccessor", :error_class=>"LogStash::Json::GeneratorError", :backtrace=>["/Users/rashmikulkarni1/Documents/ALPHA4FINAL/logstash/logstash-5.0.0-alpha4/logstash-core/lib/logstash/json.rb:52:in `jruby_dump'", "/Users/rashmikulkarni1/Documents/ALPHA4FINAL/logstash/logstash-5.0.0-alpha4/vendor/bundle/jruby/1.9/gems/logstash-output-elasticsearch-4.1.1-java/lib/logstash/outputs/elasticsearch/http_client.rb:49:in `bulk'", "org/jruby/RubyArray.java:1613:in `each'", "org/jruby/RubyEnumerable.java:852:in `inject'", "/Users/rashmikulkarni1/Documents/ALPHA4FINAL/logstash/logstash-5.0.0-alpha4/vendor/bundle/jruby/1.9/gems/logstash-output-elasticsearch-4.1.1-java/lib/logstash/outputs/elasticsearch/http_client.rb:38:in `bulk'", "/Users/rashmikulkarni1/Documents/ALPHA4FINAL/logstash/logstash-5.0.0-alpha4/vendor/bundle/jruby/1.9/gems/logstash-output-elasticsearch-4.1.1-java/lib/logstash/outputs/elasticsearch/common.rb:181:in `safe_bulk'", "/Users/rashmikulkarni1/Documents/ALPHA4FINAL/logstash/logstash-5.0.0-alpha4/vendor/bundle/jruby/1.9/gems/logstash-output-elasticsearch-4.1.1-java/lib/logstash/outputs/elasticsearch/common.rb:105:in `submit'",

colinsurprenant commented 8 years ago

This kind of problem typically happens because the plugin, which relies on an external library (rss), sets a "rogue" object in the Event. By convention, only strings and numeric values (and the Logstash Timestamp) are valid, but this is not currently validated or enforced when values are set. The rogue object may therefore survive its trip through the pipeline until it has to be serialized, at which point Logstash cannot figure out how to serialize it. In this case it is an org.joda.time.DateTime.

Looking quickly at the code, I am pretty sure the problem lies here and/or here - these values are probably some kind of date object. This is where I'd add a test harness to validate the type of the object that the rss library returns and that gets set into the Event.
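A minimal sketch of that kind of type check, in plain Ruby with no Logstash dependency. The names `SAFE_TYPES`, `rogue_fields` and `sanitize_value` are hypothetical and not part of the plugin or of Logstash. The allow-list is an assumption based on the convention described above, and converting date-like objects to ISO 8601 strings is only one possible fix. Nested arrays and hashes are not handled here.

```ruby
require "time" # adds Time#iso8601
require "date" # Date and DateTime

# Assumed allow-list of "plain" value types (hypothetical, for illustration only).
SAFE_TYPES = [String, Numeric, TrueClass, FalseClass, NilClass].freeze

# Hypothetical helper: returns the names of fields whose values are not plain types.
def rogue_fields(fields)
  fields.reject { |_name, value| SAFE_TYPES.any? { |type| value.is_a?(type) } }.keys
end

# Hypothetical helper: turns date-like objects into ISO 8601 strings and
# falls back to to_s for anything else that is not a plain type.
def sanitize_value(value)
  case value
  when *SAFE_TYPES then value
  when Time        then value.getutc.iso8601
  when DateTime    then value.to_time.getutc.iso8601 # DateTime must be checked before Date
  when Date        then value.iso8601
  else value.to_s
  end
end

# Example: a feed item as it might look just before being copied into an event.
item = { "title" => "Some question", "published" => Time.utc(2016, 7, 1, 12, 0, 0) }
puts rogue_fields(item).inspect
puts sanitize_value(item["published"])
```

In a plugin spec, a check like `rogue_fields` could be run against every value the rss library returns, so that an unexpected date object fails the test instead of failing later at serialization time.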

Note that in the future, these kinds of problems will surface immediately at the input plugin where the "bad" object is set, not at some later pipeline stage. This also means that such problems will be caught by any basic plugin test/spec and/or by manual testing during development.