This is a question about how to use Logstash, or an issue for the Logstash webhdfs output plugin (if the feature is missing). It's not about this gem. Please see the other repository or the discussion forums.
Hi,
We have a use case where we want to read from Elasticsearch and write into HDFS. For this we are using the WebHdfs output plugin in Logstash. Below is our Logstash config for reference.
input {
  elasticsearch {
    hosts => "192.168.0.3"
    index => "test"
    query => '{"query": {"term": {"Name": "test"}}}'
    size => 500
    scroll => "5m"
  }
}
output {
  webhdfs {
    host => "192.168.0.2"               # (required)
    port => 50070                       # (required)
    path => "/user/logstash/test1"      # (required)
    user => "hdfs"                      # (required)
    flush_size => 500
    idle_flush_time => 10
    retry_interval => 10
    codec => json
  }
}
This is working fine for us. Now we have a requirement to write the output to HDFS in Parquet/Avro format.
Is there any config parameter by which we can write data to HDFS in Avro or Parquet format?
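For Avro, one possible approach (not confirmed in this thread) is to swap the `codec => json` in the webhdfs output for the Avro codec from the separate logstash-codec-avro plugin, which serializes events against a schema you supply. A minimal sketch, assuming that plugin is installed and that a matching Avro schema exists at the hypothetical path shown:

```
output {
  webhdfs {
    host => "192.168.0.2"
    port => 50070
    path => "/user/logstash/test1"
    user => "hdfs"
    # Assumes logstash-codec-avro is installed and the schema file exists;
    # both the codec option name and the schema path are illustrative here.
    codec => avro {
      schema_uri => "/path/to/schema.avsc"
    }
  }
}
```

Parquet is a different case: it is a columnar format that cannot be produced by a simple per-event codec, so writing Parquet typically requires a separate conversion step downstream of Logstash (e.g. a batch job on the cluster) rather than a webhdfs config parameter.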