logstash-plugins / logstash-output-s3

Apache License 2.0

After the file is uploaded to s3, the content is wrong, is it an encoding problem? #242

Open shaojielinux opened 2 years ago

shaojielinux commented 2 years ago

After the file is uploaded to S3, the content is wrong. Is it an encoding problem? This is my configuration:

```
input {
  kafka {
    bootstrap_servers => "10.88.14.172:9092,10.88.6.9:9092,10.88.10.166:9092"
    topics => ["k8s"]
    group_id => "k8s1"
    topics => ["fluent-log"]
    topics_pattern => ".*"
    group_id => "k8s2"
    key_deserializer_class => "org.apache.kafka.common.serialization.ByteArrayDeserializer"
    value_deserializer_class => "org.apache.kafka.common.serialization.ByteArrayDeserializer"
    consumer_threads => 4
    decorate_events => true
    auto_offset_reset => "latest"
    codec => "json"
  }
}

output {
  s3 {
    access_key_id => ""
    secret_access_key => ""
    region => ""
    bucket => ""
    prefix => "%{[container_name]}/%{+YYYY}/%{+MM}/%{+dd}"
    temporary_directory => "/tmp/logstash_s3"
    encoding => "gzip"
    codec => "plain"
    rotation_strategy => "time"
    time_file => 5
  }
}
```

S3 file content:

```
2021-12-16T03:59:36.096Z %{host} %{message}2021-12-16T03:59:36.095Z %{host} %{message}2021-12-16T03:59:36.095Z %{host} %{message}2021-12-16T03:59:36.096Z %{host} %{message}2021-12-16T03:59:36.096Z %{host} %{message}2021-12-16T03:59:36.096Z %{host} %{message}2021-12-16T03:59:36.096Z %{host} %{message}2021-12-16T03:59:37.096Z %{host} %{message}2021-12-16T03:59:37.097Z %{host} %{message}2021-12-16T03:59:37.096Z %{host} %{message}2021-12-16T03:59:37.096Z %{host} %{message}2021-12-16T03:59:37.096Z %{host} %{message}2021-12-16T03:59:37.097Z %{host} %{message}2021-12-16T03:59:37.096Z %{host} %{message}2021-12-16T03:59:37.096Z %{host} %{message}2021-12-16T03:59:37.096Z %{host} %{message}2021-12-16T03:59:37.097Z %{host} %{message}2021-12-16T03:59:38.096Z %{host} %{message}2021-12-16T03:59:38.096Z %{host} %{message}2021-12-16T03:59:38.096Z %{host} %{message}2021-12-16T03:59:38.095Z %{host} %{message}2021-12-16T03:59:38.096Z %{host} %{message}2021-12-16T03:59:38.096Z %{host} %{message}2021-12-16T03:59:38.096Z %{host} %{message}2021-12-16T03:59:38.095Z %{host} %{message}2021-12-16T03:59:38.096Z %{host} %{message}2021-12-16T03:59:38.096Z %{host} %{message}2021-12-16T03:59:39.096Z %{host} %{message}2021-12-16T03:59:39.096Z %{host} %{message}2021-12-16T03:59:39.096Z %{host} %{message}2021-12-16T03:59:39.096Z %{host} %{message}2021-12-16T03:59:39.095Z %{host} %{message}2021-12-16T03:59:39.096Z %{host} %{message}2021-12-16T03:59:39.096Z %{host} %{message}2021-12-16T03:59:39.095Z %{host} %{message}2021-12-16T03:59:39.096Z %{host} %{message}2021-12-16T03:59:39.097Z %{host} %{message}2021-12-16T03:59:40.096Z %{host} %{message}2021-12-16T03:59:40.096Z %{host} %{message}2021-12-16T03:59:40.096Z %{host} %{message}2021-12-16T03:59:40.096Z %{host} %{message}2021-12-16T03:59:40.096Z %{host} %{message}
```

(The records run together with no newlines; this is the content exactly as it appears in the uploaded file.)

roimor commented 2 years ago

I am having the same issue. @shaojielinux did you find a workaround?

yaauie commented 2 years ago

You appear to be using `codec => plain` in your output plugin. An output codec is responsible for converting a sequence of structured events into a sequence of bytes, and the Plain codec is probably not what you are looking for: with no `format` option set, it renders each event with a default template that references `%{host}` and `%{message}`, and a sprintf reference to a field the event does not have is written out literally, which is exactly what you see in the S3 file.

If you want the events to be appended, as structured, to a newline-delimited JSON stream, you should likely use `codec => json_lines` instead.
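A sketch of the corrected output block, assuming the rest of the pipeline stays the same (credentials, region, and bucket left blank as in the original config):

```
output {
  s3 {
    access_key_id => ""
    secret_access_key => ""
    region => ""
    bucket => ""
    prefix => "%{[container_name]}/%{+YYYY}/%{+MM}/%{+dd}"
    temporary_directory => "/tmp/logstash_s3"
    encoding => "gzip"          # gzip compression is independent of the codec
    codec => "json_lines"       # one JSON object per event, newline-delimited
    rotation_strategy => "time"
    time_file => 5
  }
}
```

The `json_lines` codec serializes each event as JSON and appends a newline delimiter, so the gzipped objects in S3 should decompress to standard NDJSON rather than the concatenated placeholder strings shown above.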