fluent / fluent-bit

Fast and Lightweight Logs and Metrics processor for Linux, BSD, OSX and Windows
https://fluentbit.io
Apache License 2.0

fluent-bit out_http msgpack format is not compatible with fluentd in_http #709

Open ekesken opened 6 years ago

ekesken commented 6 years ago

I'm trying to send logs from fluentbit (0.13) via out_http to the in_http source of fluentd (v1.2.4). It works with the "format json" setting, but when I switch to "format msgpack", I get this error:

500 Internal Server Error
undefined method `[]=' for 1533737116:Fixnum
Did you mean? []

the error is thrown here: https://github.com/fluent/fluentd/blob/master/lib/fluent/plugin/in_http.rb#L230

the record received from fluentbit looks like this:

record=[2018-08-08 14:35:20.002109900 +0000, {"cpu_p"=>1.7500000000000002, "user_p"=>1.0, "system_p"=>0.75, "cpu0.p_cpu"=>0.0, "cpu0.p_user"=>0.0, "cpu0.p_system"=>0.0, "cpu1.p_cpu"=>5.0, "cpu1.p_user"=>3.0, "cpu1.p_system"=>2.0, "cpu2.p_cpu"=>1.0, "cpu2.p_user"=>0.0, "cpu2.p_system"=>1.0, "cpu3.p_cpu"=>1.0, "cpu3.p_user"=>1.0, "cpu3.p_system"=>0.0}]

if I choose "format json", the record looks like this:

record=[{"date"=>1533795317.001521, "cpu_p"=>2.25, "user_p"=>0.5, "system_p"=>1.75, "cpu0.p_cpu"=>1.0, "cpu0.p_user"=>0.0, "cpu0.p_system"=>1.0, "cpu1.p_cpu"=>6.0, "cpu1.p_user"=>1.0, "cpu1.p_system"=>5.0, "cpu2.p_cpu"=>1.0, "cpu2.p_user"=>0.0, "cpu2.p_system"=>1.0, "cpu3.p_cpu"=>1.0, "cpu3.p_user"=>1.0, "cpu3.p_system"=>0.0}]

as I understand from here: https://github.com/fluent/fluent-bit/blob/master/plugins/out_http/http.c#L435 fluentbit sends what it got from the input plugin to fluentd unchanged if the format is msgpack. if the format is json, it first converts the msgpack to json, takes the first item in the array as the time, and injects it into the record as the date field.
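The reshaping described above can be sketched in a few lines of Ruby. This is only an illustration of the described behaviour, not Fluent Bit's actual C code; `inject_date` is a hypothetical helper name, and the sample values are taken from the json record shown earlier.

```ruby
# Sketch only: mimics the reshaping described in the issue text.
# `inject_date` is a hypothetical name, not a Fluent Bit or Fluentd API.
def inject_date(event)
  # A msgpack event is a two-element array: [time, record].
  time, record = event
  # Put the timestamp first under "date", as in the "format json" output.
  # Note: a "date" key already present in the record would override it.
  { "date" => time.to_f }.merge(record)
end

msgpack_event = [1533795317.001521, { "cpu_p" => 2.25, "user_p" => 0.5 }]
p inject_date(msgpack_event)
```

The result is a single hash, which is the shape an endpoint that expects plain records can handle.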

because of this behavioural difference, fluentd takes the timestamp 2018-08-08 14:35:20.002109900 +0000 as single_record and fails here: https://github.com/fluent/fluentd/blob/master/lib/fluent/plugin/in_http.rb#L204 with undefined method `[]=' for 1533737116:Fixnum
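A minimal Ruby sketch of this failure mode, assuming the handler iterates over an array payload and assigns a key on every element. `add_remote_addr` and `try_payload` are hypothetical stand-ins, not the actual in_http.rb code, and the time is written as a plain integer to match the error message above.

```ruby
# Hypothetical stand-in for a handler that expects an array of record hashes.
# This is NOT the actual in_http.rb code.
def add_remote_addr(payload, remote_addr)
  payload.map do |single_record|
    # Works only when single_record is a Hash; an Integer has no []= method.
    single_record["REMOTE_ADDR"] = remote_addr
    single_record
  end
end

# Returns the processed records, or the NoMethodError raised on the way.
def try_payload(payload)
  add_remote_addr(payload, "127.0.0.1")
rescue NoMethodError => e
  e
end

json_style    = [{ "date" => 1533795317.001521, "cpu_p" => 2.25 }]
msgpack_style = [1533737116, { "cpu_p" => 1.75 }]

p try_payload(json_style)     # array of hashes: every element accepts []=
p try_payload(msgpack_style)  # first element is the time: NoMethodError
```

On newer Ruby versions the class in the message is Integer rather than Fixnum, but the cause is the same: the first array element is a time, not a record.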

it seems very fair to expect fluentbit out_http to work with fluentd in_http via msgpack format, but I couldn't find people complaining about the incompatibility. am I wrong? I'd be glad if somebody can verify the existence of this incompatibility.

since the behavioural difference is caused by fluentbit, the best fix seems to be adding a new format type to out_http on the fluentbit side, something like msgpack_date_in_record.

edsiper commented 6 years ago

The formal way to send data from Fluent Bit to Fluentd (or Fluentd to Fluentd) is through the Forward protocol, please refer to the forward plugin documentation.

ekesken commented 6 years ago

the problem with the forward plugin is that you need a "shared key" between clients and server. with out_http <--> in_http you don't need to keep a "shared key", and you don't have to deal with the revocation problem for that key. so I don't think the forward plugin is a proper replacement.

edsiper commented 6 years ago

are you actually using some authentication mechanism over HTTP?

ekesken commented 6 years ago

we're using basic auth with SSL in front of the fluentd endpoint, but this is not just about authentication. agents behind HTTP proxies might force you to use an out_http <---> in_http setup as well, so I believe it should support the msgpack format in some way.

hochdorf commented 6 years ago

Actually, this can also be important if the fluentd instance is running in a kubernetes cluster behind an ingress controller. If your cluster has only one public endpoint (e.g. port 80), then you need to use HTTP, because this protocol is supported by the known ingress controllers and only HTTP can be routed dynamically, based on the domain name, to the corresponding pod. If you use a plain TCP connection (as the forward plugin does), then your ingress needs a dedicated port assigned only to the fluentd/fluentbit traffic. This is problematic in many situations...

toconnor commented 6 years ago

Can the "fixed" label be removed? The workaround of using the forward plug-in is not applicable in all cases for the reasons mentioned above (revocation of shared key, http proxies, ingress controller). Seems like out_http -> in_http with msgpack should be functional.

countingtoten commented 5 years ago

I also encountered this issue. Our edge firewall doesn't allow TCP, only HTTP(S) requests. We had to work around fluent-bit to fluentd over HTTP not working in order to get logs from services running outside of our network.

speedchair commented 5 years ago

Same issue, same use case as above.

naphta commented 5 years ago

Same issue for me, similar use case. Seems bizarre that they aren't compatible. I tried using the forward plugin originally but had issues routing the TCP traffic through kubernetes.

Seems ridiculous that this has been marked as fixed.

toconnor commented 5 years ago

@edsiper Can we have the "fixed" label removed from this issue? Using the forward plug-in is not applicable in all cases. Seems like out_http -> in_http with msgpack should be functional.

edsiper commented 4 years ago

Moving this as an enhancement request.