Open owen-d opened 3 years ago
Hi Loki users,
As part of the Fluent Bit team, we want to bring a first-class citizen experience with Loki and we would like to know what are the specific missing features in our built-in connector:
Since the Golang connector will be deprecated, please let us know what is needed to prioritize on our side.
thanks.
@owen-d I swapped over from the native Fluentd implementation to the native Fluent Bit implementation as soon as Loki v2.4 was released and I've been very happy with both parts.
@edsiper I think there are a few outstanding Loki issues on the Fluent Bit repo that need triaging against the latest versions? Off the top of my head the following areas need looking into, but I've not seen any of them since upgrading to the latest versions:
@edsiper We have out of order now available, so we should be able to change the implementation of fluentbit to send batches in parallel.
I would very much like to see this plugin continue to be supported. We prefer having a golang output plugin available, as our team is significantly deeper in Go than C skills, and in addition to supporting the code, we intend to make a couple of small, local modifications.
I understand (correct me if I'm mistaken @edsiper ) that there are a couple of areas in which the native plugin needs to be brought up to par (e.g. support for batch compression), and while those are definitely good things to have, their addition to the C plugin doesn't help our specific case.
So I would ask the Loki team to put off deprecating the golang plugin for now, if possible.
I personally agree with deprecation: there is a fair bit of confusion with mismatches in configuration across the two plugins and so people follow a blog post/etc. for the Grafana one but use the Fluent one and then get failures. There is also the duplication of effort required: implement a feature in one then in the other (maybe slightly differently) plus the fragmentation of features. Having a single official plugin is much preferable for support, documentation, development and testing.
@owen-d what was the outcome of this? Just curious if the plan is to deprecate or not - and when if so?
We first used the native fluent-bit plugin and then had to switch to Loki's one, as sending to Loki started failing after a while (maybe after a disconnection or a small network interruption).
@edsiper Where is the custom labels functionality? In grafana/fluent-bit, `LabelMapPath` is available in the Loki output. It would be cool to use custom_label_map.json.
So, is it possible to implement it?
@nokute78 can you implement the LabelMapPath feature, please? Ref:
https://grafana.com/docs/loki/latest/clients/fluentbit/#labelmappath
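For readers unfamiliar with the feature being requested: a label map is a JSON document that mirrors the nested shape of a log record, where each leaf string names the Loki label that the matching record value should become. The following is a minimal sketch of that idea, not the code of either plugin; the function name `apply_label_map`, the sample map, and the sample record are all hypothetical.

```python
from typing import Any, Dict


def apply_label_map(record: Dict[str, Any], label_map: Dict[str, Any]) -> Dict[str, str]:
    """Walk label_map and record in parallel; each leaf string names a Loki label."""
    labels: Dict[str, str] = {}
    for key, target in label_map.items():
        if key not in record:
            continue  # keys absent from the record are skipped
        value = record[key]
        if isinstance(target, dict):
            # Nested map: only descend when the record is nested at the same key.
            if isinstance(value, dict):
                labels.update(apply_label_map(value, target))
        else:
            # Leaf: the map value is the label name, the record value is the label value.
            labels[str(target)] = str(value)
    return labels


# Hypothetical label map and record, for illustration only.
label_map = {
    "kubernetes": {"container_name": "container", "namespace_name": "namespace"},
    "stream": "stream",
}
record = {
    "log": "hello",
    "stream": "stdout",
    "kubernetes": {"container_name": "api", "namespace_name": "prod", "pod_id": "x"},
}
labels = apply_label_map(record, label_map)
print(labels)
```

Record keys that the map does not mention (`log` and `pod_id` here) do not become labels, which keeps label cardinality under the author's control.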
@edsiper I created a patch to support label_map_path
https://github.com/fluent/fluent-bit/pull/6040
awesome! thanks @nokute78 !
Awesome! Many thanks @nokute78 @edsiper !
Hello! It's been almost 2 years since the creation of this issue. What is the situation now? Is there a feature roadmap to fill the gaps, if any, between grafana-loki plugin and the Fluentbit's builtin Loki output?
AFAIK, these parameters are not supported by the builtin output:
Parameter | Description | Default |
---|---|---|
BatchWait | Time to wait before send a log batch to Loki, full or not. | 1s |
BatchSize | Log batch size to send a log batch to Loki (unit: Bytes). | 10 KiB (10 * 1024 Bytes) |
Timeout | Maximum time to wait for loki server to respond to a request. | 10s |
MinBackoff | Initial backoff time between retries. | 500ms |
MaxBackoff | Maximum backoff time between retries. | 5m |
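To make the interaction between `BatchWait` and `BatchSize` concrete, here is a rough sketch of the usual size-or-time batching rule: a batch is sent when adding a line would exceed the size limit, or when the batch has been pending for the wait period, whichever comes first. This is an illustration under those assumptions, not the plugin's actual code; the `Batcher` class and its method names are hypothetical, and time is passed in explicitly to keep the example deterministic.

```python
from typing import List, Optional


class Batcher:
    """Size-or-time batching, with defaults taken from the table above."""

    def __init__(self, batch_size: int = 10 * 1024, batch_wait: float = 1.0) -> None:
        self.batch_size = batch_size  # bytes
        self.batch_wait = batch_wait  # seconds
        self.lines: List[bytes] = []
        self.size = 0
        self.started: Optional[float] = None

    def add(self, line: bytes, now: float) -> Optional[List[bytes]]:
        """Add a line; return the previous batch if this line would overflow it."""
        flushed = None
        if self.lines and self.size + len(line) > self.batch_size:
            flushed = self.flush()
        if not self.lines:
            self.started = now  # the wait clock starts with the first line of a batch
        self.lines.append(line)
        self.size += len(line)
        return flushed

    def poll(self, now: float) -> Optional[List[bytes]]:
        """Return the pending batch, full or not, once it has waited batch_wait."""
        if self.lines and now - self.started >= self.batch_wait:
            return self.flush()
        return None

    def flush(self) -> List[bytes]:
        batch, self.lines, self.size, self.started = self.lines, [], 0, None
        return batch


b = Batcher(batch_size=10, batch_wait=1.0)
first = b.add(b"aaaaaa", now=0.0)   # fits, nothing to send yet
second = b.add(b"bbbbbb", now=0.1)  # would overflow, so the first batch is returned
timed = b.poll(now=1.2)             # second batch has waited long enough
```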
And some others could be achieved using other non loki output specific FBit output parameters:
e.g. #1
Parameter | Description | Default |
---|---|---|
MaxRetries | Maximum number of retries when sending batches. Setting it to 0 will retry indefinitely. | 10 |
This could be roughly achieved using the Retry_Limit parameter.
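As a rough illustration of how `MinBackoff`, `MaxBackoff` and `MaxRetries` combine, the sketch below computes a plain doubling schedule capped at the maximum. It assumes simple exponential backoff without jitter (real clients often add jitter), and it does not model the "0 means retry indefinitely" case; the function name `backoff_schedule` is hypothetical.

```python
from typing import List


def backoff_schedule(
    min_backoff: float = 0.5,    # MinBackoff default, in seconds
    max_backoff: float = 300.0,  # MaxBackoff default (5m), in seconds
    max_retries: int = 10,       # MaxRetries default; 0 (infinite) is not modeled here
) -> List[float]:
    """Return the delay before each retry: doubling from min_backoff, capped at max_backoff."""
    delays: List[float] = []
    delay = min_backoff
    for _ in range(max_retries):
        delays.append(min(delay, max_backoff))
        delay *= 2
    return delays


schedule = backoff_schedule()
print(schedule)
print(sum(schedule))  # total time spent waiting before giving up
```

Under these assumptions the default ten retries never reach the five-minute cap; it only takes effect from the eleventh retry onward.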
e.g. #2
Parameter | Description | Default |
---|---|---|
Buffer | Enable buffering mechanism | false |
BufferType | Specify the buffering mechanism to use (currently only dque is implemented). | dque |
DqueDir | Path to the directory for queued logs | /tmp/flb-storage/loki |
DqueSegmentSize | Segment size in terms of number of records per segment | 500 |
DqueSync | Whether to fsync each queue change. Specify no fsync with “normal”, and fsync with “full”. | “normal” |
DqueName | Queue name, must be uniq per output | dque |
Buffering could be implemented using Fluent Bit's own storage parameters and limiting the size.
Am I right? Are there any other important differences between the 2 implementations?
@aleonsan thanks for the good analysis - any chance you can raise an issue on the OSS repo to track the new features we may need? https://github.com/fluent/fluent-bit
This is a Grafana repo so I cannot comment on their roadmap but from an OSS perspective I'd really like to make sure we have feature parity and a migration approach. We regularly get issues raised due to using the Grafana docs but the OSS image, plus the current Grafana image is now based on an unsupported 1.9 version of OSS - we're up to 2.1.7 as of today with a load of new features including OTEL compliance.
Hello,
First of all, thank you for maintaining cool OSS products.
I have 2 feedback items on the migration from the `grafana-loki` plugin to the `loki` plugin:
After the migration we observed 6-8x higher traffic on loki-gateway than before. According to my investigation, the `grafana-loki` plugin (actually the promtail client) uses `application/x-protobuf` with snappy compression, but the `loki` plugin uses `application/json` with no compression. It would be nice if the `loki` plugin also supported compression to reduce network traffic.
https://github.com/grafana/loki/blob/v2.9.0/clients/pkg/promtail/client/client.go#L442-L453 https://github.com/fluent/fluent-bit/blob/v2.1.8/plugins/out_loki/loki.c#L1566-L1569
The `loki` plugin sometimes sends requests larger than 1 MB, which loki-gateway rejects by default. To accept such requests, we had to change `client_max_body_size` of loki-gateway to `3m`. According to Fluent Bit's docs, a chunk is usually about 2 MB, so we chose 3 MB for `client_max_body_size`. It would therefore be nice to set `client_max_body_size` to 3 MB by default for loki-gateway in the Helm chart.
https://nginx.org/en/docs/http/ngx_http_core_module.html#client_max_body_size https://github.com/grafana/helm-charts/blob/loki-distributed-0.74.1/charts/loki-distributed/values.yaml#L1146
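The traffic difference described in the first item comes largely from sending repetitive log text uncompressed. The sketch below builds a JSON body in the shape of Loki's push API (`streams` / `stream` / `values`) and compares its size before and after compression. Note the assumptions: gzip from the Python standard library is used as a stand-in for snappy, the log lines are synthetic, and the resulting ratio is illustrative only and is not a reproduction of the 6-8x figure reported above. The helper name `build_push_payload` is hypothetical.

```python
import gzip
import json
from typing import Dict, List


def build_push_payload(labels: Dict[str, str], lines: List[str], start_ns: int) -> dict:
    """Build a JSON push body: one stream, one [timestamp_ns, line] pair per entry."""
    values = [[str(start_ns + i), line] for i, line in enumerate(lines)]
    return {"streams": [{"stream": labels, "values": values}]}


# Synthetic, repetitive log lines (an assumption; real logs will compress differently).
lines = [
    f'level=info msg="request served" path=/api/v1/items status=200 id={i}'
    for i in range(1000)
]
payload = build_push_payload({"job": "demo"}, lines, 1_700_000_000_000_000_000)

raw = json.dumps(payload).encode("utf-8")
compressed = gzip.compress(raw)  # stand-in for snappy, which is not in the stdlib

print(f"uncompressed: {len(raw)} bytes")
print(f"compressed:   {len(compressed)} bytes")
print(f"ratio:        {len(raw) / len(compressed):.1f}x")
```

The same measurement also helps with the second item: comparing `len(raw)` against the gateway's body-size limit shows whether a given chunk would be rejected before it is ever sent.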
@ksauzz any feedback on the OSS side needs to be fed back to the OSS repo rather than this Grafana one otherwise it won't be seen. https://github.com/fluent/fluent-bit
I started with native and had to switch to grafana-loki for this reason:
The native Fluent Bit loki plugin does not support a custom URI: you can only set the Host and the Port, but you have no control over the URI (the path). With the grafana-loki plugin, you can set a full Url.
Now I'm struggling to configure TLS with the grafana-loki plugin. I don't see in the documentation that it's possible. If anyone has an idea, please help.
@Turkish thanks for your feedback. I have submitted a PR to implement that feature in Fluent Bit:
Hey folks, just wanted to check what else is needed to complete the transition. The last two missing pieces, compression and a configurable URI, have been addressed. Please report anything else that is missing here.
It would be interesting to find a solution for how to push structured metadata to Loki using the fluent-bit Loki output.
OSS Fluent Bit does include an additional optional metadata section in every record now, primarily to support some of the OTEL requirements I believe. This potentially could be used.
Hey guys, I just saw on this website that the Grafana fluentbit helm chart is now deprecated and that using the official helm chart is recommended. Does that apply only to the helm chart, or is the Grafana fluentbit implementation also deprecated? I ask because the official Grafana documentation for their fluentbit implementation is still there.
Hi folks, regarding the initial requirements around batching: is this still highly necessary? I would like to understand how urgent this is.
Hello! A long time ago we wrote a plugin for ingesting logs in Loki from `fluentbit`. These days, there's a native output option for Loki available in fluentbit itself courtesy of @edsiper. I'm opening this issue to solicit feedback from Loki users currently using either of these in hopeful preparation for deprecating our plugin in favor of theirs. We hope the introduction of `out of order` support in Loki has helped make this feasible :)
cc @cyriltovena