Open romanklos87 opened 3 years ago
This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.
This issue was closed because it has been stalled for 5 days with no activity.
Why is this issue being ignored? It is 2022, and the problem still remains.
Taken from codeowners as no owner exists for the loki plugin @edsiper @leonardo-albertovich @fujimotos @koleini
Could one of the maintainers please reopen this issue? The problem remains.
After looking at this alongside the source, I found the problem here, but I'm not familiar enough with the code base to offer a definitive fix. If a maintainer can point me in the right direction, I can open a PR.
The problem is pretty simple: it comes from the safe string parsing when generating a JSON string to send in the payload to Loki here. Removing this \n check fixes the issue in Loki, but since this is a utility function, that's probably not a desired result across all of fluent-bit.
What would be the best way to ensure \n and \t are excluded from safe string parsing for the loki output?
The call to it in loki comes from here. No combination of decoders seems to change the outcome, as the \n is always escaped when parsed for the HTTP body.
This is similar to a lot of the decoding woes described here: https://github.com/fluent/fluent-bit/issues/1278, which @edsiper worked through with some success. Again, with any form of guidance here I could probably open a PR to fix this, but changing a utility function doesn't seem like the right approach.
{
  "streams": [
    {
      "stream": {
        "label": "value"
      },
      "values": [
        [ "1665195203094100836", "hello\nworld" ]
      ]
    }
  ]
}
The above results in a properly parsed log line with a new line. From what I can tell, because of the util function, the following is being sent instead:
{
  "streams": [
    {
      "stream": {
        "label": "value"
      },
      "values": [
        [ "1665195203094100836", "hello\\nworld" ]
      ]
    }
  ]
}
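The difference between the two payloads can be reproduced outside of fluent-bit. The following is a minimal Python sketch, not the plugin's actual C code: the helper name loki_payload is hypothetical, and the assumption is simply that the log line already holds the two characters backslash and n by the time it is serialized, so the JSON encoder escapes the backslash a second time.

```python
import json

# A log line holding a real newline character (a single 0x0A byte).
real_newline = "hello\nworld"

# A log line that already holds the two characters backslash + "n",
# for example because an earlier stage escaped it and nothing undid that.
already_escaped = "hello\\nworld"


def loki_payload(line: str) -> str:
    """Hypothetical helper: build a push body shaped like the examples above."""
    body = {
        "streams": [
            {
                "stream": {"label": "value"},
                "values": [["1665195203094100836", line]],
            }
        ]
    }
    return json.dumps(body)


good = loki_payload(real_newline)      # contains "hello\nworld" on the wire
bad = loki_payload(already_escaped)    # contains "hello\\nworld" on the wire

print(good)
print(bad)
```

Decoding the first payload yields a line with a real line break, while decoding the second yields the literal text \n, which matches what Grafana displays.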
Feel free to ping me in the public slack server @braunsonm, I'm not sure what the solution might be in this case but if you have a simple repro case I might be able to give you some feedback and guide you through the process.
This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 5 days. Maintainers can add the exempt-stale label.
This continues to be a problem.
This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 5 days. Maintainers can add the exempt-stale label.
Still a problem.
What's going on with this problem? I'm facing this issue.
TBH, I dropped the ball. I think we'll need some more input here, because at the time I wasn't really convinced that we could make this change in a way that didn't cause issues, and I wasn't even sure it was something that should be done.
My memory is fuzzy because of how much time has passed, but I think if we want to approach this again we might have to take a step back and consider other options, such as making a localized patch in the loki output plugin.
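To make the idea of a localized patch more concrete, here is a rough Python sketch of the kind of transformation such a patch might apply to a log line just before it is serialized for Loki. It is an illustration only: the function name unescape_whitespace is hypothetical, it is not taken from fluent-bit, and whether this behavior is safe for every record is exactly the open question in this thread.

```python
def unescape_whitespace(text: str) -> str:
    # Hypothetical helper: turn the two-character sequences backslash+n and
    # backslash+t into a real newline and tab. The string is scanned once so
    # that an escaped backslash pair followed by "n" or "t" is left alone.
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "\\" and i + 1 < len(text):
            nxt = text[i + 1]
            if nxt == "n":
                out.append("\n")
                i += 2
                continue
            if nxt == "t":
                out.append("\t")
                i += 2
                continue
            if nxt == "\\":
                # Keep an escaped backslash pair intact and skip past it.
                out.append("\\\\")
                i += 2
                continue
        out.append(ch)
        i += 1
    return "".join(out)


print(repr(unescape_whitespace("hello\\nworld")))
```

Restricting the change to \n and \t, and to the loki output only, would leave the shared utility function untouched for every other plugin.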
@leonardo-albertovich
Thanks for replying.
Would this problem be solved if the log collection structure was sent via fluentd rather than directly from fluentbit to loki?
That's something I have absolutely no idea as I have zero experience with fluentd. Sending logs from fluent-bit to fluentd using the forward protocol is really simple though so it might be worth a shot as a temporary workaround.
Any update on this?
We are facing the same issue. See the \n below.
log content: displayed using grep SCN logfile.txt | cat -A
<txt>Completed checkpoint up to RBA [0x21a1.2.10], SCN: 481007916$ </txt>$
fluent-bit generated file (part of it) using Output plugin 'file'
"log":{"message":"Completed checkpoint up to RBA [0x21a1.2.10], SCN: 481007916\n ","time":"2023-05-19T17:24:32.282"}
Agreed, it would be great to fix this. I know @leonardo-albertovich was looking into it at one point.
I did, and I gave up at the time. I think I communicated what I found and why I couldn't wrap it up, but I can't seem to find that now.
I will save this and try to take a look as soon as possible, but I can't make any promises, especially taking into account that I already took a look and for some reason couldn't do it.
I just gave up on the plugin and now use the grafana-loki plugin instead. I compile it from https://github.com/grafana/loki/tree/main/clients/cmd/fluent-bit using make, which creates an out_grafana_loki.so for the target architecture, put that plugin somewhere like /fluent-bit/etc/out_grafana_loki.so, and tell fluent-bit to load it. Then you can follow the docs at https://grafana.com/docs/loki/latest/clients/fluentbit/ and have more options for configuring delivery to Loki.
I know this doesn't fix the problem here, but it could be an alternative instead of waiting on someone to fix something no one wants to maintain.
Does it still compile with the 2.1.x Fluent Bit version? Since the beginning, the Loki plugin has given different output than the Grafana Go plugin :(
I do not know.
My implementation was based off of just using the build process/image from https://github.com/aws/aws-for-fluent-bit.git which is currently on 1.9.10.
This was my build process:
I compile the plugin directly from the loki repo:
ssh ECSBOX-AMD64-CHIPSET # to make sure whatever ECS system can use that plugin
git clone https://github.com/grafana/loki
cd loki
make fluent-bit-plugin # makefile in there to create the binary for grafana/loki for firelens
exit
scp EC2BOX-AMD64-CHIPSET:/clonepath/loki/clients/cmd/fluent-bit/out_grafana_loki.so .
Then added it to the init process for aws-for-fluent-bit:
git clone https://github.com/aws/aws-for-fluent-bit.git
vim Dockerfile.init
#change code
FROM amazon/aws-for-fluent-bit:latest
+ADD out_grafana_loki.so /fluent-bit/
RUN mkdir -p /init
#
vim init/fluent_bit_init_process.go
#change code
// default Fluent Bit command
- baseCommand = "exec /fluent-bit/bin/fluent-bit -e /fluent-bit/firehose.so -e /fluent-bit/cloudwatch.so -e /fluent-bit/kinesis.so"
+ baseCommand = "exec /fluent-bit/bin/fluent-bit -e /fluent-bit/firehose.so -e /fluent-bit/cloudwatch.so -e /fluent-bit/kinesis.so -e /fluent-bit/out_grafana_loki.so"
// global s3 client and flag
#
I built my own image so that the plugin taken directly from Loki is included in the image that AWS maintains. That keeps the maintenance effort much lower.
Bug Report
Describe the bug
I am sending Java logs in JSON format from Fluent-bit to Loki. New-line (\n) and tab (\t) characters within a record are shown as plain text in Grafana, for example a literal \n instead of a line break. It looks like there is an issue with the plugin.
The same thing was tested on the same instance and application with FluentD. Logs were forwarded from Fluent-bit to FluentD and then to Loki. Grafana shows the FluentD JSON logs correctly.
To Reproduce
Just send a JSON exception log message.
Expected behavior
Correct view of the exception message in Grafana (real new-lines and tabs instead of \n and \t).
Screenshots
Grafana output of exception from Fluent-bit
Grafana output of exception from FluentD (Fluent-bit is collector)
Your Environment
td-agent-bit.x86_64 1.6.6-1
Configuration: Fluent-bit config:
[INPUT]
    name systemd
    tag java_app.*
    systemd_Filter _SYSTEMD_UNIT=java_app.service
[FILTER]
    Name parser
    Match java_app.*
    Key_Name MESSAGE
    Reserve_Data On
    Parser json
[OUTPUT]
    name loki
    match *
    host
    port 3100
    labels job=fluentbit
<match td..> @type tdlog @id output_td apikey YOUR_API_KEY
auto_create_table
<match debug.**> @type stdout @id output_stdout
<label @java_app> <filter **> @type parser key_name MESSAGE reserve_data true remove_key_name_field true
OpenJDK Runtime Environment Corretto-11.0.9.12.1 (build 11.0.9.1+12-LTS) OpenJDK 64-Bit Server VM Corretto-11.0.9.12.1 (build 11.0.9.1+12-LTS, mixed mode)
Grafana v7.3.3 (2489dc4d3a)