Closed: neilmunday closed this issue 1 year ago
@cyriltovena @slim-bean This is a good feature to support. However, it would require us to package the GeoLite2 (or a similar) database file alongside Promtail. We should also figure out whether the license allows us to do this.
WDYT?
nginx with the ngx_http_geoip_module can write a geodata tag to the access log; all that's needed is a new "Worldmap Panel" plugin for Grafana with support for Loki as a datasource.
nginx log WITH GEODATA TAG -> Promtail -> Loki -> Grafana
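For reference, the nginx side of that flow is a couple of directives from ngx_http_geoip_module; this is a sketch, with illustrative paths and log format:

```nginx
# Sketch: ngx_http_geoip_module exposes $geoip_country_code, which a custom
# log_format can append to every access-log line (paths are illustrative).
http {
    geoip_country /usr/share/GeoIP/GeoIP.dat;

    log_format geo '$remote_addr - [$time_local] "$request" '
                   '$status $body_bytes_sent "$geoip_country_code"';

    access_log /var/log/nginx/access.log geo;
}
```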
P.S. I'm just surprised that Grafana Labs didn't realize such a simple thing, even at the time Loki was announced.
Hi @WarraxUA and folks. I've been using a preview branch of the upcoming metrics and field extraction feature. This allowed me to build the below dashboard, with metrics on high-cardinality fields. For the Worldmap I added the GeoIP module to Nginx and added the country name to the log output. With the following expression I was able to sum by country_code as input for the Worldmap panel (syntax pending change, and it's a bit double-escaped):
sum by (country_code) (count_over_time({filename=\"/var/log/nginx/access.log\"} | regexp \"HTTP\\\\/1\\\\.1\\\" (?P<statuscode>\\\\d{3}) (?P<bytessent>\\\\d+) (?P<refferer>\\\".*?\\\") \\\"(?P<useragent>.*)\\\" \\\"(?P<country_code>.*)\\\"\"[$__interval]))
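For readability, the same expression without the double escaping is roughly the following (using backtick-quoted LogQL strings; the pre-release syntax shown here has since changed, so treat this as a sketch):

```logql
sum by (country_code) (
  count_over_time(
    {filename="/var/log/nginx/access.log"}
      | regexp `HTTP/1\.1" (?P<statuscode>\d{3}) (?P<bytessent>\d+) (?P<refferer>".*?") "(?P<useragent>.*)" "(?P<country_code>.*)"`
    [$__interval]
  )
)
```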
Looking good!
That looks better than good to me. Is the preview branch you speak of available publicly? Alternatively, is there a time frame in which it will be available? I am currently in the prototyping stage of my project, so running unreleased software isn't a concern.
Here is the repo https://github.com/cyriltovena/demo/blob/master/logql/docker-compose.yaml#L8
There's a small readme, but I also gave a talk at GrafanaCon about this: https://grafana.com/go/grafanaconline/loki-future/ — see near the end; when you hear my weird and funny French accent, you've found it 😂
As for an ETA, that's hard: we're still trying to make sure the syntax is easy to use and learn, as we will live with it forever.
So: soon™.
This issue has been automatically marked as stale because it has not had any activity in the past 30 days. It will be closed in 7 days if no further activity occurs. Thank you for your contributions.
Are there any updates? It's an important feature.
Being able to enrich data, either upon collection in Promtail (via a plugin?) or when that data lands in Loki, is really very important.
I would love seeing this too.
Any news?
Here's just a thought: wouldn't it be preferable to enrich the data after the logs are collected? Then one wouldn't have to add extra overhead on the web server. To be honest, I really don't know how much overhead the GeoIP lookup would add, but if you have many sites it could have an impact.
A comment to keep this issue open.
Example dashboard from wardbekker1 with GeoIP: https://grafana.com/grafana/dashboards/12559
The dashboards shared in this thread are very nice, but they don't address the issue highlighted by the OP, which is: there is currently no way to enrich data either within Promtail or at the point of ingestion into Loki. GeoIP is a good example of this, but it applies to any enrichment of collected log data using external lookups. So, if you have a GeoIP field in the source log data, extracting it (and displaying it) is easy enough. If you don't have GeoIP in the source, then adding this label data is not possible in the flow at present.
(This is possible using fluentd as a client, but then you're stepping outside the stack.)
Hope this helps clarify what is being requested here, as things seem to have got muddied over time.
A comment to keep this issue open.
Just a comment to keep this issue open.
I'd also be interested in this.
My particular use-case is taking logs from network devices into Loki via promtail and then mapping them in Grafana.
I'm basically trying to set up a solution similar to the Elastic.co SIEM that comes with Kibana, but using Grafana/Prometheus/Loki.
It would be nice if the promtail pipeline could exec/shell out to enrich. That could allow something like using the Debian package geoip-bin with some pretend config like:
match: ...            # regex to grab an IP out of an apache or nginx log
exec_cache_ttl: 10s   # option to cache the exec call
exec: /bin/bash -c "geoiplookup {regex_group_match_ip}"
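A rough Python sketch of what such an exec stage could do. Everything here is hypothetical: the function names are mine, `geoiplookup` (from geoip-bin) is assumed to be on the PATH, and the command is injectable so the sketch can be exercised without it:

```python
import subprocess
from functools import lru_cache

# Hypothetical sketch of an "exec" enrichment stage: shell out once per
# distinct IP and cache the answer (the cache stands in for exec_cache_ttl).
@lru_cache(maxsize=1024)
def lookup(ip: str, command: str = "geoiplookup") -> str:
    """Run the external lookup command for one IP and return its stdout."""
    result = subprocess.run([command, ip], capture_output=True, text=True)
    return result.stdout.strip()

def enrich(line: str, ip: str, command: str = "geoiplookup") -> str:
    """Append the lookup result to the raw log line."""
    return f"{line} geo={lookup(ip, command)}"
```

Repeated IPs hit the cache instead of forking a new process each time, which is the main cost concern with shelling out per log line.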
FWIW, I abandoned this idea and ended up using a combination of FluentD and Fluent-bit to do the GeoIP lookups and get the data into Loki.
That works really well, so I'm seriously considering dropping promtail completely from my infrastructure as I'm struggling to find the problem it solves right now... :(
Very nice! Do you enrich all the log lines before adding them to loki, then? I was hoping to not have to add the info to all lines, but rather have the equivalent of an SQL outer join after the lookup was done, but so far, that doesn't seem possible.
After your comment, I have been looking into fluentd as well, which seems to be a much less constrained way of massaging logs, and with the Loki plugin it seems it will solve my promtail woes on a more fundamental level, so my infrastructure will move away from promtail as well. Thank you for the (unintended) suggestion!
Thanks, and yes, that's exactly what I do.
I'm having some teething issues doing this purely in Fluent Bit, so I'm feeding the logs to FluentD and then using the geoip filter to append the data to each line.
I'm interested in this too! Promtail should implement something like what's proposed above.
This issue is still very relevant.
This would be useful to enrich the logs from Istio which doesn't have built-in GeoIP support like NGINX.
Including this functionality in Loki or Promtail would be very useful; currently I am streaming the messages through Node-RED to enrich them with GeoIP.
Hi! This issue has been automatically marked as stale because it has not had any activity in the past 30 days.
We use a stalebot among other tools to help manage the state of issues in this project. A stalebot can be very useful in closing issues in a number of cases; the most common is closing issues or PRs where the original reporter has not responded.
Stalebots are also emotionless and cruel and can close issues which are still very relevant.
If this issue is important to you, please add a comment to reopen it.
More importantly, please add a thumbs-up to the original issue entry.
We regularly sort closed issues which have a stale label by thumbs-up. We may also:
- mark an issue as revivable if we think it's a valid issue but isn't something we are likely to prioritize in the future (the issue will still remain closed);
- add a keepalive label to silence the stalebot if the issue is very common/popular/important.
We are doing our best to respond, organize, and prioritize all issues, but it can be a challenging task; our sincere apologies if you find yourself at the mercy of the stalebot.
Ping
@proffalken, have you used the fluentd and fluent-bit Helm charts to achieve this? Would you be able to share your Helm chart for sending logs from Fluent Bit >> FluentD to Loki?
@vsplunk I've just published how I'm doing it. I'm using Nomad instead of K8s, but the Nomad manifest is at https://github.com/BudgetSmartHome/home-lab-configs/blob/main/nomad-manifests/monitoring_stack.nomad if that helps you.
I'm looking to do this and will probably use logstash to do the enrichment.
But I'd be much happier to be able to enrich the data in Loki at lookup time so as to only enrich the data which people are looking at rather than enrich all the data on the way into the database^W block store.
I think this is what https://github.com/grafana/loki/issues/2120#issuecomment-728099659 was driving at too.
Hi, I managed to do it using rsyslog and lognormalizer; if anyone is interested, just ask. I didn't want to install a big log enrichment solution such as Logstash or fluentd, so I went the lightweight route. Not without its hiccups, but now it's working.
I'm interested in how you did this @pmorange. Was this enrichment en route to the block store or enrichment of the data before it was displayed in grafana?
Hi! What I wanted to achieve was to display on a map in Grafana the location of the IPs I had blocked in my geo iptables rules (those IPs outside my country). All of this was new to me, of course :-)
Here are the steps.
I have lines like these in the iptables log file:
Nov 3 15:41:22 alpine GeoIP(OUTSIDE OF ZONE): IN=eth0 OUT=br-d323db5ba3a3 MAC=00:0c:29:d4:68:ca:74:da:88:20:30:0a:08:00 SRC=80.82.65.247 DST=172.21.0.2 LEN=44 TOS=00 PREC=0x00 TTL=247 ID=54166 PROTO=TCP SPT=58694 DPT=443 SEQ=1345397855 ACK=0 WINDOW=1024 SYN URGP=0 MARK=0
What Grafana needs is a geo code to display, so I needed to convert the IP to a country code (I didn't need to be more specific, like city details, so I kept it simple, but it would not be difficult to add that level of detail). So I used rsyslog along with lognormalizer to parse the iptables lines, add the geo details, and output all of this to a new log file. I set up an account to download the MaxMind country DB and a crontab to update the DB each week. In an rsyslog configuration file I have this ruleset:
ruleset(name="geoip_ruleset") {
    action(type="mmnormalize" rulebase="/etc/lognormalizer/iptables_rule.rb")
    if ($parsesuccess == "OK") then {
        action(type="mmdblookup" mmdbfile="/var/lib/libmaxminddb/GeoLite2-Country.mmdb"
               fields=[":code:!country!iso_code"]
               key="!SRC")
        action(type="omfile" file="/var/log/ulogd_iptables_geoloc.log" template="iptablesgeoip")
    } else if ($parsesuccess == "FAIL") then {
        action(type="omfile" file="/tmp/parse-failure")
    }
}
along with this template:
template(name="iptablesgeoip" type="string" string="%$!date%-%$!SRC%-%$!src_geo!code%\n")
And the lognormalizer rulebase is the following (file iptables_rule.rb):
prefix=%date:date-rfc3164% %host:word% %tag:char-to:\x3a%:
rule=:%-:iptables%
With the log line above, I end up with this line in a new log file (ulogd_iptables_geoloc.log):
Nov 3 15:41:22-80.82.65.247-NL
=> date + IP + 2-letter country code
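For readers without rsyslog at hand, the parsing half of that pipeline (pulling SRC= out of the raw iptables line before the GeoIP lookup) can be sketched in Python; the helper name is mine:

```python
import re
from typing import Optional

# Sketch of the lognormalizer step: extract the source IP (SRC=...) from an
# iptables log line so it can be fed to a GeoIP lookup.
SRC_RE = re.compile(r"\bSRC=(?P<src>\d{1,3}(?:\.\d{1,3}){3})\b")

def extract_src(line: str) -> Optional[str]:
    match = SRC_RE.search(line)
    return match.group("src") if match else None
```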
job_name: iptables
static_configs:
  targets:
match:
  selector: '{job="iptables"}'
  stages:
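The truncated snippet above looks like the promtail scrape config; a fuller sketch of what it might contain (the target, labels, path, and stage details are all assumptions, based on the enriched log format above) is:

```yaml
scrape_configs:
  - job_name: iptables
    static_configs:
      - targets: [localhost]
        labels:
          job: iptables
          __path__: /var/log/ulogd_iptables_geoloc.log  # rsyslog output file from above
    pipeline_stages:
      - match:
          selector: '{job="iptables"}'
          stages:
            # "Nov 3 15:41:22-80.82.65.247-NL" => date, IP, country code
            - regex:
                expression: '^(?P<date>.+)-(?P<ip>[0-9.]+)-(?P<country>[A-Z]{2})$'
            - labels:
                country:
```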
In Grafana I now receive this kind of data:
All that is left is to make a new panel of type Worldmap (I am actually using the Panodata Map Panel, but the former works well too). I was not able to make the newer Geomap panel work; don't ask me why, I just haven't managed it yet. The configuration of the panel looks like this:
And now I see, in real time, the people outside my country who get denied access to my home network :-) (The 3 biggest circles are tests I did with a VPN, but you can see I get visits from other countries too, and that's just the data of the last 4 hours.)
That's quite a lot of steps to achieve something that could be implemented directly in Loki. It was not so easy to make all of this work together, as documentation is often scarce, but at least now it works :-) I haven't talked about how I filter in iptables; that was another adventure entirely :-p
Thanks for the detailed explanation.
This is enriching before putting the data into the blockstore.
I think I will have to do something similar with Logstash, but I would be interested in moving the enrichment to Loki so it's only done for the log lines we display and not all the log lines.
Of course I'm not sure that's possible. ;-)
This functionality would be very useful; if implemented in Loki, it would be independent from the agent. In the meantime, another solution could be to use https://vector.dev/
I've moved from promtail to vector and have this working pretty well. Vector is highly configurable.
This issue is still relevant
ping
Would you like to try this? Another Loki client: https://github.com/tsaikd/gogstash
ping
Hi, I've got a temporary solution and it works well.
Is your feature request related to a problem? Please describe.
No
Describe the solution you'd like
It would be great if promtail/loki had a GeoIP feature like Logstash. E.g. a regex identifies IP addresses in the log message and performs a GeoIP look-up to add additional fields storing the location. This could then be used by the Grafana Worldmap plugin, though that plugin may also need updating.
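As a rough illustration of the requested behaviour, sketched in Python. Everything here is hypothetical: the regex, the in-memory table standing in for a real GeoIP database, and the field naming:

```python
import re

# Find IP addresses in a log message and attach location fields from a
# lookup. A real implementation would query a GeoIP database such as
# MaxMind's; the dict below is a stand-in with made-up sample data.
IP_RE = re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b")
GEO_DB = {"203.0.113.7": {"country": "NL"}}  # hypothetical sample entry

def geoip_fields(message: str) -> dict:
    """Return extra fields for every IP in the message that the DB knows."""
    fields = {}
    for ip in IP_RE.findall(message):
        info = GEO_DB.get(ip)
        if info:
            fields[f"geoip_{ip}_country"] = info["country"]
    return fields
```

The returned fields could then be attached to the log entry as labels or structured metadata for the map panel to aggregate on.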
Describe alternatives you've considered
ELK already has this feature: Logstash GeoIP filter + Kibana world map.