grafana / loki

Like Prometheus, but for logs.
https://grafana.com/loki
GNU Affero General Public License v3.0

Add support for Geo IP #2120

Closed neilmunday closed 1 year ago

neilmunday commented 4 years ago

Is your feature request related to a problem? Please describe.

No

Describe the solution you'd like

It would be great if Promtail/Loki had a GeoIP feature like Logstash's. For example, a regex identifies IP addresses in the log message and a GeoIP lookup adds fields that store the location. These fields could then be used by the Grafana Worldmap plugin, though that plugin may also need updating.

Describe alternatives you've considered

ELK. It already has this feature -> LogStash Geo IP filter + Kibana world map.

adityacs commented 4 years ago

@cyriltovena @slim-bean This is a good feature to support. However, it requires us to package the GeoLite2 database file, or a similar one, alongside Promtail. We should also figure out whether the license allows us to do this.

WDYT?

WarraxUA commented 4 years ago

nginx with the ngx_http_geoip_module module can write a geodata tag to the access log. All that is needed is a new "Worldmap Panel Plugin" for Grafana that supports Loki as a datasource.

nginx log WITH GEODATA TAG -> Promtail -> Loki -> Grafana

P.S. I'm just surprised that Grafana Labs didn't think of such a simple thing, even at the time Loki was announced.

wardbekker commented 4 years ago

Hi @WarraxUA and folks. I've been using a preview branch of the upcoming metrics and field extraction feature. This allowed me to build the dashboard below, with metrics on high-cardinality fields. For the Worldmap I've added the GeoIP module to Nginx and added the country name to the log output. With the following expression I was able to sum by country as input for the Worldmap panel (the syntax is subject to change, and it's a bit double-escaped):

sum by (country_code) (count_over_time({filename=\"/var/log/nginx/access.log\"} | regexp \"HTTP\\\\/1\\\\.1\\\" (?P<statuscode>\\\\d{3}) (?P<bytessent>\\\\d+) (?P<refferer>\\\".*?\\\") \\\"(?P<useragent>.*)\\\" \\\"(?P<country_code>.*)\\\"\"[$__interval]))
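For readers who want to see what the expression in this comment extracts, here is a minimal Python sketch. It assumes the double-escaped regexp unescapes to the pattern shown in the code, and it uses made-up sample log lines, because the actual nginx `log_format` is not shown in the thread. It only mirrors the "extract `country_code`, then count per country" idea; it does not model LogQL's `[$__interval]` time window.

```python
import re
from collections import Counter

# Assumed single-escaped form of the regexp from the query above
# (capture group names kept verbatim, including "refferer").
ACCESS_RE = re.compile(
    r'HTTP\/1\.1" (?P<statuscode>\d{3}) (?P<bytessent>\d+) '
    r'(?P<refferer>".*?") "(?P<useragent>.*)" "(?P<country_code>.*)"'
)


def count_by_country(lines):
    """Count log lines per country_code, skipping lines that do not match."""
    counts = Counter()
    for line in lines:
        match = ACCESS_RE.search(line)
        if match:
            counts[match.group("country_code")] += 1
    return counts


# Hypothetical sample lines: the country code is the last quoted field.
SAMPLE_LINES = [
    '203.0.113.7 - - [01/Jun/2020:10:00:00 +0000] "GET / HTTP/1.1" 200 612 "-" "curl/7.68.0" "NL"',
    '198.51.100.9 - - [01/Jun/2020:10:00:01 +0000] "GET /a HTTP/1.1" 404 153 "-" "Mozilla/5.0" "US"',
    '203.0.113.8 - - [01/Jun/2020:10:00:02 +0000] "GET /b HTTP/1.1" 200 99 "-" "Mozilla/5.0" "NL"',
]

if __name__ == "__main__":
    print(count_by_country(SAMPLE_LINES))
```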

[Image: web_analytics_dashboard_4]

neilmunday commented 4 years ago

Looking good!

Wnthr commented 4 years ago

That looks better than good to me. Is the preview branch you mention publicly available? If not, is there a time frame in which it will be? I am currently in the prototyping stage of my project, so running unreleased code isn't a concern.

cyriltovena commented 4 years ago

Here is the repo https://github.com/cyriltovena/demo/blob/master/logql/docker-compose.yaml#L8

There's a small readme, but I also gave a talk at GrafanaCon about this: https://grafana.com/go/grafanaconline/loki-future/ (see the end; when you hear my weird and funny French accent, you've found it 😂).

An ETA is hard to give. We're still trying to make sure the syntax is easy to use and learn, as we will live with it forever.

So soon TM.

stale[bot] commented 4 years ago

This issue has been automatically marked as stale because it has not had any activity in the past 30 days. It will be closed in 7 days if no further activity occurs. Thank you for your contributions.

dfoxg commented 4 years ago

Are there any updates? It's an important feature.

afletch commented 4 years ago

Being able to enrich data either upon collection in promtail (via a plugin?) or when that data lands in Loki, is really very important.

adeleglise commented 4 years ago

I would love seeing this too.

neilmunday commented 4 years ago

Any news?

jonkristian commented 4 years ago

Here's just a thought. Wouldn't it be preferable to enrich the data after the logs are collected? Then one would not have to add extra overhead on the web server. To be honest, I really don't know how much overhead GeoIP would add, but if you have many sites it could have an impact.

neilmunday commented 3 years ago

A comment to keep this issue open.

WarraxUA commented 3 years ago

example dashboard from wardbekker1 with Geo_IP https://grafana.com/grafana/dashboards/12559

afletch commented 3 years ago

The dashboards shared in this thread are very nice, but they don't address the issue highlighted by the OP, which is that there is currently no way to enrich data either within Promtail or at the point of ingestion into Loki. GeoIP is a good example of this, but it would apply to any enrichment of collected log data using external lookups. So, if you have a GeoIP field in the source log data, extracting it (and displaying it) is easy enough. If you don't have GeoIP in the source, then adding this label data is not possible in the flow at present.

(This is possible using fluentd as a client, but then you're stepping outside the stack.)
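To make the requested "enrich at collection time" step concrete, here is a rough Python sketch of what a client would have to do today before shipping lines to Loki. The lookup table `GEO_TABLE`, the `geoip_country=` suffix, and the function names are all hypothetical stand-ins (a real setup would query an actual GeoIP database). The payload follows the JSON shape accepted by Loki's `/loki/api/v1/push` endpoint. Note that this sketch appends the country to the log line rather than adding a label.

```python
import ipaddress
import json
import re
import time

# Hypothetical stand-in for a GeoIP database: network -> country code.
GEO_TABLE = {
    ipaddress.ip_network("203.0.113.0/24"): "NL",
    ipaddress.ip_network("198.51.100.0/24"): "US",
}

# Loose IPv4 matcher; validity is checked by ipaddress below.
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def lookup_country(ip_text):
    """Return the country code for an IP, or None if unknown or invalid."""
    try:
        address = ipaddress.ip_address(ip_text)
    except ValueError:
        return None
    for network, country in GEO_TABLE.items():
        if address in network:
            return country
    return None


def enrich_line(line):
    """Append geoip_country=<code> for the first IPv4 address found in the line."""
    match = IPV4_RE.search(line)
    country = lookup_country(match.group(0)) if match else None
    return f"{line} geoip_country={country}" if country else line


def build_push_payload(lines, labels):
    """Build a JSON body in the shape accepted by Loki's push endpoint."""
    now_ns = str(time.time_ns())  # Loki expects nanosecond timestamps as strings
    values = [[now_ns, enrich_line(line)] for line in lines]
    return json.dumps({"streams": [{"stream": labels, "values": values}]})
```

The resulting string could be POSTed to a Loki instance with any HTTP client; lines without a known IP pass through unchanged.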

Hope this helps clarify what is being requested here, as things seem to have got muddied over time.

neilmunday commented 3 years ago

A comment to keep this issue open.

LuciferInLove commented 3 years ago

Just a comment to keep this issue open.

proffalken commented 3 years ago

I'd also be interested in this.

My particular use-case is taking logs from network devices into Loki via promtail and then mapping them in Grafana.

I'm basically trying to setup a similar solution to the Elastic.co SIEM that comes with Kibana, but using Grafana/Prometheus/Loki

steverweber commented 3 years ago

It would be nice if the Promtail pipeline could exec/shell out to enrich. That would allow something like using the Debian package geoip-bin with some pretend config like:

match: ... # regex to grab out an IP in an apache or nginx log
exec_cache_ttl: 10s # option to cache the exec call.
exec: /bin/bash -c "geoiplookup {regex_group_match_ip}"
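The pretend config above is not a real Promtail feature. As a rough illustration of the idea (shell out to `geoiplookup` and cache the result for a TTL), here is a Python sketch. The class `CachedExec` and its parameters are hypothetical names chosen for this example. It passes the IP as a separate argument instead of going through `bash -c`, which avoids shell injection from log content.

```python
import subprocess
import time


def run_geoiplookup(ip):
    """Shell out to the geoiplookup CLI (from geoip-bin) and return its stdout."""
    result = subprocess.run(
        ["geoiplookup", ip], capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


class CachedExec:
    """Cache results of an external lookup for ttl_seconds, keyed by argument."""

    def __init__(self, runner, ttl_seconds=10.0, clock=time.monotonic):
        self._runner = runner
        self._ttl = ttl_seconds
        self._clock = clock
        self._cache = {}  # argument -> (expiry_time, result)

    def lookup(self, arg):
        now = self._clock()
        cached = self._cache.get(arg)
        if cached and cached[0] > now:
            return cached[1]  # still fresh, skip the exec call
        result = self._runner(arg)
        self._cache[arg] = (now + self._ttl, result)
        return result
```

With geoip-bin installed, `CachedExec(run_geoiplookup, ttl_seconds=10).lookup("80.82.65.247")` would run the command once and serve repeats from the cache for ten seconds.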

proffalken commented 3 years ago

FWIW, I abandoned this idea and ended up using a combination of FluentD and Fluent-bit to do the GeoIP lookups and get the data into Loki.

That works really well, so I'm seriously considering dropping promtail completely from my infrastructure as I'm struggling to find the problem it solves right now... :(

[Image: 2021-03-16_07-33]

Wnthr commented 3 years ago

Very nice! Do you enrich all the log lines before adding them to loki, then? I was hoping to not have to add the info to all lines, but rather have the equivalent of an SQL outer join after the lookup was done, but so far, that doesn't seem possible.

After your comment, I have been looking into fluentd as well, which seems to be a much less constricted way of massaging logs, and with the loki plugin, it seems that it will solve my promtail woes on a more fundamental level, so it seems my infrastructure will move away from promtail as well. Thank you for the (unintended) suggestion!

proffalken commented 3 years ago

Very nice! Do you enrich all the log lines before adding them to loki, then?

Thanks, and yes, that's exactly what I do.

I'm having some teething issues around doing this purely in fluent-bit, so for now I'm feeding the logs to FluentD and then using the geoip filter to append the data to each line.

Ognian commented 3 years ago

I'm interested in this too! Promtail should implement something like what was proposed above.

kittydoor commented 3 years ago

This issue is still very relevant.

davidspek commented 3 years ago

This would be useful to enrich the logs from Istio which doesn't have built-in GeoIP support like NGINX.

Jacq commented 3 years ago

Including this functionality in Loki or Promtail would be very useful. Currently I am streaming the messages through Node-RED to enrich them with GeoIP data.

stale[bot] commented 3 years ago

Hi! This issue has been automatically marked as stale because it has not had any activity in the past 30 days.

We use a stalebot among other tools to help manage the state of issues in this project. A stalebot can be very useful in closing issues in a number of cases; the most common is closing issues or PRs where the original reporter has not responded.

Stalebots are also emotionless and cruel and can close issues which are still very relevant.

If this issue is important to you, please add a comment to reopen it. More importantly, please add a thumbs-up to the original issue entry.

We regularly sort for closed issues which have a stale label sorted by thumbs up.

We may also:

We are doing our best to respond, organize, and prioritize all issues but it can be a challenging task, our sincere apologies if you find yourself at the mercy of the stalebot.

dfoxg commented 3 years ago

Ping

vsplunk commented 3 years ago

@proffalken Have you used the fluentd and fluent-bit Helm charts to achieve this? Would you be able to share your Helm chart for sending logs from fluent-bit >> fluentd to Loki?

proffalken commented 3 years ago

@vsplunk I've just published how I'm doing it. I'm using Nomad instead of k8s, but the Nomad manifest is at https://github.com/BudgetSmartHome/home-lab-configs/blob/main/nomad-manifests/monitoring_stack.nomad if that helps you?

M-JobPixel commented 3 years ago

I'm looking to do this and will probably use logstash to do the enrichment.

But I'd be much happier to be able to enrich the data in Loki at lookup time so as to only enrich the data which people are looking at rather than enrich all the data on the way into the database^W block store.

I think this is what https://github.com/grafana/loki/issues/2120#issuecomment-728099659 was driving at too.

pmorange commented 2 years ago

Hi, I managed to do it using rsyslog and lognormalizer; if anyone is interested, just ask. I didn't want to install a big log enrichment solution such as Logstash or fluentd, so I went the lightweight route. It was not without its hiccups, but now it's working.

M-JobPixel commented 2 years ago

I'm interested in how you did this @pmorange. Was this enrichment en route to the block store or enrichment of the data before it was displayed in grafana?

pmorange commented 2 years ago

Hi, what I wanted to achieve was to display on a map in Grafana the location of the IPs I had blocked in my geo iptables rules (those IPs outside my country). All of this was new to me, of course :-)

Here are the steps :

I have lines like these in the iptables log file :

Nov  3 15:41:22 alpine GeoIP(OUTSIDE OF ZONE):  IN=eth0 OUT=br-d323db5ba3a3 MAC=00:0c:29:d4:68:ca:74:da:88:20:30:0a:08:00 SRC=80.82.65.247 DST=172.21.0.2 LEN=44 TOS=00 PREC=0x00 TTL=247 ID=54166 PROTO=TCP SPT=58694 DPT=443 SEQ=1345397855 ACK=0 WINDOW=1024 SYN URGP=0 MARK=0
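The commenter did this with rsyslog and lognormalizer, and those steps are not shown here. Purely as an illustration of the first step, pulling the source IP out of a line like the one above, here is a Python sketch. The function name `parse_iptables_fields` is hypothetical, and this is a swapped-in technique, not the commenter's setup.

```python
import re

# The iptables log line quoted above.
SAMPLE = (
    "Nov  3 15:41:22 alpine GeoIP(OUTSIDE OF ZONE):  IN=eth0 OUT=br-d323db5ba3a3 "
    "MAC=00:0c:29:d4:68:ca:74:da:88:20:30:0a:08:00 SRC=80.82.65.247 DST=172.21.0.2 "
    "LEN=44 TOS=00 PREC=0x00 TTL=247 ID=54166 PROTO=TCP SPT=58694 DPT=443 "
    "SEQ=1345397855 ACK=0 WINDOW=1024 SYN URGP=0 MARK=0"
)

# Upper-case key, "=", then everything up to the next whitespace.
KEY_VALUE_RE = re.compile(r"\b([A-Z]+)=(\S*)")


def parse_iptables_fields(line):
    """Return the KEY=value pairs of an iptables log line as a dict."""
    return dict(KEY_VALUE_RE.findall(line))


if __name__ == "__main__":
    fields = parse_iptables_fields(SAMPLE)
    # SRC is the address a GeoIP lookup would be run against.
    print(fields["SRC"], fields["PROTO"], fields["DPT"])
```

Flags without a value, such as `SYN`, are skipped by this pattern; a fuller parser would need to handle them separately.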

And now I see this result in real time: people outside my country who get denied access to my home network :-) (The 3 biggest circles are tests I did with a VPN, but you can see I get visits from other countries too, and that's just the data from the last 4 hours.)

[Image]

That's quite a lot of steps to achieve something that could be implemented directly in Loki. It was not so easy to make all of this work together, as documentation is often scarce, but at least now it works :-) I have not talked about how I filter in iptables; that was another adventure entirely :-p

M-JobPixel commented 2 years ago

Thanks for the detailed explanation.

This is enriching before putting the data into the blockstore.

I think I will have to do something similar with logstash, but I would be interested in moving the enrichment to Loki so it's only done for the loglines we display and not all the loglines.

Of course I'm not sure that's possible. ;-)

lfasci commented 2 years ago

This functionality would be very useful; if implemented in Loki, it would be independent of the agent. In the meantime, another solution could be to use https://vector.dev/

onedr0p commented 2 years ago

I've moved from promtail to vector and have this working pretty well. Vector is highly configurable.

kittydoor commented 2 years ago

This issue is still relevant

mpadinhabrandao commented 2 years ago

ping

yuangu commented 2 years ago

Would you like to try this? It's another Loki client: https://github.com/tsaikd/gogstash

lfasci commented 2 years ago

ping

Nihiue commented 2 years ago

Hi, I have a temporary solution and it works well:

https://github.com/Nihiue/loki-enhance-middleware