elastic / logstash

Logstash - transport and process your logs, events, or other data
https://www.elastic.co/products/logstash
Other
67 stars 3.5k forks source link

PluginListing: logstash-filter-ipinfo-db #15089

Open awaismslm opened 1 year ago

awaismslm commented 1 year ago

The ipinfo filter adds information about the geographical location and autonomous system number (ASN) of IP addresses, based on data from the ipinfo.io free country and ASN database.

Link to the plugin repository.

Link to the docs/index.asciidoc.

The code is consistent with the naming and quality of the code. Documentation is done carefully and reviewed. Code reviews are done by the company's senior Engineers. The plugin contains tests and tests are passing.

awaismslm commented 1 year ago

@andsel Can you please look into it?

andsel commented 1 year ago

Hi @awaismslm as I can understand this plugin maps an IP to some geographical informations. It downloads a file which is later loaded by MaxMind DB, the file is downloaded with an hardcoded token.

Please could you provide insights how this plugin differs from the already existing geoip filter ?

awaismslm commented 1 year ago

Hey @andsel, thanks for getting back. The IPinfo filter plugin differs from the GeoIP filter in two areas - precision and auto-download.

In the GeoIP filter, the user has to manually signup, download and install the database file in the correct directory for Logstash to pick it up. This is a pretty bad Dev UX and the IPinfo filter automatically downloads this in the background without any signups or installation required.

Secondly, IPinfo's precision on its free dataset is higher than the one provided by the GeoIP dataset which is purposely quite imprecise (to upsell).

This filter is going to encourage more usage of IP geolocation data with Logstash.

andsel commented 1 year ago

Hi @awaismslm , I foresee some problems or inconsistencies with this approach.

In the GeoIP filter, the user has to manually signup, download and install the database file in the correct directory for Logstash to pick it up.

That's true, but it's required by the commercial license from MaxMind. If you use a commercial DB, because it's licensed to you, you are responsible for it.

This is a pretty bad Dev UX and the IPinfo filter automatically downloads this in the background

I don't know precisely what you mean with "automatically downloads this in the background", from the code it downloads during the register phase of the plugin the first time it doesn't find the DB file and never update it again. This happens in a blocking fashion during the registration, there isn't any background task that does it. https://github.com/ipinfo/logstash-filter-ipinfo-db/blob/375e653e22a6defa3e3c3c2cdbd66d40c8699996/lib/logstash/filters/ipinfo-db.rb#L64-L67

without any signups or installation required

The database is downloaded from the an url which contains a token https://github.com/ipinfo/logstash-filter-ipinfo-db/blob/375e653e22a6defa3e3c3c2cdbd66d40c8699996/lib/logstash/filters/ipinfo-db.rb#L6

So it could be that token is revoked in future, failing to start a pipeline and requiring to update the plugin and release a new version, plus requires that every place where it's used is also manually updated, and that requires a Logstash restart.

Is there any URL to the license that IPInfo exposes to use their DB? I think this could be used a MaxMind proprietary DB and manually update it in geoip filter, in worst case it offers the same UX, because with this plugin after the first download, to have an update the db file has to be manually removed and the pipeline restarted.

awaismslm commented 1 year ago

Hey @andsel! Thanks for getting back.

That's true, but it's required by the commercial license from MaxMind. If you use a commercial DB, because it's licensed to you, you are responsible for it.

Right - with the IPinfo filter approach, this isn't required from our side; the step goes away, enabling it to be more easily available to all people. We haven't added the possibility of integrating paid-for databases into the filter (ones for which the user would be forced to follow a similar experience as above).

I don't know precisely what you mean with "automatically downloads this in the background", from the code it downloads during the register phase of the plugin the first time it doesn't find the DB file and never update it again. This happens in a blocking fashion during the registration, there isn't any background task that does it. https://github.com/ipinfo/logstash-filter-ipinfo-db/blob/375e653e22a6defa3e3c3c2cdbd66d40c8699996/lib/logstash/filters/ipinfo-db.rb#L64-L67

You're right - we do it once and then not again. We'll improve this to both be non-blocking and download on a daily basis (which is the best refresh frequency offered for the underlying DB). Will this point be considered good-to-go once we do that?

The database is downloaded from the an url which contains a token https://github.com/ipinfo/logstash-filter-ipinfo-db/blob/375e653e22a6defa3e3c3c2cdbd66d40c8699996/lib/logstash/filters/ipinfo-db.rb#L6

So it could be that token is revoked in future, failing to start a pipeline and requiring to update the plugin and release a new version, plus requires that every place where it's used is also manually updated, and that requires a Logstash restart.

IPinfo explicitly dedicated that token for public downloads by the Logstash plugin, and revoking it isn't in the books without obviously upsetting our users of this filter - a big no-no for us.

Is there any URL to the license that IPInfo exposes to use their DB?

The DB used for this filter is released under the Creative Commons Attribution-ShareAlike 4.0 International License (https://creativecommons.org/licenses/by-sa/4.0/). Ref https://ipinfo.io/account/data-downloads which also identifies this after signing up on any free account.