Closed: wasserman closed this 4 months ago
Pinging @elastic/es-core-features (Team:Core/Features)
I'd like to add that it would be better to support, at minimum, the official MaxMind GeoIP2 database types:
I think the ingest processor's loading code can get a bit simpler by leveraging the DatabaseReader.getDatabaseType() method here, which returns an int as an OR'd enum. This way the available fields are dictated by the embedded metadata and not by an arbitrary filename.
Supporting the Enterprise database and the ISP database essentially provides a superset of all standard database fields. It's not clear to me how the Java bindings allow for accessing custom attributes, but that'd be a "nice to have" as well.
Enhancing this ingest processor this way could add immense value for corporate users who would like to enrich data with internal IP geolocation information and possibly subnet names. For my use case, I am attempting to use the ingest-geoip processor to enrich known-bad malware C2 endpoints. Since I'm limited to either a City or an ASN database, I have to use two distinct databases. Using the approach suggested above with getDatabaseType(), I think it should be possible to load the City (or Enterprise) fields, and then also load the ASN fields, by simply looping over all supported interfaces of the declared database type.
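To make the idea above concrete, here is a minimal sketch of choosing lookups from the database type recorded in the file's metadata rather than from the filename. It does not use the MaxMind or Elasticsearch classes: `DatabaseTypeDispatch`, `Lookup`, and `lookupsFor` are hypothetical names, the type string is assumed to be passed in by the caller, and the suffix-to-lookup mapping is my assumption about which record families each database type can answer.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical sketch: none of these names come from Elasticsearch or the MaxMind bindings.
final class DatabaseTypeDispatch {

    // Families of lookups a database might be able to answer.
    enum Lookup { CITY, COUNTRY, ASN, ISP, ANONYMOUS_IP }

    // Pick the lookups to attempt from the database type string found in the
    // file's metadata (e.g. "GeoLite2-City"). The mapping is an assumption:
    // Enterprise and ISP are treated as supersets that also carry ASN fields.
    static Set<Lookup> lookupsFor(String databaseType) {
        if (databaseType == null) {
            return EnumSet.noneOf(Lookup.class);
        }
        if (databaseType.endsWith("Enterprise")) {
            return EnumSet.of(Lookup.CITY, Lookup.COUNTRY, Lookup.ISP, Lookup.ASN);
        }
        if (databaseType.endsWith("ISP")) {
            return EnumSet.of(Lookup.ISP, Lookup.ASN);
        }
        if (databaseType.endsWith("ASN")) {
            return EnumSet.of(Lookup.ASN);
        }
        if (databaseType.endsWith("City")) {
            return EnumSet.of(Lookup.CITY, Lookup.COUNTRY);
        }
        if (databaseType.endsWith("Country")) {
            return EnumSet.of(Lookup.COUNTRY);
        }
        if (databaseType.endsWith("Anonymous-IP")) {
            return EnumSet.of(Lookup.ANONYMOUS_IP);
        }
        // Unknown or custom type: attempt nothing instead of guessing.
        return EnumSet.noneOf(Lookup.class);
    }

    public static void main(String[] args) {
        String[] types = { "GeoLite2-City", "GeoLite2-ASN", "GeoIP2-ISP", "GeoIP2-Enterprise" };
        for (String type : types) {
            System.out.println(type + " -> " + lookupsFor(type));
        }
    }
}
```

A processor built this way would loop over the returned set and merge the fields from each lookup, so a single Enterprise file could supply both the City and the ASN fields.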
This would be a great enhancement. We will need to reach out to MaxMind to see if they offer sample/test databases we could use for testing.
related: https://github.com/elastic/elasticsearch/issues/80748
+1 to support more commercial MaxMind databases in geoip processor
Please support this; it would help trace the originating ISP of requests coming from nginx.
Hi @dcode @athanatos64 @truong-hua We are working on adding support for the GeoIP2 Enterprise Database and GeoIP2-Anonymous IP Database to Elasticsearch ingest pipelines.
These files contain different/additional fields compared with the free GeoLite2 files we currently support. The properties parameter in a geoip processor can be used to specify which fields to return, in case you want a larger, smaller, or different subset than the default. We're trying to decide which fields to return to the target_field by default. For the Anonymous IP file it's a relatively short list, so we plan to return most of them by default. The Enterprise file has quite a few fields, so we're seeking community feedback for that one.
Can you please reply with the fields you would typically want by default? The available fields are:
GeoIP2 Enterprise Database: "city.name", "continent.name", "country.isoCode", "country.name", "location.latitude", "location.longitude", "location.timeZone", "mostSpecificSubdivision.isoCode", "mostSpecificSubdivision.name", "traits.anonymous", "traits.anonymousVpn", "traits.autonomousSystemNumber", "traits.autonomousSystemOrganization", "traits.hostingProvider", "traits.network", "traits.publicProxy", "traits.residentialProxy", "traits.torExitNode", "city.confidence", "city.geoNameId", "city.names", "continent.code", "continent.geoNameId", "continent.names", "country.confidence", "country.geoNameId", "country.inEuropeanUnion", "country.names", "leastSpecificSubdivision.confidence", "leastSpecificSubdivision.geoNameId", "leastSpecificSubdivision.isoCode", "leastSpecificSubdivision.name", "leastSpecificSubdivision.names", "location.accuracyRadius", "location.averageIncome", "location.metroCode", "location.populationDensity", "maxMind", "mostSpecificSubdivision.confidence", "mostSpecificSubdivision.geoNameId", "mostSpecificSubdivision.names", "postal.code", "postal.confidence", "registeredCountry.confidence", "registeredCountry.geoNameId", "registeredCountry.inEuropeanUnion", "registeredCountry.isoCode", "registeredCountry.name", "registeredCountry.names", "representedCountry.confidence", "representedCountry.geoNameId", "representedCountry.inEuropeanUnion", "representedCountry.isoCode", "representedCountry.name", "representedCountry.names", "representedCountry.type", "subdivisions.confidence", "subdivisions.geoNameId", "subdivisions.isoCode", "subdivisions.name", "subdivisions.names", "traits.anonymousProxy", "traits.anycast", "traits.connectionType", "traits.domain", "traits.ipAddress", "traits.isp", "traits.legitimateProxy", "traits.mobileCountryCode", "traits.mobileNetworkCode", "traits.organization", "traits.satelliteProvider", "traits.staticIpScore", "traits.userCount", "traits.userType"
cc @joegallo
The GeoIP processor supports database_file for an alternative database from MaxMind. It would be nice to be able to use the ISP database from https://www.maxmind.com/en/geoip2-isp-database.
I prepared a bundle per https://www.elastic.co/guide/en/cloud/current/ec-custom-bundles.html#ec-prepare-custom-bundles. I used a sample from https://github.com/maxmind/MaxMind-DB/blob/main/test-data/GeoIP2-ISP-Test.mmdb. A JSON representation of the file, for reference, is at https://github.com/maxmind/MaxMind-DB/blob/main/source-data/GeoIP2-ISP-Test.json
When I tried to use this database_file the error was:
The section of code that shows this limitation is here: https://github.com/elastic/elasticsearch/blob/425ed4cbc1f3f2bd2ca82091bc357f263687b149/modules/ingest-geoip/src/main/java/org/elasticsearch/ingest/geoip/GeoIpProcessor.java
I hope it is as easy as implementing retrieveISPGeoData and then whitelisting the ISP database filename. Thanks!
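For illustration only, a retrieveISPGeoData-style method might look roughly like the sketch below. It is self-contained and does not touch the real GeoIpProcessor or the MaxMind bindings: `IspRecord` is a hypothetical stand-in for whatever an ISP lookup returns, and the `Property` enum and output key names are assumptions, not the processor's actual ones.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: IspRecord, Property and the key names are illustrative only.
final class IspFieldSketch {

    // Fields a caller may request from an ISP lookup (assumed names).
    enum Property { IP, ASN, ASN_ORGANIZATION, ISP, ORGANIZATION }

    // Stand-in for the result of an ISP database lookup; any field may be null.
    record IspRecord(String ip, Long asn, String asnOrganization, String isp, String organization) {}

    // Copy only the requested, non-null fields into the map that would be
    // written to the processor's target field.
    static Map<String, Object> retrieveIspGeoData(IspRecord rec, Set<Property> properties) {
        Map<String, Object> out = new LinkedHashMap<>();
        for (Property property : properties) {
            Object value = switch (property) {
                case IP -> rec.ip();
                case ASN -> rec.asn();
                case ASN_ORGANIZATION -> rec.asnOrganization();
                case ISP -> rec.isp();
                case ORGANIZATION -> rec.organization();
            };
            if (value != null) {
                out.put(property.name().toLowerCase(), value);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Made-up values, using a documentation IP range and a private-use ASN.
        IspRecord rec = new IspRecord("192.0.2.1", 64512L, "Example AS Org", "Example ISP", null);
        System.out.println(retrieveIspGeoData(rec, Set.of(Property.ISP, Property.ASN)));
    }
}
```

The real change would also need the database-type or filename check to accept the ISP file, which this sketch leaves out.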