Open jpittiglio opened 1 year ago
Pinging @elastic/es-data-management (Team:Data Management)
While a restart is a workaround, a much lighter-weight one is making any change to the geoip processor in the pipeline -- for example, I added a `"description": "foo"` to the processor, re-PUT the pipeline, and then immediately removed the `"description"`. Regardless, though, I agree that this is a bug.
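Sketched in Dev Tools console syntax - the pipeline name `tester` and the processor's fields here are illustrative, not taken from the original report:

```
# Re-PUT the pipeline with a trivial change to force a reload
PUT _ingest/pipeline/tester
{
  "processors": [
    {
      "geoip": {
        "field": "source.ip",
        "target_field": "source.geo",
        "description": "foo"
      }
    }
  ]
}

# Then PUT it again without the "description" to restore the original definition
PUT _ingest/pipeline/tester
{
  "processors": [
    {
      "geoip": {
        "field": "source.ip",
        "target_field": "source.geo"
      }
    }
  ]
}
```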
edit: a note to self: ConfigDatabases#updateDatabase doesn't reload pipelines for which the database was not available, unlike DatabaseNodeService#updateDatabase, which does. It should learn to do that.
Elasticsearch Version
8.10.1
Installed Plugins
No response
Java Version
bundled
OS Version
Official Docker images
Problem Description
In an air-gapped environment, Elasticsearch provides multiple mechanisms for GeoIP database updates, one of which is 'manual' updates as described here: https://www.elastic.co/guide/en/elasticsearch/reference/8.10/geoip-processor.html#manually-update-geoip-databases
When performing the steps above, Elasticsearch recognizes that the database has been loaded, both according to the logs and the GeoIP stats API endpoint, but active data ingest does not properly enrich the data - it continues to tag new data with `_geoip_database_unavailable_GeoLite2-City.mmdb`.
After restarting Elasticsearch, this works as expected. However, the alternative mechanism (using a custom endpoint - https://www.elastic.co/guide/en/elasticsearch/reference/8.10/geoip-processor.html#use-custom-geoip-endpoint) properly loads and recognizes the database, and begins enriching new data immediately with no restart required.
I believe this is either a bug, or a documentation update is required to highlight the expected behavior and any additional steps required.
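For illustration, a hit from the affected index then looks something like this (values are hypothetical; the `source.geo` enrichment is absent and the failure tag is present):

```
{
  "_index": "tester",
  "_source": {
    "@timestamp": "2023-09-27T12:00:00Z",
    "source": { "ip": "8.8.8.8" },
    "tags": ["_geoip_database_unavailable_GeoLite2-City.mmdb"]
  }
}
```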
Steps to Reproduce
This was initially discovered in a multi-node 8.8.2 cluster running on bare metal, and confirmed in a lab environment running 8.10.1 in a single node Docker container. For simplicity, the docker-elk repository was used: https://github.com/deviantony/docker-elk
A handful of changes were made to the default compose file: ensuring proper mounting of config directories (so the GeoLite2 database could be loaded later), some port-mapping changes to eliminate conflicts with another local cluster, and one additional setting in `elasticsearch.yml`:

```
ingest.geoip.downloader.endpoint: http://kibana:8000/overview.json
```

This setting was later used to test the alternate 'custom endpoint' mechanism. Otherwise, this is effectively a generic/basic installation of Elasticsearch and Kibana.
A very basic Python script using the official Elasticsearch Python library was written to repeatedly ingest data, simulating the production environment as closely as possible - this step may or may not be required, but it helps validate results. Note this is a throwaway cluster with default settings from docker-elk, hence the hardcoded credentials.
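The script itself isn't reproduced above; a minimal sketch of such an ingest loop, assuming the docker-elk default credentials and the `tester` index/pipeline names used below, might look like:

```python
import time
from datetime import datetime, timezone


def make_doc(ip: str) -> dict:
    """Build a minimal document matching the `tester` template's mappings."""
    return {
        "@timestamp": datetime.now(timezone.utc).isoformat(),
        "source": {"ip": ip},
    }


def main() -> None:
    # elasticsearch-py 8.x client; docker-elk's default hardcoded
    # credentials are assumed (throwaway cluster only).
    from elasticsearch import Elasticsearch

    es = Elasticsearch("http://localhost:9200", basic_auth=("elastic", "changeme"))
    while True:
        # `tester` has index.default_pipeline: tester, so the geoip
        # processor runs on every indexed document; while the database is
        # unavailable, docs get the _geoip_database_unavailable_* tag.
        es.index(index="tester", document=make_doc("8.8.8.8"))
        time.sleep(1)
```

Call `main()` against a live cluster to start the ingest loop.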
1. Validate the configuration change in the logs
2. Create the `tester` ingest pipeline and index template - used Dev Tools for simplicity:

    ```
    PUT _template/tester
    {
      "index_patterns": ["tester"],
      "settings": {
        "index.default_pipeline": "tester"
      },
      "mappings": {
        "properties": {
          "@timestamp": { "type": "date" },
          "source": {
            "properties": {
              "ip": { "type": "keyword" },
              "geo": {
                "properties": {
                  "city_name": { "ignore_above": 1024, "type": "keyword" },
                  "continent_code": { "ignore_above": 1024, "type": "keyword" },
                  "continent_name": { "ignore_above": 1024, "type": "keyword" },
                  "country_iso_code": { "ignore_above": 1024, "type": "keyword" },
                  "country_name": { "ignore_above": 1024, "type": "keyword" },
                  "location": { "type": "geo_point" },
                  "name": { "ignore_above": 1024, "type": "keyword" },
                  "postal_code": { "ignore_above": 1024, "type": "keyword" },
                  "region_iso_code": { "ignore_above": 1024, "type": "keyword" },
                  "region_name": { "ignore_above": 1024, "type": "keyword" },
                  "timezone": { "ignore_above": 1024, "type": "keyword" }
                }
              }
            }
          },
          "tags": {
            "type": "text",
            "fields": {
              "keyword": { "type": "keyword", "ignore_above": 256 }
            }
          }
        }
      }
    }
    ```

3. Validate results are tagged (as expected) with the database-unavailable message
4. Follow the relevant steps to add the local GeoIP database to a local `ingest-geoip` directory - ref: https://www.elastic.co/guide/en/elasticsearch/reference/8.10/geoip-processor.html#manually-update-geoip-databases
5. Confirm the logs indicate the database file loaded correctly
6. Confirm the API shows the database is recognized
7. Search actively ingested documents: the GeoIP processor still fails due to the unavailable database (`GET tester/_search?sort=@timestamp:desc`). This is unexpected behavior.
8. Add a new, identical pipeline
9. Run a `_simulate` and validate the data is enriched properly by the GeoIP processor
10. Live ingest will similarly work in a new index with this new pipeline. Everything works as expected after restarting the cluster.
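The pipeline body itself isn't reproduced above; a minimal `tester` pipeline consistent with the template's `source.ip`/`source.geo` fields, plus a `_simulate` call, might look like this (a sketch, not necessarily the reporter's exact pipeline):

```
PUT _ingest/pipeline/tester
{
  "processors": [
    {
      "geoip": {
        "field": "source.ip",
        "target_field": "source.geo"
      }
    }
  ]
}

POST _ingest/pipeline/tester/_simulate
{
  "docs": [
    { "_source": { "source": { "ip": "8.8.8.8" } } }
  ]
}
```

Once the database is actually available to the processor, the simulated document should come back with `source.geo` populated instead of the failure tag.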
As noted, in a fresh cluster with the custom endpoint configured (ref: https://www.elastic.co/guide/en/elasticsearch/reference/8.10/geoip-processor.html#use-custom-geoip-endpoint), no restart of Elasticsearch is required. Rather than document all of those steps (though happy to if desired), the only notable difference observed was that the logs output slightly different messages versus the local-file mechanism described above:
Logs (if relevant)
Relevant logs were captured in the steps above; for simple comparison:
Logs using the 'local file' mechanism:
Logs using the 'custom endpoint' web server mechanism: