Open alex-spies opened 5 months ago
Pinging @elastic/es-data-management (Team:Data Management)
From what I can tell, we're blowing up while indexing the geoip data here:
[2024-03-25T04:38:17,628][ERROR][o.e.i.g.GeoIpDownloader ] [test-cluster-0] error downloading geoip database [MyCustomGeoLite2-City.mmdb] [.geoip_databases] org.elasticsearch.index.IndexNotFoundException: no such index [.geoip_databases]
at org.elasticsearch.server@8.12.3-SNAPSHOT/org.elasticsearch.cluster.metadata.IndexNameExpressionResolver.notFoundException(IndexNameExpressionResolver.java:473)
at org.elasticsearch.server@8.12.3-SNAPSHOT/org.elasticsearch.cluster.metadata.IndexNameExpressionResolver$ExplicitResourceNameFilter.ensureAliasOrIndexExists(IndexNameExpressionResolver.java:1603)
at org.elasticsearch.server@8.12.3-SNAPSHOT/org.elasticsearch.cluster.metadata.IndexNameExpressionResolver$ExplicitResourceNameFilter.filterUnavailable(IndexNameExpressionResolver.java:1583)
at org.elasticsearch.server@8.12.3-SNAPSHOT/org.elasticsearch.cluster.metadata.IndexNameExpressionResolver.resolveExpressions(IndexNameExpressionResolver.java:265)
at org.elasticsearch.server@8.12.3-SNAPSHOT/org.elasticsearch.cluster.metadata.IndexNameExpressionResolver.concreteIndices(IndexNameExpressionResolver.java:340)
at org.elasticsearch.server@8.12.3-SNAPSHOT/org.elasticsearch.cluster.metadata.IndexNameExpressionResolver.concreteIndexNames(IndexNameExpressionResolver.java:331)
at org.elasticsearch.server@8.12.3-SNAPSHOT/org.elasticsearch.cluster.metadata.IndexNameExpressionResolver.concreteIndexNames(IndexNameExpressionResolver.java:90)
at org.elasticsearch.server@8.12.3-SNAPSHOT/org.elasticsearch.action.support.replication.TransportBroadcastReplicationAction.shards(TransportBroadcastReplicationAction.java:183)
at org.elasticsearch.server@8.12.3-SNAPSHOT/org.elasticsearch.action.support.replication.TransportBroadcastReplicationAction$1.accept(TransportBroadcastReplicationAction.java:94)
at org.elasticsearch.server@8.12.3-SNAPSHOT/org.elasticsearch.action.support.replication.TransportBroadcastReplicationAction$1.accept(TransportBroadcastReplicationAction.java:83)
at org.elasticsearch.server@8.12.3-SNAPSHOT/org.elasticsearch.action.ActionRunnable$4.doRun(ActionRunnable.java:95)
at org.elasticsearch.server@8.12.3-SNAPSHOT/org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:983)
at org.elasticsearch.server@8.12.3-SNAPSHOT/org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
at java.base/java.lang.Thread.run(Thread.java:1583)
That's coming from the try/catch of GeoIpDownloader::processDatabase. From what I can tell, it looks like the exception happens either during a flush or refresh request in indexChunks. But immediately before we flush/refresh, we've done index requests into this index. So I have no idea how we'd get no such index [.geoip_databases].
Oh, I missed this in the log:
[2024-03-25T04:38:17,241][INFO ][o.e.c.m.MetadataDeleteIndexService] [test-cluster-0] [.geoip_databases/eSscKA11TjCN3mvQhDl9bw] deleting index
This is starting to look like the same geoip downloader race conditions we see a lot.
This looks like issue # 1 from https://github.com/elastic/elasticsearch/issues/92888.
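To make the suspected interleaving concrete, here is a minimal toy model of it. This is not Elasticsearch code: ToyCluster, GeoIpRaceSketch, and all of their methods are hypothetical names made up for illustration. The only assumptions are that index requests implicitly create the index, while a refresh resolves the index name explicitly and fails if it is missing. The sketch replays the steps sequentially rather than with real threads, so it is deterministic.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical driver that replays the suspected order of events.
class GeoIpRaceSketch {
    // Returns the error message from the refresh, or null if the refresh succeeded.
    static String replay() {
        ToyCluster cluster = new ToyCluster();
        String index = ".geoip_databases";
        cluster.index(index, new byte[] {1}); // downloader indexes a chunk
        cluster.index(index, new byte[] {2}); // ... and another one
        cluster.delete(index);                // something deletes the index in between
        try {
            cluster.refresh(index);           // downloader's flush/refresh step
            return null;
        } catch (IllegalStateException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(replay());
    }
}

// Toy stand-in for cluster state (assumption: an in-memory map is enough to show the ordering).
class ToyCluster {
    private final Map<String, List<byte[]>> indices = new HashMap<>();

    // Index requests create the target index on demand.
    void index(String name, byte[] chunk) {
        indices.computeIfAbsent(name, k -> new ArrayList<>()).add(chunk);
    }

    void delete(String name) {
        indices.remove(name);
    }

    // Refresh resolves the name explicitly and fails when the index is gone.
    void refresh(String name) {
        if (!indices.containsKey(name)) {
            throw new IllegalStateException("no such index [" + name + "]");
        }
    }
}
```

In this model the index requests succeed and the later refresh still fails, which matches the ordering in the log: the delete at 04:38:17,241 lands before the error at 04:38:17,628.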
A lot of test failures in the @Before setup method, specifically at
All of them fail because the data databases_count is smaller than the expected 4.
Might be related to https://github.com/elastic/elasticsearch/issues/101418 or https://github.com/elastic/elasticsearch/issues/95496: failed in the same setup method.
Test failures:
Build scan: https://gradle-enterprise.elastic.co/s/ctt7b3ramfg5y/tests/:modules:ingest-geoip:yamlRestTest/org.elasticsearch.ingest.geoip.IngestGeoIpClientYamlTestSuiteIT/test%20%7Byaml=ingest_geoip%2F10_basic%2Fingest-geoip%20installed%7D
Reproduction line:
Applicable branches: 8.12
Reproduces locally?: No
Failure history: Failure dashboard for org.elasticsearch.ingest.geoip.IngestGeoIpClientYamlTestSuiteIT#test {yaml=ingest_geoip/10_basic/ingest-geoip installed}
Failure excerpt: