Open ldematte opened 9 months ago
Pinging @elastic/es-data-management (Team:Data Management)
In the same test run https://gradle-enterprise.elastic.co/s/d2qw2pauzldhu there are several tests, all with similar errors. I avoided opening multiple issues, but feel free to split them.
A small detail here: it was interesting to me that the missing file "GeoLite2-ASN.mmdb" from the failed assert happened to be missing from the 'Found' messages below, which come from GeoIpCli when it is invoked by GeoIpHttpFixture:
Found GeoLite2-City.mmdb, will compress it to GeoLite2-City.tgz
Found GeoLite2-Country.mmdb, will compress it to GeoLite2-Country.tgz
Found MyCustomGeoLite2-City.mmdb, will compress it to MyCustomGeoLite2-City.tgz
Adding GeoLite2-ASN.tgz to overview.json
Adding GeoLite2-Country.tgz to overview.json
Adding MyCustomGeoLite2-City.tgz to overview.json
Adding GeoLite2-City.tgz to overview.json
overview.json created
However, notice in the following snippet that the file in question is copied as a .tgz
so it's missing from the output merely because it doesn't need to be compressed:
https://github.com/elastic/elasticsearch/blob/54088839b47e2bfffa31c4bee309136cd4b23e48/test/fixtures/geoip-fixture/src/main/java/fixture/geoip/GeoIpHttpFixture.java#L109-L115
Also it's worth noting that all the failures took ten seconds, so they all individually exhausted the assertBusy in the @Before-annotated waitForDatabases() method. So I don't think it's the case that this is just some race condition where if we'd only waited for 15 seconds then everything would have been fine.
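To make the timing argument concrete, here is a minimal sketch of a poll-until-deadline helper. It is a simplified stand-in and not the real assertBusy implementation; the class name `BusyWaitSketch`, the method `pollUntil`, and the backoff values are all hypothetical. The point it illustrates is that a condition which never becomes true always consumes the whole timeout, which is why a failure duration equal to the timeout suggests "never happened" rather than "happened slightly too late".

```java
import java.util.function.BooleanSupplier;

public class BusyWaitSketch {

    /**
     * Hypothetical helper: re-checks the condition with a growing sleep between
     * attempts until it holds or timeoutMillis has elapsed.
     * Returns true if the condition held before the deadline, false otherwise.
     */
    static boolean pollUntil(BooleanSupplier condition, long timeoutMillis) {
        long start = System.nanoTime();
        long sleepMillis = 1;
        while (true) {
            if (condition.getAsBoolean()) {
                return true; // returns as soon as the condition holds
            }
            long elapsed = (System.nanoTime() - start) / 1_000_000;
            if (elapsed >= timeoutMillis) {
                return false; // only reached after the full timeout has been used up
            }
            try {
                // Never sleep past the deadline.
                Thread.sleep(Math.min(sleepMillis, timeoutMillis - elapsed));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
            sleepMillis = Math.min(sleepMillis * 2, 100); // assumed backoff cap
        }
    }
}
```

With this shape, `pollUntil(() -> false, 10_000)` cannot return before ten seconds have passed, whereas a condition that becomes true late returns at whatever moment it first holds.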
diff --git a/modules/ingest-geoip/src/main/java/org/elasticsearch/ingest/geoip/GeoIpDownloader.java b/modules/ingest-geoip/src/main/java/org/elasticsearch/ingest/geoip/GeoIpDownloader.java
index 3e04f7bfea2..ed9e2dbf39d 100644
--- a/modules/ingest-geoip/src/main/java/org/elasticsearch/ingest/geoip/GeoIpDownloader.java
+++ b/modules/ingest-geoip/src/main/java/org/elasticsearch/ingest/geoip/GeoIpDownloader.java
@@ -12,6 +12,7 @@ import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;
import org.elasticsearch.ElasticsearchException;
import org.elasticsearch.action.ActionListener;
+import org.elasticsearch.action.DocWriteResponse;
import org.elasticsearch.action.admin.indices.flush.FlushRequest;
import org.elasticsearch.action.admin.indices.refresh.RefreshRequest;
import org.elasticsearch.action.index.IndexRequest;
@@ -225,10 +226,12 @@ public class GeoIpDownloader extends AllocatedPersistentTask {
MessageDigest md = MessageDigests.md5();
for (byte[] buf = getChunk(is); buf.length != 0; buf = getChunk(is)) {
md.update(buf);
+ if (name.equals("GeoLite2-ASN.mmdb") == false) {
IndexRequest indexRequest = new IndexRequest(DATABASES_INDEX).id(name + "_" + chunk + "_" + timestamp)
.create(true)
.source(XContentType.SMILE, "name", name, "chunk", chunk, "data", buf);
client.index(indexRequest).actionGet();
+ }
chunk++;
}
This addition ends up reproducing the precise shape of this failure. That is, we index the chunks (there happens to be just one for this file, because it is small) as a one-shot, and if for any reason the indexing doesn't succeed, we just shrug and move on. (Note: except that actionGet should throw, right? Hmmm...)
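The effect of the diff above can be modelled without a cluster. The following is only a sketch under assumptions: `ChunkStoreSketch` and its in-memory map are hypothetical stand-ins for the databases index, and only the document id format (`name + "_" + chunk + "_" + timestamp`) is taken from the diff. It shows that when the writes are skipped the chunk counter still advances, so the download looks complete while nothing can be read back.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ChunkStoreSketch {

    // Hypothetical in-memory stand-in for the databases index: document id -> chunk bytes.
    final Map<String, byte[]> docs = new HashMap<>();

    /**
     * Writes each chunk under an id of the form name_chunk_timestamp and returns
     * the number of chunks counted. When skipWrites is true the writes are
     * silently dropped, mirroring the experimental "if" added in the diff.
     */
    int store(String name, long timestamp, List<byte[]> chunks, boolean skipWrites) {
        int chunk = 0;
        for (byte[] buf : chunks) {
            if (skipWrites == false) {
                docs.put(name + "_" + chunk + "_" + timestamp, buf);
            }
            chunk++; // the counter advances whether or not the write happened
        }
        return chunk;
    }

    /** Returns true only if every chunk in [0, chunkCount) can be read back. */
    boolean isComplete(String name, long timestamp, int chunkCount) {
        for (int i = 0; i < chunkCount; i++) {
            if (docs.containsKey(name + "_" + i + "_" + timestamp) == false) {
                return false;
            }
        }
        return true;
    }
}
```

In this model, a database stored with `skipWrites` set to true reports one chunk but never passes `isComplete`, no matter how long a caller waits.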
[2023-10-26T23:13:56,484][INFO ][o.e.i.g.GeoIpDownloader ] [test-cluster-0] successfully downloaded geoip database [GeoLite2-City.mmdb]
1> [2023-10-26T23:13:56,569][INFO ][o.e.i.g.DatabaseNodeService] [test-cluster-0] successfully loaded geoip database file [GeoLite2-City.mmdb]
1> [2023-10-26T23:13:58,020][INFO ][o.e.c.m.MetadataDeleteIndexService] [test-cluster-0] [.geoip_databases/jyYgrbAPQiqD4vB8-JejcA] deleting index
1> [2023-10-26T23:13:58,387][INFO ][o.e.i.g.GeoIpDownloader ] [test-cluster-0] successfully downloaded geoip database [GeoLite2-ASN.mmdb]
1> [2023-10-26T23:13:58,389][WARN ][o.e.i.g.GeoIpDownloader ] [test-cluster-0] could not delete old chunks for geoip database [GeoLite2-ASN.mmdb] [.geoip_databases] org.elasticsearch.index.IndexNotFoundException: no such index [.geoip_databases]
1> at org.elasticsearch.server@8.10.5-SNAPSHOT/org.elasticsearch.cluster.metadata.IndexNameExpressionResolver.notFoundException(IndexNameExpressionResolver.java:460)
1> at org.elasticsearch.server@8.10.5-SNAPSHOT/org.elasticsearch.cluster.metadata.IndexNameExpressionResolver$ExplicitResourceNameFilter.ensureAliasOrIndexExists(IndexNameExpressionResolver.java:1587)
[...]
I wonder if we indexed the document and then deleted the index into which we had indexed the document...
Seems like it could be #92514 / #92888 related... as @masseyke well observed.
Seems related to https://github.com/elastic/elasticsearch/issues/100361 (assertion and discrepancy in sets looks remarkably similar)
Build scan: https://gradle-enterprise.elastic.co/s/d2qw2pauzldhu/tests/:modules:ingest-geoip:yamlRestTestV7CompatTest/org.elasticsearch.ingest.geoip.IngestGeoIpClientYamlTestSuiteIT/test%20%7Byaml=ingest_geoip%2F30_geoip_stats%2FTest%20geoip%20stats%7D
Reproduction line:
Applicable branches: 8.10
Reproduces locally?: No
Failure history: https://gradle-enterprise.elastic.co/scans/tests?tests.container=org.elasticsearch.ingest.geoip.IngestGeoIpClientYamlTestSuiteIT&tests.test=test%20%7Byaml%3Dingest_geoip/30_geoip_stats/Test%20geoip%20stats%7D
Failure excerpt: