joegallo opened this issue 1 year ago
Pinging @elastic/es-data-management (Team:Data Management)
Another tiny improvement we could make while we're in the area:
joegallo@galactic:~/Code/elastic/elasticsearch $ git diff upstream/main...enrich-cache-wip-2 -- x-pack/plugin/enrich/src/main/java/org/elasticsearch/xpack/enrich/AbstractEnrichProcessor.java
diff --git a/x-pack/plugin/enrich/src/main/java/org/elasticsearch/xpack/enrich/AbstractEnrichProcessor.java b/x-pack/plugin/enrich/src/main/java/org/elasticsearch/xpack/enrich/AbstractEnrichProcessor.java
index 114a26bbe83..c9a2ad5bc82 100644
--- a/x-pack/plugin/enrich/src/main/java/org/elasticsearch/xpack/enrich/AbstractEnrichProcessor.java
+++ b/x-pack/plugin/enrich/src/main/java/org/elasticsearch/xpack/enrich/AbstractEnrichProcessor.java
@@ -24,6 +24,7 @@ import java.util.function.BiConsumer;
public abstract class AbstractEnrichProcessor extends AbstractProcessor {
private final String policyName;
+ private final String enrichIndexBaseName;
private final BiConsumer<SearchRequest, BiConsumer<List<Map<?, ?>>, Exception>> searchRunner;
private final TemplateScript.Factory field;
private final TemplateScript.Factory targetField;
@@ -46,6 +47,7 @@ public abstract class AbstractEnrichProcessor extends AbstractProcessor {
) {
super(tag, description);
this.policyName = policyName;
+ this.enrichIndexBaseName = EnrichPolicy.getBaseName(policyName);
this.searchRunner = searchRunner;
this.field = field;
this.targetField = targetField;
@@ -77,7 +79,7 @@ public abstract class AbstractEnrichProcessor extends AbstractProcessor {
searchBuilder.fetchSource(true);
searchBuilder.query(constantScore);
SearchRequest req = new SearchRequest();
- req.indices(EnrichPolicy.getBaseName(getPolicyName()));
+ req.indices(enrichIndexBaseName);
req.preference(Preference.LOCAL.type());
req.source(searchBuilder);
Every little bit helps.
If I'm reading this right, each CacheKey (in my little example) is about 600 bytes. Which isn't huge, but it isn't tiny either. Could we do better?
And here are the cached values, each weighing ~1.3k and corresponding to our in-memory representation of something like the little doc snippet below:
"_source" : {
"zip" : 70116,
"last_name" : "Brown",
"state" : "LA",
"city" : "New Orleans",
"first_name" : "Mardy",
"email" : "user1@asciidocsmith.com"
}
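For reference, one way to get retained-size numbers like these is OpenJDK's JOL (jol-core on the classpath). The sketch below is purely illustrative -- it measures a map shaped like the doc above, not the actual cache entries:

import org.openjdk.jol.info.GraphLayout;

import java.util.Map;

public class CachedValueFootprint {
    public static void main(String[] args) {
        // Roughly the shape of the enrich _source above, held as an in-memory map.
        Map<String, Object> cachedValue = Map.of(
            "zip", 70116,
            "last_name", "Brown",
            "state", "LA",
            "city", "New Orleans",
            "first_name", "Mardy",
            "email", "user1@asciidocsmith.com"
        );
        GraphLayout layout = GraphLayout.parseInstance(cachedValue);
        System.out.println(layout.totalSize());   // deep (retained) size in bytes
        System.out.println(layout.toFootprint()); // per-class breakdown
    }
}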
edit: converting to Immutable collections rather than Unmodifiable wrappers around collections shaves 300 bytes off the results -- double edit though: that's not safe if the keys or values could be null, and I suppose at least the values could be...
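To make that trade-off concrete, here's a minimal, self-contained sketch (the class name and data are invented for illustration): an unmodifiable wrapper tolerates nulls but saves nothing, while Map.copyOf is more compact but throws on null keys or values.

import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

public class ImmutableVsUnmodifiable {
    public static void main(String[] args) {
        Map<String, Object> source = new HashMap<>();
        source.put("email", "user1@asciidocsmith.com");
        source.put("state", null); // a null value, which enrich results could plausibly contain

        // Unmodifiable wrapper: shares the backing HashMap, so no memory savings, but nulls are fine.
        Map<String, Object> wrapped = Collections.unmodifiableMap(source);
        System.out.println(wrapped.get("email"));

        // Immutable copy: a more compact representation, but it rejects null keys and values.
        try {
            Map<String, Object> copied = Map.copyOf(source);
            System.out.println(copied.size());
        } catch (NullPointerException e) {
            System.out.println("Map.copyOf rejects nulls: " + e);
        }
    }
}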
private void evictEntry(Entry<K, V> entry) {
    assert lruLock.isHeldByCurrentThread(); // lruLock held, amazing
    CacheSegment segment = getCacheSegment(entry.key); // lols, hashCode call inside here
    if (segment != null) {
        segment.remove(entry.key, entry.value, false);
    }
    delete(entry, RemovalNotification.RemovalReason.EVICTED);
}

void remove(K key, V value, boolean notify) {
    CompletableFuture<Entry<K, V>> future;
    boolean removed = false;
    try (ReleasableLock ignored = writeLock.acquire()) { // writeLock held, amazing
        future = map == null ? null : map.get(key); // lols, hashCode call inside here
        try {
            if (future != null) {
                if (future.isDone()) {
                    Entry<K, V> entry = future.get();
                    if (Objects.equals(value, entry.value)) {
                        removed = map.remove(key, future); // lols, hashCode call inside here
                        if (map.isEmpty()) {
                            map = null;
                        }
OMG. @jbaiera and I are talking about this, and he realized that we're calling hashCode three times in the evictEntry
path. For two of those calls we're holding the write lock, and for all three we're holding the lru lock.
He and I are going to walk into a lake now.
(edit: it's not that holding locks while computing hashCode is all that bad, it's more that it assumes that computing the hashCode
is nearly instantaneous, which in this case it is most certainly not.)
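One way to blunt those repeated calls (short of replacing the key class entirely) would be to memoize the expensive hashCode behind a wrapper, so the second and third calls on the eviction path are effectively free even while the locks are held. Purely a sketch -- nothing like HashMemoizingKey exists in the cache today:

import java.util.Objects;

final class HashMemoizingKey<T> {
    private final T delegate;
    private int hash; // 0 means "not computed yet" (same benign-race convention as java.lang.String)

    HashMemoizingKey(T delegate) {
        this.delegate = Objects.requireNonNull(delegate);
    }

    @Override
    public int hashCode() {
        int h = hash;
        if (h == 0) {
            h = delegate.hashCode(); // the one genuinely expensive call
            hash = h;
        }
        return h;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof HashMemoizingKey<?> other && delegate.equals(other.delegate);
    }
}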
From GET _nodes/stats for a simple example on my machine, that enrich processor is costing us 44 microseconds per document. But the dataset in question is actually super cache-able, and 44 microseconds per document is way too high a performance price to be paying for cache hits. So why's it so slow?
https://github.com/elastic/elasticsearch/blob/92ac03fcc5cdcf4ca611f34f96f21e048a378397/x-pack/plugin/enrich/src/main/java/org/elasticsearch/xpack/enrich/EnrichCache.java#L129-L151
We use the above class for the cache keys, for potentially avoiding execution of SearchRequests (no need to execute a SearchRequest if we already know what it will return). But it's not really a great class for that.

First, SearchRequest is mutable. We don't happen to mutate it, but still, it would be better if we were using a properly immutable class in the key.

Second, SearchRequest isn't some trivial class, there's a deep object tree there, and so hashCode and equals actually have quite a bit of work to do (see the definition of hashCode/equals for SearchRequest, then click through to source, which is a SearchSourceBuilder, and scope out its hashCode/equals, then rinse and repeat; also note that all those Objects.hash calls are an array allocation unless something JIT'ed it out).

Doubly-worse, our caching implementation must execute hashCode twice -- once to find the right underlying HashMap to talk to (arguably we control this, it's called via getCacheSegment), and then a second time within the implementation of HashMap itself (we don't control this).

So, even with a better key, it would still probably be worth it to pre-calculate the hashCode eagerly at construction time and then just return it repeatedly, like how Snapshot already does:

https://github.com/elastic/elasticsearch/blob/92ac03fcc5cdcf4ca611f34f96f21e048a378397/server/src/main/java/org/elasticsearch/snapshots/Snapshot.java#L76-L83
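For a rough idea of what a better key could look like, here's a hypothetical sketch of an immutable key that captures only what the lookup actually varies on and pre-computes its hash up front, Snapshot-style. The class and field names (EnrichSearchKey, enrichIndex, querySource) are invented, not the actual cache key class in EnrichCache:

import java.util.Objects;

final class EnrichSearchKey {
    private final String enrichIndex; // e.g. the enrich index base name for the policy
    private final String querySource; // e.g. the rendered lookup query/value
    private final int hashCode;       // computed once at construction, like Snapshot

    EnrichSearchKey(String enrichIndex, String querySource) {
        this.enrichIndex = Objects.requireNonNull(enrichIndex);
        this.querySource = Objects.requireNonNull(querySource);
        this.hashCode = 31 * enrichIndex.hashCode() + querySource.hashCode();
    }

    @Override
    public int hashCode() {
        return hashCode; // constant-time, no matter how many times the cache asks
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (o instanceof EnrichSearchKey == false) return false;
        EnrichSearchKey other = (EnrichSearchKey) o;
        return hashCode == other.hashCode          // cheap early exit
            && enrichIndex.equals(other.enrichIndex)
            && querySource.equals(other.querySource);
    }
}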
Meta: I went with >enhancement for this, but it's pretty close to >bug territory in my mind.