apache / incubator-hugegraph

A graph database that supports 100+ billion data records, with high performance and scalability (includes OLTP Engine & REST-API & Backends)
https://hugegraph.apache.org
Apache License 2.0

[Bug] cacheEventListener in CachedSchemaTransactionV2 does not work as expected #2617

Open haohao0103 opened 1 month ago

haohao0103 commented 1 month ago

Bug Type

other exception / error

Before submit

Environment

Expected & Actual behavior

In multi-server mode, when a client connects to one server and creates a graph schema, the other servers do not perceive the change, so the cacheEventListener on those servers never clears their local schema caches. If no further writes happen and only one server holds the schema data, the schema appears intermittently on the Hubble page.

Note: the reason writes produce no errors is that, when the required schema cannot be found in the local cache while constructing an edge, it is fetched from the PD side and the cache is updated.
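For illustration, the read-through behavior described above looks roughly like the sketch below. This is not the actual CachedSchemaTransactionV2 code; the cache map and getFromPd are hypothetical placeholders, used only to show why a write never fails even when the local cache is stale:

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class SchemaLookupSketch {

    private final Map<Long, Object> localCache = new ConcurrentHashMap<>();

    Object getSchema(long id) {
        // 1. Try the local schema cache first.
        Object schema = this.localCache.get(id);
        if (schema != null) {
            return schema;
        }
        // 2. On a miss (e.g. this server never received the create event),
        //    query the PD side and repopulate the local cache.
        schema = this.getFromPd(id);
        this.localCache.put(id, schema);
        return schema;
    }

    private Object getFromPd(long id) {
        // Placeholder for the remote lookup against PD.
        return new Object();
    }
}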

Vertex/Edge example

No response

Schema [VertexLabel, EdgeLabel, IndexLabel]

No response

dosubot[bot] commented 1 month ago

The cacheEventListener in CachedSchemaTransactionV2 does not clear the local schema cache on other servers in a multi-server deployment because it only listens to local events and does not propagate cache invalidation events across servers. The listener is registered for cache events within the same instance, but there is no mechanism to broadcast those events to the other servers in the cluster.

Here's a snippet from the CachedSchemaTransactionV2 class that illustrates this:

private void listenChanges() {
    // Listen store event: "store.init", "store.clear", ...
    Set<String> storeEvents = ImmutableSet.of(Events.STORE_INIT,
                                              Events.STORE_CLEAR,
                                              Events.STORE_TRUNCATE);
    this.storeEventListener = event -> {
        if (storeEvents.contains(event.name())) {
            LOG.debug("Graph {} clear schema cache on event '{}'",
                      this.graph(), event.name());
            this.clearCache(true);
            return true;
        }
        return false;
    };
    this.store().provider().listen(this.storeEventListener);

    // Listen cache event: "cache"(invalid cache item)
    this.cacheEventListener = event -> {
        LOG.debug("Graph {} received schema cache event: {}",
                  this.graph(), event);
        Object[] args = event.args();
        E.checkArgument(args.length > 0 && args[0] instanceof String,
                        "Expect event action argument");
        if (Cache.ACTION_INVALID.equals(args[0])) {
            event.checkArgs(String.class, HugeType.class, Id.class);
            HugeType type = (HugeType) args[1];
            Id id = (Id) args[2];
            this.invalidateCache(type, id);
            this.resetCachedAll(type);
            return true;
        } else if (Cache.ACTION_CLEAR.equals(args[0])) {
            event.checkArgs(String.class, HugeType.class);
            this.clearCache(false);
            return true;
        }
        return false;
    };
    EventHub schemaEventHub = this.params().schemaEventHub();
    if (!schemaEventHub.containsListener(Events.CACHE)) {
        schemaEventHub.listen(Events.CACHE, this.cacheEventListener);
    }
}

The cacheEventListener only listens to events from the local EventHub and has no mechanism to communicate with the EventHub instances on other servers. Therefore, when a client creates a graph schema on one server, the cache invalidation event is not propagated to the other servers, leaving stale cache data on those servers [1].

To address this, you would need a distributed event notification mechanism that propagates cache invalidation events to every instance of the application across the cluster, for example via a message broker or a distributed event bus.
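As a minimal sketch of that idea (all names here, such as ClusterBus and InvalidationMsg, are hypothetical and not part of HugeGraph; a real implementation would bridge CachedSchemaTransactionV2's EventHub to a message broker such as Kafka, or to a PD/etcd watch channel):

import java.util.function.Consumer;

public class SchemaCacheSync {

    // Abstraction over whatever broker or distributed bus the cluster uses (assumed).
    interface ClusterBus {
        void publish(InvalidationMsg msg);                  // broadcast to all servers
        void subscribe(Consumer<InvalidationMsg> handler);  // receive from other servers
    }

    // Payload mirroring the local cache event: action + schema type + schema id.
    record InvalidationMsg(String action, String schemaType, long schemaId) {}

    private final ClusterBus bus;

    SchemaCacheSync(ClusterBus bus, Consumer<InvalidationMsg> localInvalidator) {
        this.bus = bus;
        // Apply invalidation messages that originate on OTHER servers to the local cache.
        this.bus.subscribe(localInvalidator);
    }

    // Call this from the local cache-event listener after a schema write,
    // so every other server clears or refreshes its own cache.
    void onLocalSchemaChange(String action, String schemaType, long schemaId) {
        this.bus.publish(new InvalidationMsg(action, schemaType, schemaId));
    }
}

Each server would create one SchemaCacheSync, passing a callback that invokes its local cache-clearing logic (the equivalent of invalidateCache/clearCache in the snippet above).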

To continue talking to Dosu, mention @dosu.