Baqend / Orestes-Bloomfilter

Library of different Bloom filters in Java with optional Redis-backing, counting and many hashing options.

Support redis cluster mode? #54

Open Jaskey opened 6 years ago

Jaskey commented 6 years ago

In production we will probably use cluster mode, but for now OBF does not support it.

dvizzini commented 5 years ago

@Jaskey What did you end up doing?

I might take a stab at writing a Redis cluster implementation.

dvizzini commented 5 years ago

JedisCluster currently lacks a pipeline, and getting one in does not look too promising: https://github.com/xetorthio/jedis/pull/1455

dvizzini commented 5 years ago

I think I'm going to make this into a PR:

import com.github.benmanes.caffeine.cache.Caffeine;
import com.github.benmanes.caffeine.cache.LoadingCache;
import orestes.bloomfilter.BloomFilter;
import orestes.bloomfilter.FilterBuilder;
import redis.clients.jedis.HostAndPort;
import redis.clients.jedis.Jedis;
import redis.clients.jedis.JedisCluster;
import redis.clients.jedis.exceptions.JedisMovedDataException;

import java.util.stream.Stream;

public class ClusteredRedisBloomFilterCache<K> {

    private final LoadingCache<K, BloomFilter<K>> bloomFilters;

    public ClusteredRedisBloomFilterCache(
            final JedisCluster jedisCluster,
            final int expectedElements,
            final double falsePositiveProbability
    ) {
        this.bloomFilters = Caffeine.newBuilder()
                .build(key -> {
                    final HostAndPort hostAndPort = getBloomfilterHostPort(jedisCluster, key);
                    return new FilterBuilder(expectedElements, falsePositiveProbability)
                            .name(key.toString())
                            .redisBacked(true)
                            .redisHost(hostAndPort.getHost())
                            .redisPort(hostAndPort.getPort())
                            .overwriteIfExists(false)
                            .buildBloomFilter();
                });
    }

    public ClusteredRedisBloomFilterCache(final JedisCluster jedisCluster) {
        this(jedisCluster, 1_000_000_000, 0.01);
    }

    public BloomFilter<K> get(K key) {
        return this.bloomFilters.get(key);
    }

    private HostAndPort getBloomfilterHostPort(final JedisCluster jedisCluster, final K key) {
        // Looked into cleaner methods. Jedis seems to hide all better ways.
        return jedisCluster.getClusterNodes().entrySet().stream()
                // Returns HostAndPort of redis node assigned to parameter key
                .flatMap(entry -> {
                    try (final Jedis jedis = entry.getValue().getResource()) {
                        try {
                            // Throws JedisMovedDataException if this Redis node does not own the key's slot
                            jedis.get(key.toString());
                            // The call above did not fail, so return this Redis node's HostAndPort
                            final String[] hostAndPort = entry.getKey().split(":");
                            final String host = hostAndPort[0];
                            final int port = Integer.parseInt(hostAndPort[1]);
                            return Stream.of(new HostAndPort(host, port));

                        } catch (final JedisMovedDataException jmde) {
                            // jedis.get failed, so filter out this Redis node's HostAndPort
                            return Stream.empty();
                        }
                    }
                })
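                // A properly configured Redis cluster always has an owner for the key; fail loudly otherwise
                .findFirst()
                .orElseThrow(() -> new IllegalStateException("No Redis node owns key " + key));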
    }
}
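
The lookup above finds the owning node by probing each node and catching the MOVED error. Redis Cluster assigns a key to a node by computing CRC16 of the key modulo 16384, so the slot can also be computed locally without a round trip. Below is a minimal sketch of that calculation, including the hash tag rule (only the text between the first `{` and the next `}` is hashed when it is non-empty). The class and method names (`RedisSlot`, `crc16`, `slot`) are made up for illustration and are not part of OBF or Jedis; Jedis ships its own utility for this (`JedisClusterCRC16`), which is probably the better choice in real code.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Orestes-Bloomfilter or Jedis.
final class RedisSlot {

    // CRC16 (XMODEM variant: polynomial 0x1021, initial value 0), as used by Redis Cluster.
    static int crc16(byte[] bytes) {
        int crc = 0;
        for (byte b : bytes) {
            crc ^= (b & 0xFF) << 8;
            for (int i = 0; i < 8; i++) {
                crc = ((crc & 0x8000) != 0) ? ((crc << 1) ^ 0x1021) : (crc << 1);
                crc &= 0xFFFF; // keep the register at 16 bits
            }
        }
        return crc;
    }

    // Returns the cluster hash slot (0..16383) for a key.
    static int slot(String key) {
        int open = key.indexOf('{');
        if (open >= 0) {
            int close = key.indexOf('}', open + 1);
            // Hash only the tag if there is at least one character between the braces
            if (close > open + 1) {
                key = key.substring(open + 1, close);
            }
        }
        return crc16(key.getBytes(StandardCharsets.UTF_8)) % 16384;
    }
}
```

One consequence of the hash tag rule: if a filter kept its data under several Redis keys, naming them with a shared tag such as `{myfilter}:bits` and `{myfilter}:counts` would place them in the same slot and therefore on the same node. Those key names are an assumption for illustration, not a statement about how OBF names its keys today.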

thai-op commented 2 weeks ago

Jedis now has support for cluster pipelines: https://github.com/redis/jedis/blob/master/src/main/java/redis/clients/jedis/ClusterPipeline.java