redis / lettuce

Advanced Java Redis client for thread-safe sync, async, and reactive usage. Supports Cluster, Sentinel, Pipelining, and codecs.
https://lettuce.io
MIT License

MGET giving empty result for a key in a bundle #2288

Open gauravgarg-dev opened 1 year ago

gauravgarg-dev commented 1 year ago

Bug Report

We are using lettuce-core 6.1.8 along with Spring Boot Redis 2.7.0. We have configured a Redis Cluster on Redis Labs with 32 shards.

In the REST call we receive 100 products to query from Redis. As an optimisation, we add a slot tag derived from the product ID (PID) so that the 100 keys collapse into a handful of MGETs of roughly 10 keys each. An example key looks like {15502}.op:pso-99760635-462532, where 15502 is generated via:

```java
public Long getKeyBasedOnSlot(Long slotV) {
    try {
        // Derive a value in the 0..16383 range from the product id; this value is used
        // as the {hash tag} prefix so that related keys group together.
        return (slotV / 1024) % 16384;
    } catch (Exception e) {
        log.error("Exception while getting slot", e);
    }
    return null;
}
```

Code to fetch the data:

```java
private List<ProductOfferMap> fetchUsingMget(List<ProductSupplier> productSuppliers) {
    int keyCounter = 0;
    String[] keys = getKeys(productSuppliers);
    List<ProductOfferMap> productOfferMaps = new ArrayList<>();
    if (keys.length == 0) {
        return productOfferMaps;
    }
    try {
        RedisAdvancedClusterCommands<String, String> sync = connection.sync();
        // Skip entries whose value is missing in the MGET response.
        for (KeyValue<String, String> kv : sync.mget(keys)) {
            if (Objects.nonNull(kv.getValueOrElse(null))) {
                ProductOfferMap productOfferMap = parseValue(kv, productSuppliers, keyCounter);
                productOfferMaps.add(productOfferMap);
                keyCounter++;
            }
        }
        return productOfferMaps;
    } catch (Exception e) {
        log.error("Exception while fetching mget for product size : {}: {}",
                productSuppliers.size(), ExceptionUtils.getStackTrace(e));
        throw new RedisFailureException("Error in fetching data", e.getCause());
    }
}
```

While making the MGET call to Redis we have observed that if we query 50 keys including, say, the key above, we intermittently get results for only 49 of them and one value is missing, even though the key exists in Redis. If we query that key on its own with MGET, we always get the result, both via redis-cli and via the REST API.

To test this further we added pipelining, fetching the same keys with individual GET calls and combining the results. With pipelining we always get the data, but with MGET values are missed intermittently.
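
For reference, a hedged sketch of what such a pipelined GET fallback might look like with Lettuce's async cluster API. The enclosing method and its parameters are hypothetical; only `RedisAdvancedClusterAsyncCommands`, `RedisFuture`, and `LettuceFutures.awaitAll` are Lettuce APIs:

```java
import io.lettuce.core.LettuceFutures;
import io.lettuce.core.RedisFuture;
import io.lettuce.core.cluster.api.StatefulRedisClusterConnection;
import io.lettuce.core.cluster.api.async.RedisAdvancedClusterAsyncCommands;

import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.TimeUnit;

class PipelinedGetSketch {

    // Sketch only: issue one async GET per key and collect the results in key order.
    static Map<String, String> fetchWithPipelinedGets(StatefulRedisClusterConnection<String, String> connection,
                                                      String[] keys) throws Exception {
        RedisAdvancedClusterAsyncCommands<String, String> async = connection.async();

        List<RedisFuture<String>> futures = new ArrayList<>();
        for (String key : keys) {
            futures.add(async.get(key)); // commands are sent immediately and complete asynchronously
        }

        // Wait for all GETs to complete (or time out).
        LettuceFutures.awaitAll(2, TimeUnit.SECONDS, futures.toArray(new RedisFuture[0]));

        Map<String, String> values = new LinkedHashMap<>();
        for (int i = 0; i < keys.length; i++) {
            values.put(keys[i], futures.get(i).get()); // value is null if the key does not exist
        }
        return values;
    }
}
```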


Input Code

The input code is the `getKeyBasedOnSlot` and `fetchUsingMget` methods shown above, together with the connection factory configuration:

```java
//
// Source code recreated from a .class file by IntelliJ IDEA
// (powered by FernFlower decompiler)
//
package com.meesho.redisclient.config;

import com.meesho.redisclient.config.properties.IClientResourceProperties;
import com.meesho.redisclient.config.properties.IClusterTopologyRefreshProperties;
import com.meesho.redisclient.config.properties.IConnectionProperties;
import com.meesho.redisclient.config.properties.ITimeoutProperties;
import com.meesho.redisclient.config.properties.RedisClusterProperties;
import com.meesho.redisclient.config.properties.RedisStandaloneProperties;
import io.lettuce.core.ClientOptions;
import io.lettuce.core.ClientOptions.DisconnectedBehavior;
import io.lettuce.core.ReadFrom;
import io.lettuce.core.cluster.ClusterClientOptions;
import io.lettuce.core.cluster.ClusterTopologyRefreshOptions;
import io.lettuce.core.cluster.ClusterTopologyRefreshOptions.RefreshTrigger;
import io.lettuce.core.resource.ClientResources;
import io.lettuce.core.resource.DefaultClientResources;
import java.time.Duration;
import java.util.List;
import org.apache.commons.pool2.impl.GenericObjectPoolConfig;
import org.springframework.data.redis.connection.RedisClusterConfiguration;
import org.springframework.data.redis.connection.RedisStandaloneConfiguration;
import org.springframework.data.redis.connection.lettuce.LettuceConnectionFactory;
import org.springframework.data.redis.connection.lettuce.LettucePoolingClientConfiguration;
import org.springframework.util.StringUtils;

public class ConnectionFactoryProvider {

    public ConnectionFactoryProvider() {
    }

    public static LettuceConnectionFactory createConnectionFactory(RedisStandaloneProperties properties) {
        RedisStandaloneConfiguration configuration = getRedisStandaloneConfiguration(
                properties.getHost(), properties.getPort(), properties.getPassword(), properties.getDatabase());
        GenericObjectPoolConfig genericObjectPoolConfig = getPool(properties);
        ClientOptions clientOptions = getStandaloneClientOptions();
        ClientResources resources = getClientResources(properties);
        LettucePoolingClientConfiguration.LettucePoolingClientConfigurationBuilder builder =
                LettucePoolingClientConfiguration.builder()
                        .poolConfig(genericObjectPoolConfig)
                        .clientOptions(clientOptions)
                        .clientResources(resources);
        setupTimeouts(properties, builder);
        LettucePoolingClientConfiguration lettucePoolingClientConfiguration = builder.build();
        return new LettuceConnectionFactory(configuration, lettucePoolingClientConfiguration);
    }

    public static LettuceConnectionFactory createConnectionFactory(RedisClusterProperties properties) {
        RedisClusterConfiguration configuration = getRedisClusterConfiguration(properties.getAddresses());
        if (StringUtils.hasLength(properties.getPassword())) {
            configuration.setPassword(properties.getPassword());
        }
        GenericObjectPoolConfig genericObjectPoolConfig = getPool(properties);
        ClientOptions clientOptions = getClusterClientOptions(properties);
        ClientResources resources = getClientResources(properties);
        LettucePoolingClientConfiguration.LettucePoolingClientConfigurationBuilder builder =
                LettucePoolingClientConfiguration.builder()
                        .poolConfig(genericObjectPoolConfig)
                        .clientOptions(clientOptions)
                        .clientResources(resources);
        setupTimeouts(properties, builder);
        builder.readFrom(ReadFrom.valueOf(properties.getReadFrom()));
        LettucePoolingClientConfiguration lettucePoolingClientConfiguration = builder.build();
        return new LettuceConnectionFactory(configuration, lettucePoolingClientConfiguration);
    }

    private static <T extends IConnectionProperties> GenericObjectPoolConfig getPool(T connProps) {
        GenericObjectPoolConfig genericObjectPoolConfig = new GenericObjectPoolConfig();
        genericObjectPoolConfig.setMaxIdle(connProps.getMaxIdleConnections());
        genericObjectPoolConfig.setMinIdle(connProps.getMinIdleConnections());
        genericObjectPoolConfig.setMaxTotal(connProps.getMaxTotalConnections());
        return genericObjectPoolConfig;
    }

    private static <T extends ITimeoutProperties> void setupTimeouts(T timeoutProps,
            LettucePoolingClientConfiguration.LettucePoolingClientConfigurationBuilder builder) {
        builder.shutdownTimeout(Duration.ofMillis((long) timeoutProps.getShutdownTimeout()))
                .commandTimeout(Duration.ofMillis((long) timeoutProps.getCommandTimeout()));
    }

    private static <T extends IClusterTopologyRefreshProperties> ClusterClientOptions getClusterClientOptions(T props) {
        ClusterTopologyRefreshOptions topologyRefreshOptions = ClusterTopologyRefreshOptions.builder()
                .enableAdaptiveRefreshTrigger(RefreshTrigger.MOVED_REDIRECT, RefreshTrigger.PERSISTENT_RECONNECTS)
                .adaptiveRefreshTriggersTimeout(Duration.ofSeconds((long) props.getAdaptiveRefreshTriggersTimeoutInSec()))
                .enablePeriodicRefresh(Duration.ofSeconds((long) props.getPeriodicRefreshIntervalInSec()))
                .build();
        return ClusterClientOptions.builder()
                .topologyRefreshOptions(topologyRefreshOptions)
                .disconnectedBehavior(DisconnectedBehavior.REJECT_COMMANDS)
                .autoReconnect(true)
                .build();
    }

    private static ClientOptions getStandaloneClientOptions() {
        return ClientOptions.builder()
                .disconnectedBehavior(DisconnectedBehavior.REJECT_COMMANDS)
                .autoReconnect(true)
                .build();
    }

    private static <T extends IClientResourceProperties> ClientResources getClientResources(T props) {
        return DefaultClientResources.builder()
                .computationThreadPoolSize(props.getComputationThreadPoolSize() == null
                        ? Runtime.getRuntime().availableProcessors()
                        : props.getComputationThreadPoolSize())
                .ioThreadPoolSize(props.getIoThreadPoolSize())
                .build();
    }

    private static RedisClusterConfiguration getRedisClusterConfiguration(List<String> addresses) {
        return new RedisClusterConfiguration(addresses);
    }

    private static RedisStandaloneConfiguration getRedisStandaloneConfiguration(String host, int port, String password, int dbIdx) {
        RedisStandaloneConfiguration configuration = new RedisStandaloneConfiguration(host, port);
        configuration.setPassword(password);
        configuration.setDatabase(dbIdx);
        return configuration;
    }
}
```


mp911de commented 1 year ago

When using Redis Cluster with MGET and keys spanning multiple slots, Lettuce partitions the given keys by slot, issues multiple MGET calls, and then assembles the final response.

These kinds of issues are hard to reproduce locally. You can find the code for key partitioning in RedisAdvancedClusterAsyncCommandsImpl.mget(…). I would ask you to run the code with a debugger attached to investigate at what point the key gets lost, and whether you are using duplicate keys (e.g. the same key appearing multiple times).
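
A possible quick check along those lines, using Lettuce's `SlotHash` utility to see how the requested keys partition by slot and whether any key appears more than once; the `inspect` helper is only an illustration, with `keys` being whatever array is passed to `mget`:

```java
import io.lettuce.core.cluster.SlotHash;

import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

class MgetDebugSketch {

    // Log how the keys would partition across cluster slots and flag duplicates.
    static void inspect(String[] keys) {
        Map<Integer, List<String>> bySlot = Arrays.stream(keys)
                .collect(Collectors.groupingBy(SlotHash::getSlot));
        bySlot.forEach((slot, slotKeys) ->
                System.out.println("slot " + slot + " -> " + slotKeys.size() + " keys: " + slotKeys));

        Set<String> seen = new HashSet<>();
        for (String key : keys) {
            if (!seen.add(key)) {
                System.out.println("duplicate key passed to MGET: " + key);
            }
        }
    }
}
```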

gauravgarg-dev commented 1 year ago

@mp911de we enabled pipelining to fetch the exact same data with individual GET calls and combined the results. We always get the data that way. What could be the reason for this behaviour with MGET?

tishun commented 5 months ago

> @mp911de we enabled pipelining to fetch the exact same data with individual GET calls and combined the results. We always get the data that way. What could be the reason for this behaviour with MGET?

Are you still having this issue? Could you clarify what you mean by "we enabled pipelining to fetch the exact same data with individual GET calls"?