redis / redis-py

Redis Python client
MIT License

[#3090] Reduce Command Responses in Redis Connection Process (on_connect) #3268

Open zeze1004 opened 3 weeks ago

zeze1004 commented 3 weeks ago

Pull Request check-list

#3090

Please make sure to review and check all of these items:

NOTE: these things are not required to open a PR and can be done afterwards / while the PR is open.

Description of change

In this PR, I experimented with using MULTI and EXEC to batch the commands sent during connection setup (on_connect) into a single request. My goal was to reduce the number of network round trips and improve overall performance by minimizing network latency.
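For readers who want the gist without reading the diff, here is a minimal, hypothetical sketch of the idea, assuming redis-py's low-level Connection.send_command / read_response API and RESP2. The name batched_on_connect is illustrative only; it is not part of this PR or of redis-py, and it omits the AUTH/HELLO/CLIENT SETNAME handling that the real on_connect performs:

from redis.connection import Connection


def batched_on_connect(conn: Connection) -> None:
    """Illustration only: queue the handshake commands in one MULTI/EXEC block."""
    queued = []

    # Write every command first, without waiting for individual replies.
    conn.send_command("MULTI")
    if conn.lib_name:
        conn.send_command("CLIENT", "SETINFO", "LIB-NAME", conn.lib_name)
        queued.append("lib-name")
    if conn.lib_version:
        conn.send_command("CLIENT", "SETINFO", "LIB-VER", conn.lib_version)
        queued.append("lib-ver")
    if conn.db:
        conn.send_command("SELECT", conn.db)
        queued.append("select")
    conn.send_command("EXEC")

    # Now drain the replies: +OK for MULTI, +QUEUED per queued command, then
    # one array from EXEC that holds every command's actual result.
    conn.read_response()
    for _ in queued:
        conn.read_response()
    results = conn.read_response()
    if not isinstance(results, list) or len(results) != len(queued):
        raise ConnectionError("unexpected EXEC reply during connection handshake")

Because every command is written before any reply is read, the whole handshake should cost roughly one round trip instead of one per command, which is the effect the change aims for.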

Observations

After running the integration tests, I noticed that while the number of network requests decreased, the total execution time actually increased. Here is what I found:

1. Increased BufferedReader time

   When multiple commands are sent in one MULTI/EXEC block, the server responds with all of the results at once. Reading this larger combined response takes longer, which increased the time spent in the BufferedReader.

2. Command packing overhead

   Packing multiple commands into a single MULTI request requires additional processing to format the data correctly. This added some overhead to the command-preparation phase.

3. Complex response parsing

   Parsing the combined response from EXEC also turned out to be more complex and time-consuming. Each individual command's result has to be pulled out of the single large reply and validated separately, which added to the total processing time (see the sketch below).
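To make the third point concrete, this is roughly the extra bookkeeping involved. check_exec_replies is a hypothetical helper, not a redis-py function: it takes the names of the queued commands and the single EXEC array and validates every element individually, whereas the original flow checks each reply right where the command is issued.

def check_exec_replies(queued, results):
    """Illustration only: match each element of the EXEC array to its command."""
    if len(results) != len(queued):
        raise ConnectionError("EXEC returned an unexpected number of replies")
    for name, reply in zip(queued, results):
        # Inside MULTI/EXEC, per-command errors come back as elements of the
        # reply array instead of being raised as they are read, so each one
        # has to be inspected separately.
        if isinstance(reply, Exception):
            raise reply
        if reply not in (b"OK", "OK"):
            raise ConnectionError(f"handshake command {name!r} failed: {reply!r}")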

Test Result

Here are the integration test results comparing the original logic and the modified logic:

[ Measuring the time for 1000 Redis connections ]

Modified Logic: 0.243718 s
Original Logic: 0.168276 s
import logging
import timeit
import unittest

import redis
from redis.connection import Connection

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)


class TestRedisIntegration(unittest.TestCase):

    def setUp(self):
        # Sanity-check the connection, then start each test from an empty db
        self.client = redis.Redis(host='localhost', port=6379, db=0)
        self.client.set('test_key', 'test_value')
        logger.info(f"value: {self.client.get('test_key').decode('utf-8')}")
        self.client.flushdb()

    def tearDown(self):
        self.client.flushdb()

    def test_on_connect_performance_high_version(self):
        """Test on_connect method performance"""
        conn = Connection()
        conn.username = None
        conn.password = None
        conn.protocol = 2
        conn.client_name = None
        conn.lib_name = 'redis-py'
        conn.lib_version = '99.99.99'
        conn.db = 0
        conn.client_cache = False

        # Measure the execution time of 1000 connection handshakes
        connect_time_thousand = timeit.timeit(conn.on_connect, number=1000)

        logger.info(f"Measuring the time for 1000 Redis connections: {connect_time_thousand:.6f} seconds")


Conclusion

While the idea was to reduce network latency by batching commands, the extra time taken to read and parse the larger combined response offset those gains. In this case, the increased local processing outweighed the benefit of fewer network requests.


I’d love to get your feedback on this. Do you think there are other optimizations we should consider, or is there something I might have missed? Any insights would be greatly appreciated! cc. @chayim 🙇🏻

Thanks!