phpredis / phpredis

A PHP extension for Redis
Other
9.98k stars 2.14k forks source link

Session handler does not honour lazy_write leading to unnecessary writes #2079

Closed zorro-fr24 closed 1 year ago

zorro-fr24 commented 2 years ago

Expected behaviour

When session.lazy_write is set to 1 (default), I would expect the session handler to perform a write/set only if data has been changed.

Actual behaviour

The phpredis session handler will ALWAYS perform a session write, even if nothing has been written to the session object

I'm seeing this behaviour on

OS: CentOS Linux 7 (Core) Kernel: 3.10.0-1127.el7.x86_64

Redis: redis_version:6.2.5 redis_mode:cluster os:Amazon ElastiCache

PHP: 7.3.33 phpredis: 5.3.7 ( 5.3.7-1.el7.remi.7.3 @remi-php73 )

php session configuration:

session.serialize_handler = msgpack
session.save_handler = rediscluster
session.save_path = "seed[]=redis-test-001:6379&timeout=10&read_timeout=10&failover=error&persistent=1&distribute=0&cache_slots=1"

Steps to reproduce, backtrace or example script

Using this test script:

<?php
session_id("pd4l6g7u9p8hme3vsn6pg7cc6dkhfk8qig5hdvifua5vepgh7k42");
session_start();

echo "Current Session:\n";
var_dump($_SESSION);

echo "session.lazy_write: " . ini_get('session.lazy_write');

Expected script output:

Current Session: array(0) { } session.lazy_write: 1

Despite the fact we do not write to the session, we still see empty SET commands reaching the cluster master:

redis-cli -c -h redist-test-001 -p 6379 MONITOR
OK
1646638918.459297 [0 10.1.10.58:38946] "get" "PHPREDIS_CLUSTER_SESSION:pd4l6g7u9p8hme3vsn6pg7cc6dkhfk8qig5hdvifua5vepgh7k42"
1646638918.464037 [0 10.1.10.58:38946] "set" "PHPREDIS_CLUSTER_SESSION:pd4l6g7u9p8hme3vsn6pg7cc6dkhfk8qig5hdvifua5vepgh7k42" "1440" "\x90"

On our platform, this equates to thousands of unnecessary writes per second

We can work around this on the application level using the following code example:

register_shutdown_function(function() {
    if (!session_been_modified()) {
    session_abort();
    }
});

which will abort the session if no modification has been made however this will require modifying many projects for us, and ideally we'd like to see the php lazy_write option to be implemented by session handler instead.

Let me know if you would like further tests or debugging info,

Cheers Joe @ flightradar24

I've checked

zorro-fr24 commented 2 years ago

note: I have verified on the same installation but with sessions in standard file mode, that the session files do not get rewritten when executing the same script multiple times

michael-grunder commented 2 years ago

Thanks for the report. I was unaware of this setting but will look into what might be required to honor it from PhpRedis.

zorro-fr24 commented 2 years ago

Thanks for taking a look. For pre-6 .2 I suspect you would need to fire EXPIRE instead to keep the session ttl fresh ( perhaps you don't need to bother waiting for the response though? ).

BUT with 6.2 you could use GETEX at the start to load the session instead, which should save an extra request. Not sure if this would be a violation of php session specification though, since I think it assumes PS_UPDATE_TIMESTAMP_FUNC should be called for lazy_writes

JakubOnderka commented 2 years ago

This should be already implemented by this commit: https://github.com/phpredis/phpredis/commit/aaaf0f233afe404bc1d8c6a600af860448ee0bd6. Or it doesn't work?

zorro-fr24 commented 2 years ago

@JakubOnderka

This should be already implemented by this commit: aaaf0f2. Or it doesn't work?

Not as far as I've seen!

Running the test case I posted at least, I see only SETs. I just retested with: php: 8.1.11 php_redis:5.3.7 redis: 7.0.5

and if I open the same session 3 times, without modifying the session object I see:

1666675881.318689 [0 83.140.21.105:37806] "GET" "PHPREDIS_CLUSTER_SESSION:pd4l6g7u9p8hme3vsn6pg7cc6dkhfk8qig5hdvifua5vepgh7k42"
1666675881.329607 [0 83.140.21.105:37806] "SETEX" "PHPREDIS_CLUSTER_SESSION:pd4l6g7u9p8hme3vsn6pg7cc6dkhfk8qig5hdvifua5vepgh7k42" "1440" "\x90"
1666675890.793380 [0 83.140.21.105:37806] "GET" "PHPREDIS_CLUSTER_SESSION:pd4l6g7u9p8hme3vsn6pg7cc6dkhfk8qig5hdvifua5vepgh7k42"
1666675890.801467 [0 83.140.21.105:37806] "SETEX" "PHPREDIS_CLUSTER_SESSION:pd4l6g7u9p8hme3vsn6pg7cc6dkhfk8qig5hdvifua5vepgh7k42" "1440" "\x90"
1666675891.609979 [0 83.140.21.105:37806] "GET" "PHPREDIS_CLUSTER_SESSION:pd4l6g7u9p8hme3vsn6pg7cc6dkhfk8qig5hdvifua5vepgh7k42"
1666675891.622140 [0 83.140.21.105:37806] "SETEX" "PHPREDIS_CLUSTER_SESSION:pd4l6g7u9p8hme3vsn6pg7cc6dkhfk8qig5hdvifua5vepgh7k42" "1440" "\x90"
zorro-fr24 commented 2 years ago

are we even storing session update time? if we look at the php session implementation:

if (PS(lazy_write) && PS(session_vars)
    && PS(mod)->s_update_timestamp
    && PS(mod)->s_update_timestamp != php_session_update_timestamp
    && zend_string_equals(val, PS(session_vars))
) {
    ret = PS(mod)->s_update_timestamp(&PS(mod_data), PS(id), val, PS(gc_maxlifetime));
    handler_function_name = handler_class_name != NULL ? "updateTimestamp" : "update_timestamp";
} else {
    ret = PS(mod)->s_write(&PS(mod_data), PS(id), val, PS(gc_maxlifetime));
    handler_function_name = "write";
}

what would be s_update_timestamp and php_session_update_timestamp in our case?

zorro-fr24 commented 1 year ago

@JakubOnderka @michael-grunder looks like lazy write support was only enabled for standalone redis, and not cluster. I have an initial pass at adding that functionality, plus some other optimisations.

zorro-fr24 commented 1 year ago

To outline the change in behaviour. With the current develop build every request with a session will result in a command sequence like this, even if the session is unmodified:

REQ  "GET" "PHPREDIS_CLUSTER_SESSION:b78qgrtp259a97q5ivocthn7e6"
RESP "M-^BM-@^AM-&lipsumM-Z^AM-^\Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nullam maximus viverra nibh. Maecenas lacinia viverra velit sodales rutrum. Curabitur consectetur eleifend orci, nec facilisis diam bibendum at. Vestibulum eget arcu in eros mollis aliquet nec id leo. Ut vulputate ipsum nisi, sit amet malesuada ligula malesuada eget. Vivamus at diam cursus, tincidunt odio non, tempus velit. Vestibulum eu fringilla quam."

REQ  "SETEX" "PHPREDIS_CLUSTER_SESSION:b78qgrtp259a97q5ivocthn7e6" "1440" "M-^BM-@^AM-&lipsumM-Z^AM-^\Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nullam maximus viverra nibh. Maecenas lacinia viverra velit sodales rutrum. Curabitur consectetur eleifend orci, nec facilisis diam bibendum at. Vestibulum eget arcu in eros mollis aliquet nec id leo. Ut vulputate ipsum nisi, sit amet malesuada ligula malesuada eget. Vivamus at diam cursus, tincidunt odio non, tempus velit. Vestibulum eu fringilla quam."
RESP "OK"

with the patch:

REQ  "GETEX" "PHPREDIS_CLUSTER_SESSION:b78qgrtp259a97q5ivocthn7e6" "EX" "1440"
RESP "M-^BM-@^AM-&lipsumM-Z^AM-^\Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nullam maximus viverra nibh. Maecenas lacinia viverra velit sodales rutrum. Curabitur consectetur eleifend orci, nec facilisis diam bibendum at. Vestibulum eget arcu in eros mollis aliquet nec id leo. Ut vulputate ipsum nisi, sit amet malesuada ligula malesuada eget. Vivamus at diam cursus, tincidunt odio non, tempus velit. Vestibulum eu fringilla quam."

Which is 1 round-trip instead of 2 and about half the payload size.

The problem is that GETEX requires 6.2 which is why I place it behind a config flag ( also because the behaviour has an impact on session ttl for long running scripts ). The alternative, if we want to support early versions in all cases is to do something like this:

`REQ  "GET" "PHPREDIS_CLUSTER_SESSION:b78qgrtp259a97q5ivocthn7e6"
RESP "M-^BM-@^AM-&lipsumM-Z^AM-^\Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nullam maximus viverra nibh. Maecenas lacinia viverra velit sodales rutrum. Curabitur consectetur eleifend orci, nec facilisis diam bibendum at. Vestibulum eget arcu in eros mollis aliquet nec id leo. Ut vulputate ipsum nisi, sit amet malesuada ligula malesuada eget. Vivamus at diam cursus, tincidunt odio non, tempus velit. Vestibulum eu fringilla quam."

REQ  "EXPIRE" "PHPREDIS_CLUSTER_SESSION:b78qgrtp259a97q5ivocthn7e6" "1440"
RESP "1"

Which is the patch behaviour when "early_refresh" is disabled.

As I suspect you may have reservations about adding server version dependent options, I will skip making a PR here prior to feedback, and see whether it's worth adding this for standalone redis sessions too

zorro-fr24 commented 1 year ago

I've had some time to measure performance in production now.

We're seeing a 14x reduction in bytes sent to the server for mixed workload endpoints ( received bytes remains about the same ), and about 31x reduction on readonly endpoints and about 46% fewer commands

bitactive commented 7 months ago

This patch has been merged for RedisCluster code, but analogous code has not been added for the standard Redis session handler. Is this an oversight or a deliberate action? In particular the code related to early_refresh.

zorro-fr24 commented 6 months ago

@bitactive hey, if I recall, I was mainly focused on cluster support for lazy_write which was already supported in standalone mode. Early refresh was just an optimisation I noticed while I was digging around in the cluster code so I didn't think to port it over to other modes. I see you've taken care of it, so thanks!