joomla / joomla-cms

Home of the Joomla! Content Management System
https://www.joomla.org
GNU General Public License v2.0
4.74k stars 3.64k forks source link

Session don't close since 3.8.4 update #19585

Closed aegaudin closed 6 years ago

aegaudin commented 6 years ago

Since 3.8.4 update, Joomla sessions don't close : dozen of connected users are wrongly shown on frontend Community Builder Online module, and also shown backend admin panel in the last 5 connected users module.

Quy commented 6 years ago

Please test PR #19548.

mbabker commented 6 years ago

Sessions do properly expire.

A change was made with https://github.com/joomla/joomla-cms/pull/19165 to how the optional session metadata (which shows these online user counts in various places) is managed as a means to address performance issues with tracking it. Most specifically, our cleanup operation will not run at all anymore when the database session handler is in use as this conflicts with PHP's native session management (and we store both the "real" session data and the optional metadata in the same table). There was a bug with that PR, fixed in 3.8.5, that prevented the cleanup from running when a non-database handler was in use because the event handler was configured to a non-existing event and therefore never triggered.

See https://github.com/joomla/joomla-cms/issues/19511 and https://github.com/joomla/joomla-cms/issues/19521 for additional discussion, and https://github.com/joomla/joomla-cms/pull/19548 for a tool which can be used on servers to run this operation on an as needed basis (can be implemented as a cron job if desired). Note for now though that https://github.com/joomla/joomla-cms/pull/19548 actually only works consistently for the metadata portion of cleanup if using the database session handler, to implement a cron job to handle metadata cleanup in general other changes in core are needed which could land in 3.9 (https://github.com/joomla/joomla-cms/pull/19578 starts the work on some of the required infrastructure changes).

aegaudin commented 6 years ago

I missed the information - thanks a lot, changing to php handler worked fine. What a quick answer, you're amazing !

SniperSister commented 6 years ago

As we see increasing amounts of reports about that issue in the german-speaking support channels, I would like to question the current "won't fix" for this problem.

Yes, the system works as designed when session.gc_probability isn't set to 1 - but the reality is that the 3 largest webhosts in germany using 0 as their default value and therefore sessions aren't cleaned up anymore. This means that a previously working site is now experiencing issues, which makes it a breaking change for me. Using a cronjob isn't a solution imho, because it requires a user action to implement and numerous webhosts don't offer cronjobs at all.

Therefore I would suggest to restore the previous behavior if gc_probability is set to 0.

joomla-cms-bot commented 6 years ago

Set to "open" on behalf of @franz-wohlkoenig by The JTracker Application at issues.joomla.org/joomla-cms/19585

ghost commented 6 years ago

Reopened as stated above.


This comment was created with the J!Tracker Application at issues.joomla.org/tracker/joomla-cms/19585.

joeforjoomla commented 6 years ago

@SniperSister +1

mbabker commented 6 years ago

Therefore I would suggest to restore the previous behavior if gc_probability is set to 0.

No. Joomla should not be doing PHP's garbage collection job only for the database handler. Either we implement our own universal garbage collection handler and do it for all session handlers, or we continue on the path that we have started down so that it doesn't matter if gc_probability is set to 0 or 100. On one of these hosts that have a 0 probability, set your handler to the PHP filesystem handler and check how many files exist in the storage directory and how that gets cleaned up, if it does.

The optional metadata MUST be split to a separate database table. That is the data that we are concerned with managing. As long as the metadata is stored in the same table as the real session data, and as long as the database handler is coupled to our writing of this metadata, we cannot optimize the workflows managing these two sets of data and the end result is either an arbitrary check on every HTTP request which causes performance issues or where we are now where the table doesn't get cleaned at an adequate interval.

Just like I said elsewhere last night, we need to stop accepting quick fix bandaid solutions and actually fix the root issues in our architecture.

SniperSister commented 6 years ago

Ok, so simple question: how do we fix the current issues for the mentioned users (gc_probability = 0)? Using the CLI job isn't a solution, letting them switch the session handler? That's ok for me if that happens automatically at the next update and also purges the session table. Right now the system is just broken and leaving it like that and telling people that it's their host's fault can't be the solution.

mbabker commented 6 years ago

Come up with a way to migrate to this data schema in a B/C manner:

CREATE TABLE IF NOT EXISTS `#__session` (
  `session_id` varchar(191) NOT NULL DEFAULT '',
  `time` varchar(14) DEFAULT '',
  `data` mediumtext,
  PRIMARY KEY (`session_id`),
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 DEFAULT COLLATE=utf8mb4_unicode_ci;

CREATE TABLE IF NOT EXISTS `#__session_metadata` (
  `session_id` varchar(191) NOT NULL DEFAULT '' COMMENT 'FK to #__session',
  `client_id` tinyint(3) unsigned DEFAULT NULL,
  `guest` tinyint(4) unsigned DEFAULT 1,
  `time` varchar(14) DEFAULT '',
  `userid` int(11) DEFAULT 0,
  `username` varchar(150) DEFAULT '',
  KEY `session_id` (`session_id`),
  KEY `userid` (`userid`),
  KEY `time` (`time`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 DEFAULT COLLATE=utf8mb4_unicode_ci;

The problem with the 3.8.3 and earlier behavior is:

Joomla is inconsistent at best. The changes I am working on aim to bring consistency and fix architecture and performance issues.

SniperSister commented 6 years ago

And to be frank, users are only complaining because of all the UI features that make use of the metadata, which is fair for us to manage, would it really be commented on as much if the #__session table bloated to 50K rows if they didn't see it in their UI?

It's not only about "seeing" that in the UI, it quickly becomes a performance issue (I have a client site with "some" traffic, pilled up 600k rows in the session table causing a TTFB of 20 seconds in the backend) and a quota issue (mentioned table quickly grew to 2GB).

I fully agree that the old design was a pita and refactoring makes sense, but I don't think that this should have happend in a patch release but in J4, because it turns out to be a breaking change for a lot of users.

Gitjk commented 6 years ago

Just a comment - I found this discussion after searching for a reason why my Joomla 3.8.5 website database has grown from the usual 11 mb to almost 500 mb. Had to truncate the session table to shrink it. In Joomla configuration the session handler is set to 'database'. Never had that problem with previous versions.

mbabker commented 6 years ago

So what are you going to do about systems where gc_probability is set to 0 and the database handler is not in use? You seem to care about the database size, why not the size of your filesystem? Or your Memcached storage? Yes, the size of the table is a performance issue, yes the correct fix is to fix your PHP runtime configuration, no Joomla should not give preferential treatment to one out of our supported eight storage handlers (six in 4.0 after removing deprecated engine support).

If you want Joomla to do session garbage collection, it needs to be consistent for all the same reasons for all handlers. Or, revert the changes I made for 3.8.4, wire up a bot to automatically reject changing any code that ever touches anything related to session management in Joomla (the real data or the optional metadata, where all these performance problems exist), and accept the status quo that Joomla is garbage as it relates to session management, especially because of this metadata because these who's online modules are so important to some users.

And yes, I am very annoyed because I'm sick and tired of defending fixing architecture issues when everyone else is content with amateur hour shortsighted fixes that just work around the real problems.

SniperSister commented 6 years ago

Or, revert the changes I made for 3.8.4

That's exactly what I suggested. Not because I don't like the idea of a clean architecture, not because I want to piss you off, not because I don't appreciate all the work that you put in there - but because it's a breaking change in a patch release. Do I really need to explain that this is a bad thing?

joomdonation commented 6 years ago

We don't need to fully revert. The change from 2 to 5 is fine, we just need to make small change to that PR (remove the check for database handler and everyone would be happy)

mbabker commented 6 years ago

Fine, I will revert the &($* pull request and rescind all pull requests I have because clearly this community would rather have a broken CMS that displays a semi-accurate count of online users. What a joke.

csthomas commented 6 years ago

Or, revert the changes I made for 3.8.4

I do not agree. This change improves performance of website. For all other users we can add a simple plugin like "Garbage database session" by default enabled.

mbabker commented 6 years ago

https://github.com/joomla/joomla-cms/pull/19678

I'm done.

csthomas commented 6 years ago

@mbabker Is moving session garbage collection to plugins is a bad idea?

mbabker commented 6 years ago

I don't care anymore. People don't give a damn that garbage collection isn't happening for the non-database handlers, why should I? They don't care there's a major performance impact in running this one query over 100 times per minute, why should I?

They just want their who's online module to be right, give the people what they want, not what they need.

Gitjk commented 6 years ago

They just want their who's online module to be right, give the people what they want, not what they need.

In my case I don't use that module, but nevertheless encounter this issue. However, the problem seems to disappear if I disable the 'User Status' module.

mbabker commented 6 years ago

Sure, because you have one less UI element making a query into the session table.

Gitjk commented 6 years ago

I'm quite sure that I never ever touched that module's (default) settings before. :-)

mbabker commented 6 years ago

Here are the practical options:

Reverting to 3.8.3 behavior is a garbage answer, but the one everyone seems to want. Because they see a bloated DB table. Seriously, if your server has gc_probability at 0 then PHP is not cleaning out stale session data at all, regardless of the storage backend. You might not see it if there are other operations in place that clean up those storage spaces (maybe a system cron job to clear the directory that is storing filesystem based backend storage?), but the same problem exists. Because I changed the Joomla code to stop arbitrarily running garbage collection on the database table, this server configuration issue reared its ugly head. Looking at one of my WHM servers, gc_probability is 0 and checking the storage path for sessions on sites using filesystem storage on that server, I see no files older than a couple of hours ago. This means there is an external operator running that garbage collection operation because PHP clearly isn't.

A plugin which forces PHP garbage collection to run is a workaround to a broken server configuration. It might fix the issue users are seeing, but it's still crap because it is relying on an application to address a server level misconfiguration.

brianteeman commented 6 years ago

this is an old post but i strongly suspect that its still valid https://www.appnovation.com/blog/session-garbage-collection-php

The important line is

One thing you might not have noticed is that in the Debian/Ubuntu distro, by default PHP disables its session garbage collection mechanism

aDaneInSpain commented 6 years ago

Could this not all be solved by adding a new configuration setting:

Session garbage collection frequency:

And then use Joomla for garbage collection?

mbabker commented 6 years ago

That's the plugin approach mentioned above. It's still a workaround to a bad server config but would get the job done.

brianteeman commented 6 years ago

based on the article I linked to it is not "a bad server config"

mbabker commented 6 years ago

"bad server config" basically being "there is no garbage collection happening by way of PHP". Not necessarily that the config itself is bad.

brianteeman commented 6 years ago

if you read the article it explains that debian decided to use a cron job instead of gcprobability but thats no good for us.

https://www.drupal.org/node/160046

mbabker commented 6 years ago

Just to close the loop on this one. This is a screenshot of the session config on one of my Ubuntu 16.04 servers using the PHP 7.1 package (basically defaults), and the cron task that is set up to handle the garbage collection operation. (writing as you posted the cron job comment) So there is SOMETHING handling the garbage collection, but it's not PHP (for Ubuntu it's this cron job, on a WHM/cPanel server you've probably got an entry for /usr/local/cpanel/scripts/clean_user_php_sessions in your crontab, etc.).

screen shot 2018-02-14 at 9 06 42 am

mbabker commented 6 years ago

Here is the spec for a GC plugin, the internals of which are modeled around php_session_gc:

class PlgSystemSessionGc extends JPlugin
{
    public function onAfterRespond()
    {
        $probability = $this->params->get('gc_probability', 1);
        $divisor     = $this->params->get('gc_divisor', 1000);

        $random = $divisor * lcg_value();

        if ($probability > 0 && $random < $probability) {
            JFactory::getSession()->gc();
        }
    }
}
stAn47 commented 6 years ago

hello, this is another very important change which was hardly documented before the patch release.

we take care of all PHP file sessions by using custom CRON jobs and we do not expect Joomla do to anything about it since in Joomla, this is called "none" and it's not called xcache/apc/file or else... It is just "none" which means that session storage + medium encryption + SHM or directory access + any garbage or session cleaning is hanled by server administrators and not by the application.

Since joomla implements "custom" (user) session handler called "database" everybody would expect that it takes care of the session storage + encryption + serialization/deserialization + cleaning by itself.

there are PHP variables for "cookie path" and "cookie domain" and nobody expects them to be set by the server, but rather by the application itself (Joomla)

so what is the proper config (joomla or server) now for enviroments where "gcprobability" is set to zero (i.e. custom cleaning) and there are sites both using file and database session that share the same php.ini ? (i.e. 2 sites per inside a customer enviroment)

Session GC CLI is not really a solution since most hostings do not provide a secured CLI access to PHP and for an hosting company it would mean to locate all Joomla installations and execute such CLI file.

GC in the database could be also done:

mbabker commented 6 years ago

Since joomla implements "custom" (user) session handler called "database" everybody would expect that it takes care of the session storage + encryption + serialization/deserialization + cleaning by itself.

Not necessarily true. Our handler can do garbage collection operations, as long as it is told to do so. Clearly, there are PHP configurations (almost the default?) where PHP does not handle garbage collection at all and this cleaning is done by way of server level tasks (i.e. the pointed out cron jobs on Ubuntu or WHM configured servers). And again, this does not only affect the database handler, it is the most visible with the database handler for a plethora of reasons.

same as before (Joomla should take care of the session cleaing upon a probability ratio not from php.ini)

Again, not a viable option:

Again, the changes in 3.8.4 only exposed the fact that servers are not configured to run session garbage collection operations, and again at least in the case of filesystem stored sessions this is balanced by a cron job that is routinely set up in Linux servers to clear this storage path. We don't know what is in place for the other backend stores though, so continuing to give the database handler preferential treatment is a wrong practice too.

The above spec'd GC plugin:

Gitjk commented 6 years ago

Do I as one of the more unsuspecting Joomla users understand that correctly, that one (intermediate) solution could be to set session.gc_probability = 1 in the php.ini file?

mbabker commented 6 years ago

Yes.

Gitjk commented 6 years ago

Ok, thank you. :-)

mbabker commented 6 years ago

I'll tell you all what. In the next 18 hours, I will prepare one pull request which gets to the end state I am trying to achieve within the B/C limitations of 3.x for the session related code (both real storage and metadata). That is going to be a take it or leave it pull request because it is going to have a large quantity of changes in it (the code from https://github.com/joomla/joomla-cms/pull/19578, a full plugin implementation of https://github.com/joomla/joomla-cms/issues/19585#issuecomment-365645231, new config options in the application class to disable metadata cleanup from the HTTP request cycle (not exposed in the UI, users will have to direct change configuration.php to apply this, which if they are aiming to use cron job based cleanup they should be able to edit a PHP file in a text editor)).

What I am NOT going to do is remove the check on triggering the metadata cleanup in the application class so that it does not run for the database handler. With the session GC plugin, it will get cleaned up, but at a lesser frequency to the metadata cleanup operation than if you were using any other handler. Because I am moving the system to treat the metadata as a separate entity to the actual session data. And in 4.0 the two data types will be correctly differentiated in two different tables (requiring a B/C break in our data schema and inherently integrations which work with the metadata) so that there does not need to be a handler check in front of the metadata cleanup task.

csthomas commented 6 years ago

Can you put a cleaning metadata process in that plugin too? I mean move code from https://github.com/joomla/joomla-cms/blob/staging/libraries/src/Application/CMSApplication.php#L160 to the plugin.

  1. If we use database session then plugin clear all session table.
  2. If we use memcached,. etc then it clears only metadata.

The end user will not notice any problem, and an advanced user can change the probability at his own discretion.

joomla-cms-bot commented 6 years ago

Set to "closed" on behalf of @Quy by The JTracker Application at issues.joomla.org/joomla-cms/19585

Quy commented 6 years ago

See PR #19687


This comment was created with the J!Tracker Application at issues.joomla.org/tracker/joomla-cms/19585.

philip-sorokin commented 6 years ago

Excuse me. Why do we need to force a user to edit the php.ini? Instead, we can set a runtime directive:

ini_set('session.gc_probability', 1);

philip-sorokin commented 6 years ago

In Debian PHP distributives session.gc_probability will always be switched off: https://serverfault.com/questions/511609/why-does-debian-clean-php-sessions-with-a-cron-job-instead-of-using-phps-built/511705#511705

There is a conflict with permissions. Instead, a cron task is executed every 30 minutes: /etc/cron.d/php